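Test my app under language model failure conditions

At a glance
Goal: Test LLM failures, such as hallucinations
Time: 15 minutes
Plugins: LanguageModelFailurePlugin
Prerequisites: Set up Dev Proxy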

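When you build apps that integrate with large language models (LLMs), test how your app handles different LLM failure scenarios. With the LanguageModelFailurePlugin, Dev Proxy can simulate realistic language model failures on any LLM API that your app uses.

Simulate language model failures on any LLM API

To start, enable the LanguageModelFailurePlugin in your configuration file.

File: devproxyrc.json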

{
  "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/rc.schema.json",
  "plugins": [
    {
      "name": "LanguageModelFailurePlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/DevProxy.Plugins.dll",
      "urlsToWatch": [
        "https://api.openai.com/*",
        "http://localhost:11434/*"
      ]
    }
  ]
}

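With this basic configuration, the plugin randomly selects from all available failure types and applies them to matching language model API requests.

Configure specific failure scenarios

To test specific failure scenarios, configure the plugin to use particular failure types:

File: devproxyrc.json (with failure types)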

{
  "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/rc.schema.json",
  "plugins": [
    {
      "name": "LanguageModelFailurePlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/DevProxy.Plugins.dll",
      "configSection": "languageModelFailurePlugin",
      "urlsToWatch": [
        "https://api.openai.com/*",
        "http://localhost:11434/*"
      ]
    }
  ],
  "languageModelFailurePlugin": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": [
      "Hallucination",
      "PlausibleIncorrect",
      "BiasStereotyping"
    ]
  }
}

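This configuration simulates only three failure types: incorrect information (Hallucination), plausible but incorrect responses (PlausibleIncorrect), and biased content (BiasStereotyping).

Test different LLM APIs

You can test different LLM APIs by configuring multiple instances of the plugin, each with its own URL patterns:

File: devproxyrc.json (multiple plugin instances)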

{
  "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/rc.schema.json",
  "plugins": [
    {
      "name": "LanguageModelFailurePlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/DevProxy.Plugins.dll",
      "configSection": "openaiFailures",
      "urlsToWatch": [
        "https://api.openai.com/*"
      ]
    },
    {
      "name": "LanguageModelFailurePlugin",
      "enabled": true,
      "pluginPath": "~appFolder/plugins/DevProxy.Plugins.dll",
      "configSection": "ollamaFailures",
      "urlsToWatch": [
        "http://localhost:11434/*"
      ]
    }
  ],
  "openaiFailures": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": ["Hallucination", "OutdatedInformation"]
  },
  "ollamaFailures": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": ["Overgeneralization", "IncorrectFormatStyle"]
  }
}

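Tip

Configure different failure scenarios for different LLM providers to test how your app handles provider-specific behaviors. Name each configSection after the LLM service you're testing so the configuration is easier to understand and maintain.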
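
As the number of providers grows, the paired plugin entries and config sections can be generated rather than written by hand. The following is a minimal Python sketch, not part of Dev Proxy itself: the `build_config` helper and the `PROVIDERS` mapping are hypothetical names introduced for illustration, while the schema URLs, plugin path, URL patterns, and failure names are copied from the configuration above.

```python
import json

SCHEMA_BASE = "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0"

# Hypothetical mapping: config section name -> (URL patterns, failure types).
PROVIDERS = {
    "openaiFailures": (
        ["https://api.openai.com/*"],
        ["Hallucination", "OutdatedInformation"],
    ),
    "ollamaFailures": (
        ["http://localhost:11434/*"],
        ["Overgeneralization", "IncorrectFormatStyle"],
    ),
}


def build_config(providers):
    """Return a devproxyrc.json-shaped dict with one plugin instance per provider."""
    config = {"$schema": f"{SCHEMA_BASE}/rc.schema.json", "plugins": []}
    for section, (urls, failures) in providers.items():
        # One LanguageModelFailurePlugin instance per provider,
        # each pointing at its own config section.
        config["plugins"].append({
            "name": "LanguageModelFailurePlugin",
            "enabled": True,
            "pluginPath": "~appFolder/plugins/DevProxy.Plugins.dll",
            "configSection": section,
            "urlsToWatch": list(urls),
        })
        config[section] = {
            "$schema": f"{SCHEMA_BASE}/languagemodelfailureplugin.schema.json",
            "failures": list(failures),
        }
    return config


if __name__ == "__main__":
    # Print the generated configuration so it can be saved as devproxyrc.json.
    print(json.dumps(build_config(PROVIDERS), indent=2))
```

Keeping the section name as the dictionary key means each plugin instance and its failure list stay in sync when a provider is added or removed.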

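Common testing scenarios

Here are some recommended failure combinations for different testing scenarios:

Content accuracy testing

Test how your app handles incorrect or misleading information:

File: devproxyrc.json (languageModelFailurePlugin section only)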

{
  "languageModelFailurePlugin": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": [
      "Hallucination",
      "PlausibleIncorrect",
      "OutdatedInformation",
      "ContradictoryInformation"
    ]
  }
}

Bias and fairness testing

Test how your app responds to biased or stereotypical content:

File: devproxyrc.json (languageModelFailurePlugin section only)

{
  "languageModelFailurePlugin": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": [
      "BiasStereotyping",
      "Overgeneralization"
    ]
  }
}

Instruction following testing

Test how your app handles responses that don't follow instructions:

File: devproxyrc.json (languageModelFailurePlugin section only)

{
  "languageModelFailurePlugin": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": [
      "FailureFollowInstructions",
      "Misinterpretation",
      "IncorrectFormatStyle"
    ]
  }
}
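
One way an app can cope with these failures is to validate the model output before using it. The sketch below assumes, purely for illustration, that your app asks the model for a JSON object with a `summary` string. The `parse_summary` function and that response shape are hypothetical; they aren't part of Dev Proxy or of any LLM API.

```python
import json


def parse_summary(response_text):
    """Return the 'summary' string from a model reply, or None if the reply is unusable.

    Assumes (hypothetically) that the app asked for a JSON object
    shaped like {"summary": "..."}.
    """
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        # The model ignored the requested format, for example by answering in prose.
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not the object the app asked for.
        return None
    summary = data.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        # Missing, empty, or wrongly typed field.
        return None
    return summary.strip()
```

With a failure such as IncorrectFormatStyle enabled, a guard like this lets the app fall back to a retry or an error message instead of failing on malformed output.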

Response quality testing

Test how your app handles vague or overly complex responses:

File: devproxyrc.json (languageModelFailurePlugin section only)

{
  "languageModelFailurePlugin": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": [
      "AmbiguityVagueness",
      "OverSpecification",
      "CircularReasoning",
      "FailureDisclaimHedge"
    ]
  }
}

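Start Dev Proxy with your configuration file, then use your app to see how it handles the simulated language model failures. The plugin intercepts responses from language model APIs and replaces them with synthetic failure responses that exhibit the configured failure behaviors.

Create custom failure scenarios

You can create custom failure scenarios by adding .prompty files to the ~appFolder/prompts directory. For example, to create a "technical jargon overuse" failure:

Create a file named lmfailure_technical-jargon-overuse.prompty

Define the failure behavior in the .prompty file: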

---
name: Technical Jargon Overuse
model:
  api: chat
sample:
  scenario: Simulate a response that overuses technical jargon and unnecessarily complex terminology, making simple concepts difficult to understand.
---

user:
How do I create a simple web page?

user:
You are a language model under evaluation. Your task is to simulate incorrect responses. {{scenario}} Do not try to correct the error. Do not explain or justify the mistakes. The goal is to simulate them as realistically as possible for evaluation purposes.

Reference it in your configuration as TechnicalJargonOveruse

File: devproxyrc.json (languageModelFailurePlugin section only)

{
  "languageModelFailurePlugin": {
    "$schema": "https://raw.githubusercontent.com/dotnet/dev-proxy/main/schemas/v2.0.0/languagemodelfailureplugin.schema.json",
    "failures": [
      "TechnicalJargonOveruse",
      "Hallucination"
    ]
  }
}
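
The file name and the configuration name in the steps above suggest a naming convention: the `lmfailure_` prefix and the `.prompty` extension are dropped, and the kebab-case remainder becomes PascalCase. The helper below is a sketch of that inferred mapping, not Dev Proxy's actual implementation. `failure_name_from_file` is a hypothetical name, and the rule is an assumption based only on the single example given here.

```python
from pathlib import Path

PREFIX = "lmfailure_"
SUFFIX = ".prompty"


def failure_name_from_file(filename):
    """Derive the configuration name from a custom failure file name.

    Assumed rule (inferred from one example, not confirmed by Dev Proxy):
    lmfailure_technical-jargon-overuse.prompty -> TechnicalJargonOveruse
    """
    name = Path(filename).name
    if not name.startswith(PREFIX) or not name.endswith(SUFFIX):
        raise ValueError(f"expected {PREFIX}<name>{SUFFIX}, got {name!r}")
    slug = name[len(PREFIX):-len(SUFFIX)]
    # Capitalize each hyphen-separated word and join them without separators.
    return "".join(part.capitalize() for part in slug.split("-") if part)
```

A check like this can help catch a mismatch between a file name and the entry in the failures array before you start a test run.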

Next step

Learn more about the LanguageModelFailurePlugin.

LanguageModelFailurePlugin


Last updated on 2026-01-06
