Enhance agent testing with Copilot Studio Kit

The Power CAT Copilot Studio Kit is a user-friendly application that lets you verify agent responses. It also includes native capabilities like Excel export and import for bulk creation and updates.

Configure, run, and analyze

Configure and run tests against the Copilot Studio APIs (Direct Line API) to evaluate agent responses against expected results.
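As a rough illustration of a response match run, the sketch below sends each test utterance to an agent and compares the reply with the expected result. It does not reproduce the kit's implementation: `ResponseCase`, `run_response_match`, `success_rate`, and the injected `ask_agent` callable are hypothetical names, and the whitespace- and case-insensitive comparison is an assumption. In a real run, `ask_agent` would wrap a Direct Line conversation.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ResponseCase:
    utterance: str  # what the test sends to the agent
    expected: str   # the response the test expects back


@dataclass
class ResponseResult:
    utterance: str
    actual: str
    passed: bool
    latency_s: float


def normalize(text: str) -> str:
    # Assumption: ignore case and collapse whitespace before comparing.
    return " ".join(text.split()).casefold()


def run_response_match(
    cases: List[ResponseCase], ask_agent: Callable[[str], str]
) -> List[ResponseResult]:
    """Send each utterance through ask_agent and record match and latency."""
    results = []
    for case in cases:
        start = time.perf_counter()
        actual = ask_agent(case.utterance)  # hypothetical Direct Line wrapper
        latency = time.perf_counter() - start
        passed = normalize(actual) == normalize(case.expected)
        results.append(ResponseResult(case.utterance, actual, passed, latency))
    return results


def success_rate(results: List[ResponseResult]) -> float:
    """Fraction of passed tests, or 0.0 when there are no results."""
    if not results:
        return 0.0
    return sum(r.passed for r in results) / len(results)
```

Injecting `ask_agent` keeps the comparison logic testable without a live agent; a stub function or dictionary lookup can stand in for the API during development.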

To enrich results, retrieve additional data points, such as the exact triggered topic name and intent recognition scores, from Azure Application Insights and from Dataverse by analyzing conversation transcript records.

For AI-generated answers, which are nondeterministic by nature, use prompts to compare the generated answer with a sample answer or validation instructions.
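The comparison described above could be sketched as a prompt builder plus a verdict parser. This is only an illustration under stated assumptions: the prompt wording, the PASS/FAIL reply convention, and the names `build_comparison_prompt` and `parse_verdict` are hypothetical and are not taken from the kit. Sending the prompt to a model is left out.

```python
from typing import Optional


def build_comparison_prompt(
    question: str,
    generated_answer: str,
    sample_answer: Optional[str] = None,
    validation_instructions: Optional[str] = None,
) -> str:
    """Build a grading prompt from a sample answer and/or validation instructions."""
    if sample_answer is None and validation_instructions is None:
        raise ValueError("Provide a sample answer or validation instructions.")
    parts = [
        "You are grading an agent's answer.",
        f"Question: {question}",
        f"Generated answer: {generated_answer}",
    ]
    if sample_answer is not None:
        parts.append(f"Sample answer: {sample_answer}")
    if validation_instructions is not None:
        parts.append(f"Validation instructions: {validation_instructions}")
    # Assumed reply convention so the verdict is easy to parse.
    parts.append("Reply with PASS or FAIL on the first line, then a one-sentence reason.")
    return "\n".join(parts)


def parse_verdict(model_reply: str) -> bool:
    """Return True only when the first line of the reply starts with PASS."""
    lines = model_reply.strip().splitlines()
    first_line = lines[0] if lines else ""
    return first_line.strip().upper().startswith("PASS")
```

Asking the model for a fixed verdict word on the first line keeps the pass/fail decision deterministic to parse, even though the generated answer being graded is not.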

Diagram that shows Azure and Power Platform components involved in testing and analysis of Copilot Studio Direct Line APIs, including Azure Application Insights, AI Builder, and Dataverse.

Test types

The tool supports these types of tests:

  • Response match
  • Attachments such as adaptive cards
  • Topic match (requires Dataverse)
  • Generative answers (requires AI Builder for response analysis and Application Insights for details on why an answer wasn't generated)
  • Multi-turn: a special test type that consists of a set of test cases of the regular types, run in a specified order within the same conversation context. Use multi-turn tests to test scenarios end to end and to test custom agents that use generative orchestration.
  • Plan validation: lets makers validate that custom agents that use generative orchestration include the expected tools. Rather than evaluating what the agent says, this test type checks that the agent's dynamic plan includes the expected tools (tools, actions, and connected agents) to a predetermined threshold.
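One way to picture the plan validation check is as a coverage score compared against a threshold. The sketch below assumes the threshold is the minimum fraction of expected tools that must appear in the agent's plan; the source does not spell out the exact scoring, and `validate_plan` and its parameters are hypothetical names.

```python
from typing import Iterable, Tuple


def validate_plan(
    expected_tools: Iterable[str],
    planned_tools: Iterable[str],
    threshold: float,
) -> Tuple[float, bool]:
    """Return (coverage, passed) for a plan against the expected tools.

    Assumption: coverage is the share of expected tools found in the plan,
    and the test passes when coverage meets or exceeds the threshold.
    """
    expected = set(expected_tools)
    if not expected:
        raise ValueError("At least one expected tool is required.")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("Threshold must be between 0 and 1.")
    matched = expected & set(planned_tools)
    coverage = len(matched) / len(expected)
    return coverage, coverage >= threshold
```

For example, if three of four expected tools appear in the plan, the coverage is 0.75, which passes a threshold of 0.75 but fails a threshold of 1.0. Extra tools in the plan do not affect the score in this sketch.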

Learn more about test types in Configure tests in Copilot Studio Kit.

Screenshot of test run result details, including a graphic showing success rate and latency for all test runs.
