[This article is prerelease documentation and is subject to change.]
By using the results from the test set, you can optimize your agent's behavior and validate that your agent meets your business and quality requirements. You can also run test sets multiple times to compare results as you improve your agent.
Test results are available in Copilot Studio for 90 days. To save your test results for a longer period, export the results to a CSV file.
Important
This article contains Microsoft Copilot Studio preview documentation and is subject to change.
Preview features aren't meant for production use and may have restricted functionality. These features are available before an official release so that you can get early access and provide feedback.
If you're building a production-ready agent, see Microsoft Copilot Studio Overview.
Run a test set
After you create a test set, you can run or rerun it to compare results over time and iterations.
Important
Agent evaluations that use user authentication require access through the Microsoft Copilot Studio connector. If your admin turns off this connection, you can't run tests by using the evaluation tool. For more information, see Copilot Studio connectors and data groups.
Go to your agent's Evaluations page.
Run a test by doing one of the following actions:
At the end of creating or editing a test set, select Evaluate.
Find the test set in the Test sets list, then select the More icon (…) > Evaluate test set.
Hover over a test result that used the test set you want to run, then select the More icon (…) > Evaluate test set again.
If the user profile for the test set has broken connections, or the test set doesn't have a user profile, the Manage connections dialog appears. You don't have to use a user profile for testing. However, if you do use a profile, all the connections must be working. For information on fixing connections, see Manage user profiles and connections.
An evaluation can take a few minutes to run. An alert appears in Copilot Studio when the test results are ready to view.
Dive into test results
Each time you run an evaluation with a test set, Copilot Studio:
Uses the connected user account to simulate conversations with the agent, sending each test case's question to the agent.
Collects the agent's responses.
Measures and analyzes the success of each response. Each test case receives a Pass or Fail, based on the criteria of the test case.
Assigns a Pass rate score based on the Pass/Fail rate of the test set.
You can see the Pass rate of each test set run on your agent's Evaluations page, under Recent results. To see more test set runs, select See all.
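Conceptually, the Pass rate is just the share of test cases that received a Pass verdict. The following Python sketch illustrates that calculation with made-up test cases and verdicts; Copilot Studio computes the score for you.

```python
# Illustrative only: hypothetical test cases and verdicts,
# not real Copilot Studio data.
results = {
    "Greeting question": "Pass",
    "Refund policy question": "Fail",
    "Store hours question": "Pass",
    "Escalation request": "Pass",
}

# Pass rate = passed cases / total cases.
passed = sum(1 for verdict in results.values() if verdict == "Pass")
pass_rate = passed / len(results) * 100

print(f"Pass rate: {pass_rate:.0f}% ({passed}/{len(results)} cases passed)")
# Pass rate: 75% (3/4 cases passed)
```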
See a detailed analysis for a test case
When you open a test result, you can see the details of the test run, a list of the queries used in the test, how the agent responded, and the Pass or Fail score.
Select a test case in the list to see a detailed assessment of each response.
The assessment includes the expected and actual responses, the reasoning behind the test result, and the knowledge, topics, and tools the agent used to respond.
Select a cited knowledge or topic to open it.
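The detailed assessment maps to a handful of fields per test case. The following Python dataclass is a hypothetical sketch of that shape, named and structured here only to mirror the fields described above; the evaluation tool doesn't expose such an object.

```python
from dataclasses import dataclass, field

# Hypothetical structure mirroring the per-case details shown in the UI.
@dataclass
class CaseAssessment:
    question: str                # the query sent to the agent
    expected_response: str       # what the test case expected (if applicable)
    actual_response: str         # what the agent actually answered
    verdict: str                 # "Pass" or "Fail"
    reasoning: str               # the explanation behind the verdict
    knowledge_used: list[str] = field(default_factory=list)  # cited knowledge sources
    topics_used: list[str] = field(default_factory=list)     # topics the agent triggered
    tools_used: list[str] = field(default_factory=list)      # tools the agent invoked
```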
Compare test results
You might want to compare your agent's performance before and after you make changes. You can compare two runs of the same test set by using the Compare with tool.
To see a comparison, you need to run the same test set at least twice.
On your agent's Evaluations page, under Recent results, open the test run you want to use as the base for the comparison.
Select the Compare with dropdown, then select the time and date of the test run you want to compare with the currently open test results.
In the Test cases list, arrows show which test case results improved by changing from failing to passing, or declined by changing from passing to failing.
Select a test case to see more details. In the Evaluation summary pane, you can see a direct comparison of test scores, with the current test run's result on top.
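The arrows and the comparison boil down to a per-case diff of verdicts between two runs. A minimal Python sketch of that logic, using invented case names and verdicts:

```python
# Hypothetical verdicts from two runs of the same test set.
baseline = {"Refund question": "Fail", "Hours question": "Pass", "Escalation": "Pass"}
latest = {"Refund question": "Pass", "Hours question": "Pass", "Escalation": "Fail"}

for case, old_verdict in baseline.items():
    new_verdict = latest[case]
    if (old_verdict, new_verdict) == ("Fail", "Pass"):
        print(f"Improved: {case}")   # shown with an up arrow in the UI
    elif (old_verdict, new_verdict) == ("Pass", "Fail"):
        print(f"Declined: {case}")   # shown with a down arrow in the UI
```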
Export test results
You can export test results to a CSV file. The file lists the question, expected response (if applicable), test method, passing score (if applicable), the agent's response, the test result, and analysis for each test case.
Go to your agent's Evaluations page.
Select the results you want to export.
In the Evaluation summary pane, select the More icon (…) > Export test results.
The test results download as <your test set name>.csv.
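Because the export is a plain CSV file, you can post-process it with any tooling you like. The following Python sketch counts and lists failing test cases; the file name and column headers (Question, Test result) are assumptions based on the fields listed above, so check an actual export for the exact names.

```python
import csv

# "My test set.csv" is a placeholder; exports are named after your test set.
with open("My test set.csv", newline="", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

# Column names are assumed; verify them against a real export.
failed = [row for row in rows if row.get("Test result", "").lower() == "fail"]

print(f"{len(failed)} of {len(rows)} test cases failed")
for row in failed:
    print("-", row.get("Question", "(no question)"))
```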