Hello Amine Mekki,
Generative orchestration in Copilot Studio may start with the correct, grounded answer but then switch mid-response when its confidence drops. When this happens, the system falls back to conversational boosting, which relies on broader model knowledge and can override the grounded content. Orchestration failure is usually caused by low-quality retrieval, ambiguous or complex prompts, irrelevant content chunks, or streaming fallback triggers.
To prevent this, improve retrieval structure, adjust confidence thresholds, simplify and clarify system prompts, test problematic queries, or disable conversational boosting. You can also instruct the agent to respond with a fallback message instead of generating ungrounded answers.
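For the last point, one option is to add an explicit grounding guardrail to the agent's instructions. The wording below is illustrative only (it is not an official Copilot Studio directive; the section heading and fallback message are placeholders to adapt):

```
# GROUNDING RULES
- Answer only from content retrieved from the connected knowledge sources.
- If no relevant content is retrieved, reply exactly:
  "I could not find this information in the official documents.
   Please rephrase your question or contact support."
- Do not answer from general model knowledge, and do not revise or
  replace an answer that was already grounded in retrieved content.
```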
Copilot Studio Agent: Why Does It Start With the Right Answer and Then Switch Mid-Response?
Hi everyone,
I’ve built an agent in Copilot Studio that uses SharePoint and a website as its knowledge base. The agent uses generative orchestration for its flow.
Here’s the issue: Sometimes, when I ask a question, the agent starts answering correctly (I can see the right answer being streamed), but then mid-response it switches and gives a wrong or incomplete final answer. In the test panel, I notice this happens when conversational boosting kicks in.
From what I understand, conversational boosting is a fallback when orchestration fails. But I don’t know why orchestration fails in these cases, especially since it starts off correctly.
My questions:
- Why does the agent begin with the correct answer and then switch mid-response?
- What causes generative orchestration to fail and trigger conversational boosting?
- How can I prevent this fallback and ensure the agent sticks to the orchestrated answer?
For context, here’s an example of my system prompt (company name anonymized as X and document types as Y and Z):
# PURPOSE
Your mission is to answer users’ questions about X using Y and Z documents.
# RESPONSE CONTRACT
- Tone: Professional, clear, and concise.
# RESPONSE FORMAT
1. Answer:
- Provide a clear answer relevant to the question (do not write “Answer:” as a label).
2. Source:
- Include excerpts that were used to generate the answer.
3. Disclaimer:
- Always include:
- *This response was generated by an AI assistant based solely on X’s official Y and Z documents. Please verify the information provided by reviewing the cited sources, as this content was generated using AI and may require human validation.*
# EXAMPLES TO SIMULATE
User: "Here I give the agent an example of a question"
Your answer: Here I give the agent an example of an answer
Source:
- "here I give an example of the text chunk"
Disclaimer...
Has anyone experienced this behavior? Any ideas on why orchestration fails and how to avoid the fallback?
Thanks!
Microsoft Copilot | Microsoft 365 Copilot | Development
2 answers
Sort by: Most helpful
Sayali-MSFT 4,341 Reputation points Microsoft External Staff Moderator
2025-11-28T09:33:52.47+00:00
Gerald Gaston 0 Reputation points
2025-11-30T19:39:17.8466667+00:00
I'm seeing this more in agents that were working fine on GPT-4o (now retired, though you can still confirm the behavior if the "Get 30 additional days with your existing model before it becomes unavailable" option is enabled) and are now using GPT-4.1 or later.
The newer the model, the more often this issue occurs, because:
- GPT‑4o uses simpler orchestration logic and doesn’t aggressively override responses.
- It respects connector outputs and system instructions more consistently.
- It avoids the “second pass” behavior that GPT‑4.1 and GPT‑5x introduced for grounding and confidence scoring.
Stated another way, GPT‑4o works better in my scenarios because:
- Deterministic answers: It doesn’t try to “improve” or reformat responses after the fact.
- Lower orchestration complexity: No mid-response fallback injection.
- Stable multi-channel behavior: Works in both M365 Copilot and Teams without escalation errors.
My solutions that use a SharePoint list as the knowledge source are affected the most. Moving from relying heavily on Instructions to agent flows helps; for simple agents, switching back to classic orchestration with trigger phrases is a workable stopgap for now.
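To illustrate the trigger-phrase approach: with classic orchestration, each topic fires on explicit phrases rather than on the planner's confidence scoring, so there is no mid-response fallback to override a grounded answer. A rough sketch of what a topic definition looks like in the Copilot Studio YAML code editor is below (the exact schema may differ by version; the IDs and queries here are made-up examples):

```yaml
kind: AdaptiveDialog
beginDialog:
  kind: OnRecognizedIntent
  id: main
  intent:
    # Hypothetical trigger phrases - the topic fires only on these,
    # instead of relying on generative orchestration to route the query.
    triggerQueries:
      - "What is the refund policy?"
      - "How do I request a refund?"
```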