Azure AI Foundry Agent Service: Is There Any Way to Force Sequential Execution of Multiple Tools (Azure AI Search -> Bing Grounding -> Response)?
I have an Azure AI Agent with two tools attached (Azure AI Search and Bing Grounding). I need both tools to execute for every query, but cannot find a way to force this behavior consistently.
Environment:
- Service: Azure AI Agent Service
- SDK: azure-ai-projects (Python)
- Model: GPT-4.1
- Tools: Azure AI Search, Bing Grounding
Business Use Case:
Hybrid search requiring both internal knowledge base (Azure AI Search) and external web (Bing Grounding) results for every query to formulate final response.
What I've Tried:
- Explicit system prompt instructions – Added detailed mandatory instructions stating both tools MUST be called sequentially, with exact workflow steps.
- `tool_choice="auto"` – Default behavior where the model decides which tools to use. No tool calls were made for the queries tested.
- `tool_choice={"type": "bing_grounding"}` – ONLY the Bing Grounding tool was called for the queries tested.
- `tool_choice={"type": "azure_ai_search"}` – Surprisingly, BOTH the Azure AI Search AND Bing Grounding tools were called for the queries tested. It is unclear whether this is expected behavior or a bug, and whether it is consistently reproducible across different queries or environments.
Questions I have:
- Is there any way to force ALL attached tools to execute?
- Can I guarantee specific multiple tools are called (e.g., both Azure AI Search AND Bing Grounding)?
- Does tool selection override system prompt instructions?
- Is functionality like `tool_choice="all"` or `tool_choice=["tool1", "tool2"]` planned?
- Finally, can the agent be forced to call a single tool (either Azure AI Search or Bing Grounding) consistently by setting `tool_choice="required"` or specifying the tool name with `tool_choice={"type": "tool_name"}`?
Azure AI services
-
Sridhar M • 2,675 Reputation points • Microsoft External Staff • Moderator
2025-11-12T17:16:08.4966667+00:00 Thank you for sharing your observation. I will review this with the product group internally. Please share the details requested in a private message.
-
SRILAKSHMI C • 10,805 Reputation points • Microsoft External Staff • Moderator
2025-11-13T12:40:24.7766667+00:00 Hello Mahesh Babaji Sondkar,
Welcome to Microsoft Q&A and thank you for reaching out.
I understand that you’re trying to ensure that both Azure AI Search and Bing Grounding tools execute sequentially for every query in your Azure AI Agent setup. Let’s go through the current capabilities, limitations, and possible workarounds.
At this time, the Azure AI Agent Service does not support enforcing sequential execution of multiple tools (for example, Azure AI Search → Bing Grounding → Response) within a single run. The agent’s orchestration is LLM-driven, meaning the model autonomously decides if, when, and which tools to call based on the context, system prompt, and tool definitions.
- `tool_choice="required"` ensures at least one tool is called, but does not guarantee which one or enforce an order.
- `tool_choice={"type": "tool_name"}` can be used to force a single specific tool to execute for that run.
- Currently, there's no option such as `tool_choice="all"` or `tool_choice=["tool1","tool2"]` to enforce multiple sequential tool calls. This functionality is being considered for future updates.
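For quick reference, the `tool_choice` forms named in this thread can be summarized as plain values (a sketch covering only the options discussed above, not an exhaustive list):

```python
# The tool_choice forms discussed above (pass one of these as the
# tool_choice argument on a run):
choice_auto = "auto"                           # model decides which tools to use
choice_required = "required"                   # at least one tool must be called
choice_specific = {"type": "azure_ai_search"}  # force this specific tool

# Not supported today (per the answer above):
# "all", or a list such as ["tool1", "tool2"]
```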
Guidance on Tool Invocation
1. Tool Choice Parameter
The `tool_choice` parameter remains the most deterministic control available for tool invocation. However, it cannot enforce strict multi-tool or sequential execution within a single turn. When `tool_choice="required"` is used, the model guarantees one tool will be called, but the agent still retains control over the sequence and may skip others based on contextual reasoning. If you want both tools to run deterministically, you can invoke them in separate runs, like this:
```python
# Example: Forcing Azure AI Search first
run = project_client.agents.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
    tool_choice={"type": "azure_ai_search"}  # Call Azure AI Search first
)

# After completion, trigger Bing Grounding
run = project_client.agents.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
    tool_choice={"type": "bing_grounding"}  # Call Bing Grounding next
)
```

This approach allows you to control the sequence externally, ensuring both tools run before generating a final model response.
2. Instructions Parameter
Make sure your agent’s Instructions (system prompt) explicitly describe your expected order of tool execution. Although this won’t enforce behavior programmatically, it can help guide the LLM’s reasoning process. For example:
“First, use Azure AI Search to retrieve internal knowledge base data. Then, use Bing Grounding to collect relevant external web information based on the Azure AI Search results before generating a final response.”
You can experiment with prompt phrasing to encourage consistent multi-tool invocation.
Recommended Workaround – External Orchestration
If your use case requires deterministic, sequential execution (i.e., both tools must always run in order), the most reliable solution is to orchestrate the sequence externally in your application logic:
1. Invoke Azure AI Search and collect internal results.
2. Invoke Bing Grounding (or an external web search) using the search results as context.
3. Pass both sets of results to the model for synthesis and final response generation.
This approach guarantees consistent tool execution order and provides full control over auditing and debugging.
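As a minimal sketch of this pattern, the three steps can be wired together in plain application code. The function names and signatures below are illustrative, not part of any SDK; each callable is a wrapper you supply (for example, around `azure.search.documents.SearchClient` for the internal step and a web-search call for the external step):

```python
from typing import Callable

def hybrid_answer(
    query: str,
    search_internal: Callable[[str], list[str]],
    search_external: Callable[[str, list[str]], list[str]],
    synthesize: Callable[[str, list[str], list[str]], str],
) -> str:
    """Run both searches in a fixed order, then synthesize one response."""
    internal = search_internal(query)             # 1. internal KB results
    external = search_external(query, internal)   # 2. web results, using KB hits as context
    return synthesize(query, internal, external)  # 3. final model response
```

Because the ordering lives in your code rather than in the agent's reasoning, both sources are consulted on every query, in the same order every time.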
- Also ensure you're using the latest version of the `azure-ai-projects` SDK for up-to-date orchestration capabilities.
- Review execution logs to check whether tool invocations are skipped or encountering runtime issues.
- Continue monitoring the Azure AI Foundry documentation and Azure Updates for future enhancements that may support multi-tool orchestration control.
Answers to Your Specific Questions
Force all attached tools to execute:
Not directly possible in a single run. You can, however, call them sequentially in separate runs.
Guarantee multiple tools (e.g., Azure AI Search and Bing Grounding) are called:
Not deterministically; the model's internal logic governs selection.
Does tool selection override system prompt instructions?
Yes, `tool_choice` takes precedence over prompt-level instructions.
Is a `tool_choice="all"` or similar feature planned?
Not currently available, but it's under consideration for future releases.
Force the agent to call a single tool consistently:
Yes, using `tool_choice={"type": "tool_name"}` ensures that tool is invoked for that run, though behavior may vary in subsequent turns based on query context.
Please refer to these resources:
- Troubleshoot problems with agent tools integration
- Use system instructions to help the model invoke the right tool
I hope this helps, do let me know if you have any further queries.
If this answers your query, could you please take a moment to retake the survey by accepting this response? Your feedback is greatly appreciated.
Thank you!
-
Mahesh Babaji Sondkar • 1 Reputation point
2025-11-13T14:43:45.2833333+00:00 Thanks for your detailed response. It appears that there’s no way to force the agent to execute both tools (Azure AI Search → Bing Grounding) deterministically in a single run. I’ll call them sequentially in separate runs and will update you on how it goes.
I also wanted to share some experiments I’ve done with the tool_choice parameter and the observations I’ve made:
- tool_choice="auto" – Default behavior where the model decides which tools to use. No tool calls were made for the queries I tested.
- tool_choice="required" – Only the Bing Grounding tool was called for the queries tested.
- tool_choice={"type": "bing_grounding"} – Only the Bing Grounding tool was called for the queries tested.
- tool_choice={"type": "azure_ai_search"} – Surprisingly, both the Azure AI Search and Bing Grounding tools were called for the queries tested. Is this expected behavior, or could it be a bug? Also, do you know if this behavior can be consistently reproduced across different queries or environments?
-
SRILAKSHMI C • 10,805 Reputation points • Microsoft External Staff • Moderator
2025-11-14T18:24:28.25+00:00 Thanks for sharing your additional observations; this is really helpful, and it also aligns with how tool orchestration currently behaves in the Azure AI Agent Service.
Let me clarify what you’re seeing and why.
Why `tool_choice={"type": "azure_ai_search"}` Sometimes Triggers Both Tools
Yes, it is possible (and expected in some cases) that specifying a single tool in `tool_choice={"type": "azure_ai_search"}` results in multiple tools being called, even though only one tool was explicitly requested.
This happens because the `tool_choice` parameter tells the model which tool it must call at least once. It does not prohibit the model from calling additional tools afterward if it believes they are helpful for answering the query.
This behavior is intentional, because Azure AI Agents still allow the model to reason about tool usage beyond the enforced tool.
`tool_choice={"type": "azure_ai_search"}` guarantees Azure AI Search will be invoked, but it does not restrict the model from calling Bing Grounding afterward.
This is the same core reason the reverse does not happen: when `bing_grounding` is enforced, the model is not necessarily motivated to call Azure AI Search unless your query strongly signals it.
Is this behavior reproducible?
You may see variations across:
The query wording
Temperature / model randomness
The specific dataset in Azure AI Search
How your tools are described in the schema
Any additional reasoning the LLM performs during tool orchestration
Across multiple customers, we have observed that:
Forcing Azure AI Search first often results in both tools being used
Forcing Bing Grounding typically results in Bing-only execution
`tool_choice="auto"` frequently leads to no tools being used when the LLM thinks it can answer from prior knowledge.
So yes, your results are consistent with what we see internally. However, because the underlying orchestration is LLM-driven, exact reproducibility across environments cannot be guaranteed.
Is it a bug?
No. Based on the current architecture, this is expected behavior, not a bug.
The orchestration engine is designed to:
Enforce one required tool (if specified)
Allow additional tool calls at the model’s discretion
Drive reasoning based on context rather than strict deterministic workflows
This means the model retains autonomy unless you fully externalize the workflow in your application code.
Recommended approach for deterministic execution
Since you want strict ordering (Azure AI Search → Bing Grounding → Response), the safest approach remains:
Call Azure AI Search in a separate run
Pass its results into the next run
Force Bing Grounding in the second run
Generate the final response in a third run (if needed)
This fully removes LLM variability and guarantees predictable execution.
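One way to keep that external control in one place is a small driver that issues one forced-tool run per step. Here `run_with_tool` is a placeholder for whatever executes a single agent run (for example, a thin wrapper around `project_client.agents.runs.create_and_process`); it is not an SDK API:

```python
from typing import Any, Callable

def run_tools_in_order(
    run_with_tool: Callable[[dict], Any],
    tool_types: list[str],
) -> list[Any]:
    """Execute one forced-tool run per tool type, in the given order."""
    results = []
    for tool_type in tool_types:
        # Each iteration is a separate run, so the order is guaranteed by
        # this loop rather than by the LLM's own reasoning.
        results.append(run_with_tool({"type": tool_type}))
    return results
```

With the real SDK, `run_with_tool` would pass its argument as the `tool_choice` of a new run on the same thread, so the results of each step remain visible to the next.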
I hope this helps, do let me know if you have any further queries.
If this answers your query, could you please take a moment to retake the survey by accepting this response? Your feedback is greatly appreciated.
Thank you!
-
SRILAKSHMI C • 10,805 Reputation points • Microsoft External Staff • Moderator
2025-11-17T05:40:54.3266667+00:00 Did you get a chance to review the above response? Do let me know if you have any further queries.
Could you please take a moment to retake the survey on the above response? Your feedback is greatly appreciated.
Thank you!
-
Mahesh Babaji Sondkar • 1 Reputation point
2025-11-24T13:43:01.02+00:00 Sorry for the delayed response.
When I invoke them in separate runs as you suggested:

```python
# Example: Forcing Azure AI Search first
run = project_client.agents.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
    tool_choice={"type": "azure_ai_search"}  # Call Azure AI Search first
)

# After completion, trigger Bing Grounding
run = project_client.agents.runs.create_and_process(
    thread_id=thread.id,
    agent_id=agent.id,
    tool_choice={"type": "bing_grounding"}  # Call Bing Grounding next
)
```

On the first run, both the `azure_ai_search` and `bing_grounding` tools are always called, while on the second run only the `bing_grounding` tool is called. This ends up increasing the overall token consumption. Is there a way to force only a single tool to run each time?
-
SRILAKSHMI C • 10,805 Reputation points • Microsoft External Staff • Moderator
2025-11-26T17:42:15.7666667+00:00 Thanks for the follow-up and for testing the sequential-run pattern. What you’re observing is expected based on how the Azure AI Agent orchestration layer currently works, and unfortunately there is no supported way to guarantee that only a single tool runs per turn when using an Agent.
Let me break down why this happens and what your options are.
Why Both Tools Run on the First Run (Even with `tool_choice={"type": "azure_ai_search"}`)
Even though you explicitly specify `tool_choice={"type": "azure_ai_search"}`, this only enforces one requirement:
Azure AI Search must be called at least once.
It does not enforce:
- that only Azure AI Search may be called,
- that no other tool is called,
- that execution stops after the required tool completes.
The LLM-driven orchestration layer is allowed to:
call additional tools if it believes they help answer the query,
chain tools based on its own reasoning,
continue execution after fulfilling the required tool.
This is why Bing Grounding is being triggered automatically in the first run.
This is expected behavior, not a bug.
Why the Second Run Only Calls Bing Grounding
This is consistent as well because:
You are explicitly forcing the model to call Bing Grounding.
There is no strong reason for the model to call Azure AI Search afterward.
So it chooses only Bing Grounding.
This also aligns with internal behavior patterns seen across multiple customers.
Is there a way to force only one tool to run each time?
Within the Azure AI Agent orchestration: No.
There is currently no supported mechanism to force the agent to:
use exactly one tool per run,
prevent a tool from being called,
stop the model from invoking additional tools,
disable the LLM’s autonomous tool reasoning.
The tool system only supports enforced inclusion, not enforced exclusion.
This means:
You can guarantee a tool will run.
You cannot guarantee that only that tool will run.
This is a fundamental limitation of the Agent architecture today.
Workaround: Bypass Agent Orchestration and Call Tools Manually
If strict control and minimized token usage are required, the recommended approach is:
Do NOT let the agent call the tools.
Call the tools yourself.
How this looks:
1. Call Azure AI Search yourself (manual API call)
You directly call Azure AI Search using the Search REST API or SDK. No LLM involvement → zero extra tokens.
2. Call Bing Grounding yourself
Use the Bing Search / Grounding tool API directly. Again → no unwanted extra tool calls.
3. Pass both results to the model as context
Call the agent (or plain chat completion) with:
search results (internal)
grounding results (external)
Example:
```python
import json

final_response = client.responses.create(
    model="gpt-4.1",
    input=[
        {"role": "system",
         "content": "Summarize internal + external search results."},
        # Message content must be text, so serialize both result sets
        {"role": "user",
         "content": json.dumps({
             "internal_results": azure_search_result,
             "external_results": bing_result,
         })}
    ]
)
```

Only the tools you choose run, in the exact order you choose: zero accidental extra tool calls, minimum token consumption, full determinism.
Why This Is the Only Reliable Solution Today
Because Azure AI Agent Service is intentionally designed as:
LLM-driven orchestration → not rule-based workflow automation.
The model will always reserve the right to:
call additional tools,
chain tools,
skip tools,
take steps based on reasoning rather than rules.
Therefore, the only way to completely eliminate unintended tool calls is to remove tool invocation from the agent and perform the calls externally.
Thank you!
-
Mahesh Babaji Sondkar • 1 Reputation point
2025-11-27T12:53:02.68+00:00 But the Bing Grounding tool, specifically "Grounding with Bing Search," is only designed to be used within the context of Azure AI Agent Service. It is not intended to be invoked as a standalone service.
-
SRILAKSHMI C • 10,805 Reputation points • Microsoft External Staff • Moderator
2025-11-28T13:01:35.42+00:00 Thanks for calling this out and yes, you’re absolutely right. The Bing Grounding tool provided inside Azure AI Agents is not exposed as an independent public API, so you cannot call it standalone the way you can with Azure AI Search. That is an intentional design of the service today.
Here’s the correct clarification:
Bing Grounding cannot be called outside the Agent
The “Grounding with Bing Search” tool only exists inside the Agent runtime, and there is no standalone REST API or SDK to call it directly. So when I suggested “manual invocation,” that only applies to tools that do have public APIs (e.g., Azure AI Search).
For Bing Grounding specifically, you only have these options today:
1. Use Bing Grounding inside the Agent (mandatory)
Since the tool is agent-only, the agent will always execute it internally. And because the orchestration is LLM-driven, the agent may decide to call additional tools unless tightly controlled which is exactly what you’re seeing.
There is currently no way to force “only this tool, and no others” inside a single run.
2. If you need strict control, replace Bing Grounding with Bing Web Search API
Azure technically supports a Bing Web Search API (separate from the agent’s grounding tool). This is not the same as “Bing Grounding,” but it gives you:
deterministic control (you call it manually)
no unexpected tool chaining
predictable token usage
ability to run it before or after Azure AI Search
Flow becomes:
1. Call Azure AI Search yourself
2. Call the Bing Web Search API yourself
3. Pass both results to the model
This avoids the agent’s autonomous behavior.
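For the Azure AI Search step of that flow, a direct call might look like the sketch below. It assumes the `azure-search-documents` package, and the `content` field name is a placeholder for whatever your index schema actually stores:

```python
from typing import Any

def fetch_internal_results(search_client: Any, query: str, top: int = 5) -> list[str]:
    """Return the text of the top hits from a plain keyword search.

    `search_client` is expected to behave like
    azure.search.documents.SearchClient, whose .search() returns an
    iterable of dict-like documents. In real code you would build it as:
        SearchClient(endpoint, index_name, AzureKeyCredential(key))
    """
    # "content" is a hypothetical field name; substitute your index schema.
    return [doc.get("content", "") for doc in search_client.search(query, top=top)]
```

Because this call never goes through the agent, no LLM reasoning (and no tokens) are involved until you pass the collected results to the model in the final step.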
3. If your use case requires the built-in Bing Grounding tool
Then unfortunately:
You cannot restrict it to run alone
You cannot prevent other tools from running alongside it
You cannot call it externally
You cannot fully eliminate extra token usage
Using the grounding tool inside the agent means accepting LLM-driven tool orchestration.
Since your requirement is:
strict execution order
only one tool per run
predictable cost
no unintended tool calls
The only viable solution today is:
Use Azure AI Search API directly
Use Bing Web Search API directly (not Bing Grounding)
Feed both into the model manually
This gives you full workflow control that the Agent cannot provide today.
Thank you!
-
SRILAKSHMI C • 10,805 Reputation points • Microsoft External Staff • Moderator
2025-12-02T06:38:29.1133333+00:00 Did you get a chance to review the above response? Do let me know if you have any further queries.
Could you please take a moment to retake the survey on the above response? Your feedback is greatly appreciated.
Thank you!
-
Aryan Parashar • 3,380 Reputation points • Microsoft External Staff • Moderator
2025-12-04T10:18:24.67+00:00 I appreciate your efforts, and I know it’s frustrating when the agent does not behave as expected.
As you are using the Foundry Agent Service and trying to force sequential execution of tools such as Azure AI Search (knowledge base) and Bing Grounding (to leverage Bing search for agent responses), there is a known limitation in the Foundry Agent Service.
If you force the execution of any tool in the flow, the service may not consistently maintain the full conversation context, and this can lead to the model starting to hallucinate or produce incomplete responses, because its natural reasoning process is being overridden. If you allow the agent to decide the tool choice on its own, then it is more likely to keep the context of the entire conversation.
Also, `tool_choice` will force the tool execution, but not in a sequential manner, which means it can still become inconsistent and start hallucinating instead of producing a coherent response.
To mitigate these types of issues, Microsoft provides a separate SDK called Semantic Kernel, where you can create agents that include tools like Azure AI Search (knowledge base) and a Bing Grounding agent, and force sequential execution of these tools with better preservation of context and reduced hallucination.
For your specific scenario, Semantic Kernel supports sequential orchestration, which lets you explicitly control the order of execution. You can find the documentation here:
https://learn.microsoft.com/en-us/semantic-kernel/frameworks/agent/agent-orchestration/sequential?pivots=programming-language-python
But as you are using the Foundry Agent Service, it may continue to respond unpredictably or start hallucinating whenever tool execution is forced, due to these limitations.
Let me know if you have further queries.