Hi Ma, Le,
Welcome to Microsoft Q&A and thank you for posting your query here!
Regarding your question on Azure Web Application Firewall (WAF) Bot Manager Rule Set 1.1 and the detection of "Good Bots," here are the key points:
What are "Good Bots"?
"Good Bots" in Azure WAF Bot Manager are trusted, verified bots that perform legitimate functions like search engine indexing, link checking, advertising, or social media services. The Bot Manager categorizes bots into Good Bots, Bad Bots, and Unknown Bots:
- Good Bots include well-known, verified crawlers such as Googlebot, Bingbot, FacebookBot, LinkedInBot, and others that are validated and trusted.
- Bad Bots are those with malicious intent, often identified by suspicious IPs or modified user agents.
- Unknown Bots are those without verification and may or may not be malicious.
Bot Manager 1.1 Enhancements:
The Bot Manager 1.1 rule set builds on the detection and classification capabilities introduced in 1.0, improving accuracy and expanding the list of recognized good bots. Recognizing a broader range of legitimate bots reduces false positives, which is critical for SEO and operational continuity.
About AI Crawlers Like ChatGPT, Claude, Bing Copilot:
Currently, the official bot signatures recognized as "Good Bots" primarily cover established search engines, social media crawlers, and other verified services. AI crawlers such as ChatGPT, Claude, or Bing Copilot are generally not part of the default known "Good Bots" list because:
- They may not identify themselves as standard crawlers in their user agents.
- Their crawling patterns and IP ranges might not yet be verified or included in the WAF's managed bot signatures.
As a result, these AI crawlers might be classified as "Unknown Bots" by default.
What does this mean for your WAF "Block Unknown Bots" setting?
- Blocking Unknown Bots can help secure your site from untrusted crawlers but may inadvertently block emerging or non-standard crawlers such as AI-based ones.
- To avoid unintended blocking of legitimate AI tools accessing your site, you might consider:
  - Monitoring WAF logs for requests from these crawlers that were blocked.
  - Creating custom rules or allowlists for the known IP addresses or user agents of trusted AI crawlers as you identify them.
  - Starting your WAF policy in Detection mode, or with the bot rule actions set to Log, to understand bot traffic before switching to blocking.
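To make the monitoring step more concrete, here is a minimal Python sketch that counts blocked requests per trusted AI crawler so you can decide which ones deserve an allow rule. It is only an illustration under stated assumptions: the record keys "action" and "user_agent" are hypothetical placeholders rather than the actual Azure WAF log schema, and the tokens in TRUSTED_AI_TOKENS are examples that you should verify against each vendor's published crawler documentation.

```python
from collections import Counter

# Example user-agent tokens for AI crawlers you have decided to trust.
# These are assumptions for illustration; confirm the real tokens with each vendor.
TRUSTED_AI_TOKENS = ("GPTBot", "ClaudeBot")


def allowlist_candidates(records, tokens=TRUSTED_AI_TOKENS):
    """Count blocked requests whose user agent contains a trusted token.

    `records` is an iterable of dicts using the hypothetical keys
    "action" and "user_agent"; map your exported log fields onto them.
    """
    counts = Counter()
    for record in records:
        # Only blocked requests are interesting for allowlist decisions.
        if record.get("action", "").lower() != "block":
            continue
        user_agent = record.get("user_agent", "").lower()
        for token in tokens:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # count each request at most once
    return counts


if __name__ == "__main__":
    # Made-up sample records, not real WAF log output.
    sample = [
        {"action": "Block", "user_agent": "Mozilla/5.0 (compatible; GPTBot/1.0)"},
        {"action": "Log", "user_agent": "ClaudeBot/1.0"},
        {"action": "Block", "user_agent": "curl/8.0"},
    ]
    for token, hits in allowlist_candidates(sample).most_common():
        print(f"{token}: {hits} blocked request(s)")
```

Keep in mind that a user-agent string can be forged, so a match here is only a hint. Before you create an allow rule, it is safer to confirm that the source IP addresses belong to the crawler's operator.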
Summary:
- Azure WAF Bot Manager Rule Set 1.1 identifies and allows well-known search engine and social media bots as "Good Bots."
- AI crawlers like ChatGPT, Claude, and Bing Copilot are typically not identified as "Good Bots" by default.
- Blocking Unknown Bots could block some legitimate AI crawlers unless explicitly allowed by custom rules or exceptions.
- It is recommended to monitor and customize your WAF bot policy as you move to blocking mode to ensure important legitimate traffic isn’t blocked.
References:
Azure WAF Bot Manager overview
If you need any assistance on this, feel free to ask!
If the provided information answers your query, please click "Upvote" and "Accept Answer". This will help others who might be facing similar challenges.
Thanks,
Harish.