Complete Guide to Anthropic Bots (ClaudeBot · Claude-User · Claude-SearchBot)
What are Anthropic bots
Anthropic operates three web crawlers for Claude. Like OpenAI, it separates its bots by purpose, and each one can be controlled independently through robots.txt. Anthropic's official documentation explicitly commits to honoring robots.txt.
TL;DR
Distinguish ClaudeBot (training), Claude-User (user browsing), and Claude-SearchBot (search index). Anthropic officially states it honors robots.txt and does not use CAPTCHA bypass technology. IP ranges are available at claude.com/crawling/bots.json.
Bot identification information
The information below is based on Anthropic's official support documentation (support.claude.com, verified June 2026).
| Bot Name | robots.txt Key | Primary Use | Block Effect |
|---|---|---|---|
| ClaudeBot | ClaudeBot | AI model training data collection | Signal to exclude from AI training datasets |
| Claude-User | Claude-User | Fetches pages when a Claude user submits a URL or requests a web search | May reduce visibility in user-directed web search |
| Claude-SearchBot | Claude-SearchBot | Builds the index used to improve Claude search result quality | May reduce visibility and accuracy in Claude search results |
IP range check: https://claude.com/crawling/bots.json
If you see suspicious crawl traffic, Anthropic's official documentation advises contacting Anthropic support and including the affected domain in your report.
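Because a User-Agent string can be spoofed, the published IP ranges are useful for confirming that a request really came from Anthropic. The sketch below checks a client IP against a list of CIDR ranges. The layout of bots.json is not described in this guide, so the code assumes you have already extracted the CIDR strings from it; `EXAMPLE_RANGES` holds reserved documentation ranges as placeholders, not real Anthropic addresses, and `ip_in_ranges` is a hypothetical helper name.

```python
import ipaddress


def ip_in_ranges(ip, cidr_ranges):
    """Return True if `ip` falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    for cidr in cidr_ranges:
        network = ipaddress.ip_network(cidr, strict=False)
        # Only compare addresses of the same IP version (IPv4 vs IPv6).
        if addr.version == network.version and addr in network:
            return True
    return False


# Placeholder ranges reserved for documentation (RFC 5737 / RFC 3849).
# Assumption: replace these with the CIDR strings taken from bots.json.
EXAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]

print(ip_in_ranges("192.0.2.17", EXAMPLE_RANGES))
print(ip_in_ranges("198.51.100.5", EXAMPLE_RANGES))
```

Use a check like this for verification only; the decision to allow or block should still be expressed in robots.txt.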
How each bot works
ClaudeBot — for training
ClaudeBot collects web data for Claude model training. Anthropic's official documentation describes blocking ClaudeBot as "sending a signal that you will be excluded from AI training datasets." Blocking via robots.txt stops future training collection but does not affect data already collected.
Claude-User — user browsing
Claude-User runs when a user submits a specific URL to Claude (through Claude.ai or the Claude API) or asks for a web search. Anthropic's documentation states that blocking it may "reduce your site's visibility in user-based web search." This is the bot behind real-time citations in Claude's answers.
Claude-SearchBot — search index
Claude-SearchBot builds the index that improves results in Claude's in-product search features. Blocking it may reduce how often, and how accurately, your site appears in Claude search results.
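To see which of the three bots is visiting, you can map a request's User-Agent header to its robots.txt token. This is a minimal sketch that assumes each bot's User-Agent contains its token as a substring; the sample strings are illustrative, not copied from Anthropic's documentation, and `classify_user_agent` is a hypothetical helper.

```python
# robots.txt tokens mapped to the purposes described in this guide.
ANTHROPIC_BOTS = {
    "ClaudeBot": "training data collection",
    "Claude-User": "user-initiated browsing",
    "Claude-SearchBot": "search indexing",
}


def classify_user_agent(user_agent):
    """Return (token, purpose) for an Anthropic bot, or None if no token matches."""
    lowered = user_agent.lower()
    for token, purpose in ANTHROPIC_BOTS.items():
        if token.lower() in lowered:
            return token, purpose
    return None


# Illustrative User-Agent strings (assumed shapes, not official values).
print(classify_user_agent("Mozilla/5.0 (compatible; ClaudeBot/1.0)"))
print(classify_user_agent("Mozilla/5.0 (compatible; Claude-SearchBot/1.0)"))
print(classify_user_agent("Mozilla/5.0 (compatible; Googlebot/2.1)"))
```

A substring match says only what the client claims to be, so pair it with an IP range check before trusting the result.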
Anthropic's crawling policy commitments
As stated in Anthropic's official documentation:
- ✅ robots.txt compliance: Respects "do not crawl" signals
- ✅ Non-invasive crawling: Does not disrupt site operations
- ✅ Transparency: Publicly provides information about its crawlers
- ✅ No CAPTCHA bypass: Does not use CAPTCHA bypass technology
- ✅ Per-host robots.txt: The main domain and each subdomain are governed by their own robots.txt
Three robots.txt examples
Scenario A. Full allow (default)
# No configuration required. With no matching rules, all Anthropic bots may crawl under their default policy.
Scenario B. Block training only, allow answer citations and search (recommended)
# ClaudeBot: block training data collection
User-agent: ClaudeBot
Disallow: /
# Claude-User, Claude-SearchBot are allowed
# → Maintains Claude answer citations and search exposure
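Before deploying Scenario B, you can sanity-check the rules locally with Python's standard-library `urllib.robotparser`. This is a sketch only: it shows how Python's parser reads the file, which approximates but does not guarantee how Anthropic's crawlers interpret it. The helper `allowed_bots` and the URL `https://example.com/pricing` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

SCENARIO_B = """\
# ClaudeBot: block training data collection
User-agent: ClaudeBot
Disallow: /
"""


def allowed_bots(robots_txt, url, agents):
    """Map each agent to True/False depending on whether it may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_bots(
    SCENARIO_B,
    "https://example.com/pricing",  # hypothetical page
    ["ClaudeBot", "Claude-User", "Claude-SearchBot"],
)
print(result)
# {'ClaudeBot': False, 'Claude-User': True, 'Claude-SearchBot': True}
```

Agents with no matching group and no `User-agent: *` fallback are treated as allowed, which is why the two unlisted bots stay open.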
Scenario C. Full block
User-agent: ClaudeBot
Disallow: /
User-agent: Claude-User
Disallow: /
User-agent: Claude-SearchBot
Disallow: /
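If you maintain several sites, generating the file from a list of blocked bots helps keep the three scenarios consistent. The following is a minimal sketch; `render_robots` and the `SCENARIOS` mapping are hypothetical names, and the output covers only the Anthropic groups, so merge it with any other rules you already serve.

```python
# Scenario letters from this guide mapped to the bots each one blocks.
SCENARIOS = {
    "A": [],
    "B": ["ClaudeBot"],
    "C": ["ClaudeBot", "Claude-User", "Claude-SearchBot"],
}


def render_robots(blocked_agents):
    """Render one 'Disallow: /' group per blocked user agent."""
    if not blocked_agents:
        return ""  # Scenario A: nothing to add
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in blocked_agents]
    # Separate groups with a blank line and end the file with a newline.
    return "\n\n".join(groups) + "\n"


print(render_robots(SCENARIOS["C"]))
```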
Subdomain note
# robots.txt applies per host, so each subdomain needs its own file (Anthropic official documentation)
# e.g. blog.example.com/robots.txt must repeat the same rules
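Since each host is governed by its own file, it can help to compute which robots.txt applies to a given page. The sketch below derives that location from any page URL using the standard library; `robots_url` is a hypothetical helper and the URLs are examples.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL that governs `page_url` (same scheme and host)."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any port); drop path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/pricing"))
# https://example.com/robots.txt
print(robots_url("https://blog.example.com/posts/1?ref=home"))
# https://blog.example.com/robots.txt
```

The two results differ, so a rule added only at example.com/robots.txt does not cover blog.example.com.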
Recommended scenarios (SMB baseline)
General SMBs: Scenario B recommended. Blocking only ClaudeBot limits training data collection while keeping your pages eligible for citation in Claude answers.
Content asset businesses: Scenario C. This blocks training, answer citations, and search indexing, so your site stops appearing in Claude.
Maximum AI exposure strategy: Scenario A (full allow). Because Anthropic commits to honoring robots.txt, you can restrict access later if your policy changes.
Verification
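# Filter Anthropic bots in server logs (assumes the nginx "combined" log format)
# Prints the timestamp, request path, and full User-Agent string for each hit
grep -iE "ClaudeBot|Claude-User|Claude-SearchBot" /var/log/nginx/access.log \
| awk -F'"' '{split($1, a, " "); split($2, r, " "); print a[4], r[2], $6}' \
| tail -50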
# Check IP ranges (reference for bot verification only — IP blocking not recommended)
# curl https://claude.com/crawling/bots.json
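For a per-bot summary rather than raw lines, the same log can be tallied in Python. This is a sketch that assumes the nginx "combined" log format; `count_bot_hits` is a hypothetical helper, and the sample lines are made up for illustration with documentation-range IPs.

```python
import re
from collections import Counter

# Combined format: ip - user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
BOT_PATTERN = re.compile(r"Claude-SearchBot|Claude-User|ClaudeBot", re.IGNORECASE)


def count_bot_hits(lines):
    """Count requests per (bot token in lowercase, request path)."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip lines that are not in combined format
        bot = BOT_PATTERN.search(match.group("agent"))
        if bot:
            hits[(bot.group(0).lower(), match.group("path"))] += 1
    return hits


# Fabricated sample lines for illustration only.
SAMPLE = [
    '192.0.2.10 - - [01/Jun/2026:10:00:00 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.10 - - [01/Jun/2026:10:00:05 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '192.0.2.11 - - [01/Jun/2026:10:01:00 +0000] "GET /blog HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; Claude-User/1.0)"',
    '198.51.100.7 - - [01/Jun/2026:10:02:00 +0000] "GET / HTTP/1.1" 200 100 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
]

for (bot, path), count in count_bot_hits(SAMPLE).most_common():
    print(bot, path, count)
```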
⚠️ IP blocking not recommended: Anthropic's official documentation states that "blocking IP addresses may prevent Anthropic from reading robots.txt, so opt-out may not be correctly or consistently guaranteed." To block Anthropic bots, use robots.txt, which Anthropic officially recommends over IP blocking.
Frequently asked questions
Q. If I block ClaudeBot, will I stop appearing in Claude answers?
A. Not directly. Answer citations are handled primarily by Claude-User and Claude-SearchBot. Blocking ClaudeBot only stops future training data collection; Claude's real-time answer citation channels remain open.
Q. What is the relationship between Claude Citations API and ClaudeBot?
A. Anthropic's Citations API is a feature for developers to specify sources in answers via the Claude API. It is a separate system from ClaudeBot crawling. The Citations API works from documents developers provide and is not directly connected to ClaudeBot's web crawl data.
Q. Does Anthropic really honor robots.txt?
A. Anthropic's official documentation explicitly commits to honoring robots.txt and to not using CAPTCHA bypass technology. That is a stated policy rather than a guarantee: site operators have publicly complained about heavy ClaudeBot traffic, so confirm compliance in your own server logs instead of assuming it.
Q. Do I need separate settings for subdomains?
A. Yes. Anthropic's official documentation states that robots.txt on subdomains must be applied separately from the main domain. Settings for blog.example.com must be added separately at blog.example.com/robots.txt.
Q. Can I set Crawl-delay?
A. Yes. Anthropic supports the non-standard Crawl-delay directive. To reduce crawl frequency, add a line such as Crawl-delay: 10 (the value is in seconds) to the relevant User-agent group.
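You can confirm that the directive is attached to the intended group by reading it back with Python's standard-library `urllib.robotparser`. This is a sketch of how Python's parser reads the file, not a statement about Anthropic's own parser; `crawl_delay_for` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_WITH_DELAY = """\
User-agent: ClaudeBot
Crawl-delay: 10
"""


def crawl_delay_for(robots_txt, agent):
    """Return the Crawl-delay that applies to `agent`, or None if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


print(crawl_delay_for(ROBOTS_WITH_DELAY, "ClaudeBot"))    # 10
print(crawl_delay_for(ROBOTS_WITH_DELAY, "Claude-User"))  # None
```

The delay applies only to the group it appears in, so repeat the line under each User-agent you want to slow down.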
References
- Anthropic official crawler documentation: https://support.claude.com/en/articles/8896518-does-anthropic-crawl-data-from-the-web-and-how-can-site-owners-block-the-crawler (verified June 2026)
- Anthropic bot IP ranges: https://claude.com/crawling/bots.json
Pages that reference this page
- 📕 Checklist: AI Bot robots.txt Matrix — Comprehensive Comparison and Setup Guide
- 📘 Concept: Google-Extended Complete Guide — A Policy Token, Not a Bot
- 📘 Concept: Complete Guide to OpenAI Bots (GPTBot · ChatGPT-User · OAI-SearchBot · OAI-AdsBot)
- 📘 Concept: Complete Guide to Perplexity Bots (PerplexityBot · Perplexity-User)
- 📙 How-to: Claude Citation Optimization