Complete Guide to OpenAI Bots (GPTBot · ChatGPT-User · OAI-SearchBot · OAI-AdsBot)
What are OpenAI bots
OpenAI operates four separate crawlers, one per purpose, rather than a single bot. Training, citation, search, and ad verification are each handled by a bot with its own User-Agent. The key advantage of this structure is that you can block individual bots selectively in robots.txt.
TL;DR
You must distinguish GPTBot (training), ChatGPT-User (user browsing), OAI-SearchBot (ChatGPT Search index), and OAI-AdsBot (ad verification). If you want to block training but allow AI answer citations, the recommended setup is to block only GPTBot and allow the rest.
Bot identification information
The information below is from OpenAI's official documentation (developers.openai.com/api/docs/bots, verified June 2026).
| Bot Name | robots.txt Key | Primary Use | IP Range Published |
|---|---|---|---|
| GPTBot | GPTBot | AI model training data collection | openai.com/gptbot.json |
| ChatGPT-User | ChatGPT-User | When users use ChatGPT browsing features | openai.com/chatgpt-user.json |
| OAI-SearchBot | OAI-SearchBot | ChatGPT Search index building | openai.com/searchbot.json |
| OAI-AdsBot | OAI-AdsBot | ChatGPT ad safety verification (not used for training) | — |
User-Agent strings (official documentation)
# GPTBot
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.3; +https://openai.com/gptbot
# ChatGPT-User
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot
# OAI-SearchBot
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36; compatible; OAI-SearchBot/1.3; +https://openai.com/searchbot
# OAI-AdsBot
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot
⚠️ Warning: User-Agent version numbers (e.g., /1.3) may change. When filtering server logs, it is safer to match on the bot name alone (e.g., GPTBot) than on the full versioned string.
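The version-independent matching described in the warning can be sketched in Python as follows. This is a minimal illustration, not OpenAI tooling: `OPENAI_BOT_TOKENS` and `classify_openai_bot` are hypothetical names introduced here, and the token list is copied from the table above.

```python
import re
from typing import Optional

# Bot tokens from the identification table; names in this sketch are illustrative.
OPENAI_BOT_TOKENS = ("GPTBot", "ChatGPT-User", "OAI-SearchBot", "OAI-AdsBot")

# Match the token only (case-insensitive), so "/1.3" or any later version still matches.
_BOT_PATTERN = re.compile(
    "|".join(re.escape(token) for token in OPENAI_BOT_TOKENS),
    re.IGNORECASE,
)


def classify_openai_bot(user_agent: str) -> Optional[str]:
    """Return the canonical bot token found in a User-Agent string, or None."""
    match = _BOT_PATTERN.search(user_agent)
    if match is None:
        return None
    found = match.group(0).lower()
    # Map the matched text back to its canonical capitalisation.
    for token in OPENAI_BOT_TOKENS:
        if token.lower() == found:
            return token
    return None
```

A User-Agent header can be spoofed, so a match here only indicates that a request claims to be an OpenAI bot; it does not prove the request came from OpenAI.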
How each bot works
GPTBot — for training
GPTBot collects web data for OpenAI's AI model training. Collected content may be used for pre-training or fine-tuning of GPT-family models. Blocking via robots.txt stops future training data collection but does not affect data already collected.
ChatGPT-User — user browsing
ChatGPT-User fetches a page when a user enters its URL or uses a browsing feature in ChatGPT. OpenAI's official documentation states that "because this bot's visits are user-initiated, robots.txt rules may not apply." These fetches are directly connected to ChatGPT answer citations.
OAI-SearchBot — ChatGPT Search index
OAI-SearchBot builds the search index for ChatGPT's web search feature (ChatGPT Search). It controls whether your site is included in OpenAI's own index, separate from the Bing index.
OAI-AdsBot — ad verification
OAI-AdsBot verifies the safety of pages that advertisers register as ChatGPT ads. Collected data is not used for model training.
Three robots.txt examples
Scenario A. Full allow (default — no action needed)
# No separate configuration required. With no matching rule in robots.txt, all OpenAI bots are allowed by default.
Scenario B. Block training only, allow answer citations and search (recommended for SMBs)
# GPTBot: block training data collection
User-agent: GPTBot
Disallow: /
# ChatGPT-User, OAI-SearchBot, OAI-AdsBot are allowed (default)
# → Maintains ChatGPT answer citations and ChatGPT Search exposure
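To check that a rule set behaves as intended before deploying it, the Scenario B rules can be run through Python's standard `urllib.robotparser`. This is a rough sketch: `allowed_bots` and the `https://example.com/menu` URL are hypothetical, and Python's parser is only an approximation of how each crawler interprets robots.txt.

```python
from typing import Dict, List
from urllib.robotparser import RobotFileParser

# Scenario B rules as a string; in practice these live in /robots.txt.
SCENARIO_B = """\
User-agent: GPTBot
Disallow: /
"""


def allowed_bots(robots_txt: str, url: str, bots: List[str]) -> Dict[str, bool]:
    """Map each bot token to whether robots_txt permits it to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


result = allowed_bots(
    SCENARIO_B,
    "https://example.com/menu",  # placeholder URL for illustration
    ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "OAI-AdsBot"],
)
print(result)
```

Pass the bot token (optionally with a version, such as `GPTBot/1.3`) to `can_fetch`, not the full User-Agent header: Python's parser only looks at the text before the first `/`, so a full header starting with `Mozilla/5.0` would not match the `GPTBot` rule.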
Scenario C. Full block
# Block all OpenAI bots
# ChatGPT answer citations and ChatGPT Search exposure may also disappear
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: OAI-AdsBot
Disallow: /
Recommended scenarios (SMB baseline)
General SMBs (cafes, clinics, agencies, etc.): Scenario B recommended. Minimize training data provision while maintaining opportunities for exposure in ChatGPT answers.
Content asset businesses (media, education, publishing): Choose Scenario C if sensitive to unauthorized content training. Note that exposure in ChatGPT answers and ChatGPT Search may also disappear.
Maximum AI exposure strategy: Scenario A (full allow). Providing content as AI training data may increase the long-term likelihood of being cited as an authoritative source in AI answers, although OpenAI does not guarantee this.
Verification — checking bot traffic in server logs
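# Filter OpenAI bots in Nginx access.log (assumes the default "combined" log format)
# Splitting on double quotes keeps the whole User-Agent together as field 6;
# whitespace splitting would print only its first token ("Mozilla/5.0).
grep -iE "GPTBot|ChatGPT-User|OAI-SearchBot|OAI-AdsBot" /var/log/nginx/access.log \
| awk -F'"' '{split($1, a, " "); split($2, r, " "); print a[4], r[2], $6}' \
| tail -50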
# Check bot IP ranges (published JSON files)
# curl https://openai.com/gptbot.json
# curl https://openai.com/chatgpt-user.json
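Because a User-Agent header can be forged, a log entry is more trustworthy when its source IP also falls inside the published ranges. The sketch below shows one way to run that check in Python. It assumes the JSON files contain a `prefixes` list whose entries carry an `ipv4Prefix` CIDR string; verify that shape against the actual file before relying on it. The sample data uses documentation-only addresses (RFC 5737), not real OpenAI ranges, and `load_networks` and `ip_in_ranges` are hypothetical helper names.

```python
import ipaddress
import json
from typing import List

# Assumed file shape: {"prefixes": [{"ipv4Prefix": "a.b.c.d/nn"}, ...]}.
# These are RFC 5737 documentation ranges, NOT real OpenAI addresses.
SAMPLE_RANGES_JSON = """
{"prefixes": [{"ipv4Prefix": "192.0.2.0/24"}, {"ipv4Prefix": "198.51.100.0/28"}]}
"""


def load_networks(ranges_json: str) -> List[ipaddress.IPv4Network]:
    """Parse the (assumed) JSON shape into a list of network objects."""
    data = json.loads(ranges_json)
    return [
        ipaddress.ip_network(entry["ipv4Prefix"])
        for entry in data.get("prefixes", [])
        if "ipv4Prefix" in entry
    ]


def ip_in_ranges(ip: str, networks: List[ipaddress.IPv4Network]) -> bool:
    """Return True if ip falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


networks = load_networks(SAMPLE_RANGES_JSON)
print(ip_in_ranges("192.0.2.77", networks))
print(ip_in_ranges("203.0.113.5", networks))
```

In practice the JSON string would come from the downloaded file (for example, the output of the `curl` commands above) rather than from an inline sample.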
Frequently asked questions
Q. If I block GPTBot, will I stop appearing in ChatGPT answers?
A. Not necessarily. ChatGPT answer citations are primarily handled by ChatGPT-User and OAI-SearchBot. Blocking GPTBot only stops training data provision; answer citation channels remain open. To block answer citations as well, you must also block ChatGPT-User and OAI-SearchBot.
Q. How long after changing robots.txt does it take effect?
A. GPTBot typically recognizes updated robots.txt within days to weeks. ChatGPT-User may apply immediately since it operates in real time on user requests. OpenAI does not officially specify exact timing.
Q. Is it true that ChatGPT-User ignores robots.txt?
A. OpenAI's official documentation states that "because ChatGPT-User visits are user-initiated, robots.txt rules may not apply." In other words, full blocking via robots.txt is not guaranteed.
Q. If the version number in the User-Agent string changes, will blocking stop working?
A. robots.txt matches on the bot name (GPTBot, ChatGPT-User, etc.), not the full User-Agent string. Blocking remains in effect as long as the bot name is the same, even if the version number changes.
Q. What's the difference between IP range blocking and robots.txt blocking?
A. robots.txt blocking is a "policy notice," and whether bots respect it depends on operator policy. IP range blocking physically rejects requests at the server level. IP blocking is stronger but requires maintenance when OpenAI changes IP ranges. Using both methods together is most reliable.
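The maintenance burden of IP blocking can be reduced by generating the server rules from a list of CIDR prefixes instead of editing them by hand. The following Python sketch turns prefixes into Nginx `deny` directives. The helper name `nginx_deny_rules` is hypothetical, and the prefixes in the usage example are documentation-only ranges (RFC 5737), not OpenAI's published ranges.

```python
import ipaddress
from typing import Iterable, List


def nginx_deny_rules(cidrs: Iterable[str]) -> List[str]:
    """Turn CIDR strings into Nginx `deny` directives.

    Each prefix is validated and normalised, and duplicates are skipped.
    An invalid prefix raises ValueError.
    """
    seen = set()
    rules = []
    for cidr in cidrs:
        network = ipaddress.ip_network(cidr.strip(), strict=False)
        if network in seen:
            continue
        seen.add(network)
        rules.append(f"deny {network};")
    return rules


# Placeholder prefixes for illustration only.
for rule in nginx_deny_rules(["192.0.2.0/24", "198.51.100.0/28"]):
    print(rule)
```

The generated lines could be written to a file that the Nginx configuration pulls in with `include`, and regenerated whenever the published ranges change.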
References
- OpenAI official bot documentation: https://developers.openai.com/api/docs/bots (verified June 2026)
- GPTBot IP range: https://openai.com/gptbot.json
- ChatGPT-User IP range: https://openai.com/chatgpt-user.json
- OAI-SearchBot IP range: https://openai.com/searchbot.json