Google-Extended Complete Guide — A Policy Token, Not a Bot
What Is Google-Extended?
Google-Extended is widely misunderstood. It is not a bot: it has no User-Agent string of its own and no separate crawler. It is a policy token (control token) declared in robots.txt that controls only whether content already collected by Googlebot may be used for Gemini model training and Vertex AI grounding.
This distinction matters: blocking Google-Extended does not stop Googlebot from crawling. It also does not affect Google Search rankings or visibility.
TL;DR
Google-Extended = a robots.txt token that controls consent for Gemini model training and Vertex AI grounding. Because it is not a bot, blocking it by IP is meaningless. Blocking it does not affect Google Search, and there is no evidence that it affects AI Overviews exposure.
Bot vs Token — The Decisive Difference
| Item | Standard AI bot (GPTBot, etc.) | Google-Extended |
|---|---|---|
| Type | Independent crawler | robots.txt policy token |
| Own User-Agent | ✅ Yes | ❌ No (reuses Googlebot UA) |
| IP range | Published per bot | Same as Googlebot IP range |
| Blocking method | robots.txt or IP block | robots.txt token setting only |
| Blocking effect | Blocks crawling itself | Crawling continues; only specific use is restricted |
What Google-Extended Controls
According to Google's official documentation (developers.google.com/search/docs/crawling-indexing/google-common-crawlers, updated April 2026), Google-Extended controls two uses:
- Gemini model training: Using site content collected by Google as training data for Gemini-family AI models
- Vertex AI grounding: Using site content as grounding evidence in Vertex AI–based services
Google's official documentation explicitly states:
"Google-Extended does not affect whether a site is included in Google Search, and it is not used as a ranking signal for Google Search."
Google Crawler Role Separation
Google operates multiple crawlers, each with a different purpose.
| Crawler / Token | Type | Primary use |
|---|---|---|
| Googlebot | Crawler | Google Search, Discover, Images, News, etc. |
| Googlebot-Image | Crawler | Google Image Search |
| Google-CloudVertexBot | Crawler | Vertex AI Agents crawling |
| Google-Extended | Token | Consent for Gemini training and Vertex AI grounding |
Even if you block Google-Extended, other crawlers such as Googlebot and Googlebot-Image continue to operate normally.
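This separation can be checked locally. The following is a minimal sketch using Python's standard `urllib.robotparser`. It only evaluates the robots.txt rules the way a generic parser would and makes no claim about how Google applies the token internally. The `ROBOTS_TXT` content, the `allowed_by_token` helper, and the `example.com` URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt (an assumption for this sketch):
# it opts out of Google-Extended and says nothing about other tokens.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def allowed_by_token(robots_txt: str, url: str, tokens: list[str]) -> dict[str, bool]:
    """Return, for each user-agent token, whether the rules allow the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {token: parser.can_fetch(token, url) for token in tokens}


result = allowed_by_token(
    ROBOTS_TXT,
    "https://example.com/article",
    ["Googlebot", "Googlebot-Image", "Google-Extended"],
)
for token, allowed in result.items():
    print(f"{token}: {'allowed' if allowed else 'disallowed'}")
```

In this sketch only the Google-Extended token is disallowed. The crawler tokens have no matching group, so the parser treats them as allowed.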
AI Overviews and Google-Extended
Many webmasters ask, "If I block Google-Extended, will I disappear from AI Overviews?" Google's official documentation does not provide an explicit answer. However, AI Overviews runs on the Google Search index, which Googlebot maintains. Because the Google-Extended token does not affect Googlebot crawling, it is difficult to argue that it directly affects AI Overviews exposure.
⚠️ Note: Google has not officially clarified the relationship between Google-Extended and AI Overviews. The above is a reasonable interpretation within the scope of official documentation.
Three robots.txt Examples
Scenario A. Full allow (default — no action needed)
# No separate configuration required.
# Googlebot crawls as usual and data may also be used for Gemini training.
Scenario B. Block Gemini training only (keep Google Search and AI Overviews exposure)
# Block Gemini training and Vertex AI grounding
User-agent: Google-Extended
Disallow: /
# Googlebot continues crawling → search visibility maintained
Scenario C. Block Gemini training for specific paths only
# Exclude /private/ from Gemini training only
User-agent: Google-Extended
Disallow: /private/
# All other paths stay allowed for Gemini training
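The path-level behaviour of Scenario C can be sketched with Python's standard `urllib.robotparser`. This is a local approximation of prefix matching, not Google's own evaluator. The `SCENARIO_C` string mirrors the rules above, while the `training_opt_out` helper and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Mirrors the Scenario C rules; the sample paths below are assumptions.
SCENARIO_C = """\
User-agent: Google-Extended
Disallow: /private/
"""


def training_opt_out(robots_txt: str, path: str) -> bool:
    """Return True if the rules disallow the Google-Extended token for this path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Google-Extended", path)


for path in ("/private/report.html", "/blog/post.html", "/"):
    status = "opted out" if training_opt_out(SCENARIO_C, path) else "allowed"
    print(f"{path}: {status}")
```

Because the rule is matched as a path prefix in this parser, `/private` without the trailing slash is not covered by `Disallow: /private/`.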
Recommended Scenarios
Most small and medium businesses: Scenario B is recommended. It opts your content out of Gemini training and Vertex AI grounding without affecting Google Search visibility.
When Google Search visibility is the top priority: Scenarios A and B have the same effect on search. Google-Extended does not affect search rankings.
Strategic AI training contribution: Scenario A. Some industry observers suggest that contributing high-quality content to Gemini training may, over time, help a site be recognized as an authoritative source in AI answers, although Google has not confirmed such an effect.
Verification Methods
Because Google-Extended has no User-Agent of its own, it never appears in server logs, so log analysis cannot confirm the setting. Instead, confirm that the rule is in place using Google's robots.txt testing tool or Google Search Console.
# Google Search Console → URL Inspection → check robots.txt rules
# Or use the robots.txt testing tool in Google Search Central
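The log limitation can be illustrated with a short sketch that counts how often each token appears in access-log lines. The log lines are hypothetical: the IP addresses come from documentation-only ranges, and the User-Agent strings are examples of the general shape such crawlers send, not an authoritative list. The `count_tokens` helper is likewise an assumption.

```python
from collections import Counter

# Example User-Agent strings (illustrative; real strings may differ by version).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
GPTBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
    "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
)

# Hypothetical access-log lines using documentation-only IP ranges.
SAMPLE_LOG = [
    f'192.0.2.10 - - [01/Jan/2026:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "{GOOGLEBOT_UA}"',
    f'198.51.100.7 - - [01/Jan/2026:00:00:02 +0000] "GET /blog/ HTTP/1.1" 200 2048 "-" "{GPTBOT_UA}"',
    f'192.0.2.10 - - [01/Jan/2026:00:00:03 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "{GOOGLEBOT_UA}"',
]


def count_tokens(lines, tokens=("Googlebot", "GPTBot", "Google-Extended")):
    """Count the log lines that mention each token (case-insensitive)."""
    counts = Counter({token: 0 for token in tokens})
    for line in lines:
        lowered = line.lower()
        for token in tokens:
            if token.lower() in lowered:
                counts[token] += 1
    return counts


counts = count_tokens(SAMPLE_LOG)
for token, hits in counts.items():
    print(f"{token}: {hits}")
```

In this sample, Googlebot and GPTBot requests are visible, while Google-Extended has no hits because no request carries that name. Note that the simple substring match also counts variants such as Googlebot-Image under Googlebot.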
Frequently Asked Questions
Q. Can I block Google-Extended by IP?
A. No. Google-Extended is a policy token, not a separate bot, so it has no IP addresses of its own. Blocking Googlebot IPs stops all Googlebot crawling and can remove your site from Google Search. There is no way to block Google-Extended selectively by IP.
Q. Does blocking Google-Extended affect AI Overviews exposure?
A. Google officially states it "does not affect inclusion in Google Search or rankings." Because AI Overviews is index-based, there is currently no evidence that blocking Google-Extended directly affects AI Overviews exposure.
Q. Are Google-Extended and Applebot-Extended the same mechanism?
A. Structurally, yes. Both are robots.txt policy tokens, not crawlers, that control whether each company may use collected data to train its AI models. The configuration pattern is the same.
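Since the pattern is the same, the opt-out groups for both tokens can be generated from one template. The sketch below is a hypothetical helper (`build_opt_out` is not part of any library) that emits one robots.txt group per token.

```python
def build_opt_out(tokens, disallow_path="/"):
    """Build one robots.txt group per token, each disallowing the given path."""
    groups = [f"User-agent: {token}\nDisallow: {disallow_path}" for token in tokens]
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(groups) + "\n"


robots_txt = build_opt_out(["Google-Extended", "Applebot-Extended"])
print(robots_txt)
```

The output is meant to be appended to an existing robots.txt. Review it against any rules already in the file before deploying.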
Q. If I block Google-Extended, will I disappear from Gemini (formerly Bard) answers?
A. Google's official documentation does not say so explicitly. Gemini's real-time web search features appear to rely on the Google Search index maintained by Googlebot, which this token does not control.
References
- Google official Common Crawlers documentation: https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers (updated April 2026, verified June 2026)