Google-Extended Complete Guide — A Policy Token, Not a Bot

What Is Google-Extended?

Google-Extended is often misunderstood. It is not a bot: it has no User-Agent string of its own and no separate crawler. Google-Extended is a policy token (control token) declared in robots.txt that controls only whether content already collected by Googlebot may be used for Gemini model training and Vertex AI grounding.

This distinction matters: blocking Google-Extended does not stop Googlebot from crawling. It also does not affect Google Search rankings or visibility.

TL;DR

Google-Extended = a robots.txt token that controls consent for Gemini model training + Vertex AI grounding use. Because it is not a bot, blocking by IP is meaningless. Blocking it does not affect Google Search inclusion or ranking, and there is currently no evidence that it affects AI Overviews exposure.

Bot vs Token — The Decisive Difference

Item	Standard AI bot (GPTBot, etc.)	Google-Extended
Type	Independent crawler	robots.txt policy token
Own User-Agent	✅ Yes	❌ No (reuses Googlebot UA)
IP range	Published per bot	Same as Googlebot IP range
Blocking method	robots.txt or IP block	robots.txt token setting only
Blocking effect	Blocks crawling itself	Crawling continues; only specific use is restricted
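
The "Own User-Agent" row can be illustrated with a small log scan. This is a minimal sketch: the user-agent strings are illustrative samples rather than real traffic, and `count_tokens` is a hypothetical helper name. It counts how many requests mention each token in the User-Agent header.

```python
# Illustrative user-agent strings (sample values, not real log data).
SAMPLE_USER_AGENTS = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; "
    "GPTBot/1.0; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]

# Tokens to look for in the User-Agent header.
TOKENS = ["Googlebot", "GPTBot", "Google-Extended"]


def count_tokens(user_agents, tokens=TOKENS):
    """Count how many user-agent strings mention each token (case-insensitive)."""
    counts = {token: 0 for token in tokens}
    for ua in user_agents:
        lowered = ua.lower()
        for token in tokens:
            if token.lower() in lowered:
                counts[token] += 1
    return counts


if __name__ == "__main__":
    print(count_tokens(SAMPLE_USER_AGENTS))
```

In this sample, Googlebot and GPTBot each announce themselves, while the Google-Extended count stays at zero. That matches the table: the token has no request header of its own, so a log filter has nothing to match.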

What Google-Extended Controls

According to Google's official documentation (developers.google.com/search/docs/crawling-indexing/google-common-crawlers, updated April 2026), Google-Extended controls two uses:

Gemini model training: Using site content collected by Google as training data for Gemini-family AI models
Vertex AI grounding: Using site content as grounding evidence in Vertex AI–based services

Google's official documentation explicitly states:

"Google-Extended does not affect whether a site is included in Google Search, and it is not used as a ranking signal for Google Search."

Google Crawler Role Separation

Google operates multiple crawlers, each with a different purpose.

Crawler / Token	Type	Primary use
Googlebot	Crawler	Google Search, Discover, Images, News, etc.
Googlebot-Image	Crawler	Google Image Search
Google-CloudVertexBot	Crawler	Vertex AI Agents crawling
Google-Extended	Token	Consent for Gemini training · Vertex AI grounding

Even if you block Google-Extended, other crawlers such as Googlebot and Googlebot-Image continue to operate normally.

AI Overviews and Google-Extended

Many webmasters ask, "If I block Google-Extended, will I disappear from AI Overviews?" Google's official documentation does not provide an explicit answer. However, AI Overviews runs on the Google Search index, which Googlebot maintains. Because the Google-Extended token does not affect Googlebot crawling, it is difficult to argue that it directly affects AI Overviews exposure.

⚠️ Note: Google has not officially clarified the relationship between Google-Extended and AI Overviews. The above is a reasonable interpretation within the scope of official documentation.

Three robots.txt Examples

Scenario A. Full allow (default — no action needed)

# No separate configuration required.
# Googlebot crawls as usual and data may also be used for Gemini training.

Scenario B. Block Gemini training only (keep Google Search · AI Overviews exposure)

# Block Gemini training and Vertex AI grounding
User-agent: Google-Extended
Disallow: /

# Googlebot continues crawling → search visibility maintained

Scenario C. Block Gemini training for specific paths only

# Exclude /private/ from Gemini training only
User-agent: Google-Extended
Disallow: /private/

# Remaining paths remain allowed for Gemini training
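
The three scenarios can be checked locally with Python's standard `urllib.robotparser`. This is a rough sketch: `is_allowed`, `example.com`, and the sample paths are placeholders, and the standard-library parser only approximates Google's matching rules (for example, it has no wildcard support). It models which token a rule applies to, not what Google does with the content afterwards.

```python
from urllib.robotparser import RobotFileParser

# robots.txt bodies for the three scenarios (A has no rules at all).
SCENARIOS = {
    "A": "",
    "B": "User-agent: Google-Extended\nDisallow: /\n",
    "C": "User-agent: Google-Extended\nDisallow: /private/\n",
}


def is_allowed(robots_txt, token, url):
    """Return True if `token` is permitted for `url` under the robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(token, url)


if __name__ == "__main__":
    # example.com and the paths below are placeholders.
    for name, body in SCENARIOS.items():
        for token in ("Googlebot", "Google-Extended"):
            for path in ("/blog/post", "/private/report"):
                url = "https://example.com" + path
                print(name, token, path, is_allowed(body, token, url))
```

In every scenario the Googlebot rows stay allowed, because a group addressed to Google-Extended does not apply to the Googlebot token.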

Recommended Scenarios

Most small and medium businesses: Scenario B is recommended. It opts your content out of Gemini training and Vertex AI grounding without affecting Google Search visibility.

When Google Search visibility is the top priority: Scenarios A and B have the same effect on search. Google-Extended does not affect search rankings.

Strategic AI training contribution: Scenario A. Some industry observers speculate that contributing high-quality content to Gemini training may, over time, help a site be recognized as an authoritative source in AI answers. Google has not confirmed any such effect.

Verification Methods

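Because Google-Extended has no separate User-Agent, it cannot be identified directly in server logs. Instead, confirm that Google has fetched and parsed your current robots.txt using the robots.txt report in Google Search Console. The report shows the file Google last fetched and any parsing issues; it does not report on Gemini training use itself.

# Google Search Console → Settings → robots.txt report → check the fetched file and its rules
# (The standalone robots.txt Tester has been retired in favor of this report.)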
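
Since the token leaves no trace in logs, a local check of the deployed file is one way to confirm the rule is actually present. The sketch below is an assumption-laden helper (`find_token_rules` is a hypothetical name) that lists the Allow/Disallow rules in groups that explicitly name the token. It ignores `User-agent: *` groups and says nothing about how Google interprets the file.

```python
def find_token_rules(robots_txt, token="Google-Extended"):
    """Return (directive, path) pairs from groups that explicitly name `token`."""
    rules = []
    group_agents = []       # user-agents of the group currently being read
    reading_agents = False  # True while consecutive User-agent lines continue
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not reading_agents:
                group_agents = []  # a new group starts here
                reading_agents = True
            group_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            reading_agents = False
            if token.lower() in group_agents:
                rules.append((field, value))
    return rules


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "\n"
        "# AI training opt-out\n"
        "User-agent: Google-Extended\n"
        "Disallow: /\n"
    )
    print(find_token_rules(sample))
```

To use it, pass in the text of your own site's `/robots.txt`. An empty result means no group names the token explicitly, which corresponds to Scenario A unless a wildcard group applies.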

Frequently Asked Questions

Q. Can I block Google-Extended by IP?
A. No. Google-Extended is a policy token, not a separate bot, so it has no IP range of its own. Blocking Googlebot's IP ranges stops all Googlebot crawling, which will eventually drop your site from Google Search. There is no way to block only Google-Extended by IP.

Q. Does blocking Google-Extended affect AI Overviews exposure?
A. Google officially states it "does not affect inclusion in Google Search or rankings." Because AI Overviews is index-based, there is currently no evidence that blocking Google-Extended directly affects AI Overviews exposure.

Q. Are Google-Extended and Applebot-Extended the same mechanism?
A. Structurally, yes. Both are robots.txt policy tokens, not crawlers, that control whether the respective company may use collected data for AI model training. The configuration pattern is the same.
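
Because the configuration pattern is shared, the groups for both tokens can be generated the same way. A minimal sketch, assuming a hypothetical helper `render_opt_out` and a site-wide opt-out by default:

```python
def render_opt_out(tokens, paths=("/",)):
    """Build robots.txt text with one Disallow group per policy token."""
    groups = []
    for token in tokens:
        lines = [f"User-agent: {token}"]
        lines += [f"Disallow: {path}" for path in paths]
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n"


if __name__ == "__main__":
    print(render_opt_out(["Google-Extended", "Applebot-Extended"]))
```

Passing `paths=("/private/",)` instead yields the path-scoped variant from Scenario C for each token.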

Q. If I block Google-Extended, will I disappear from Gemini (formerly Bard) answers?
A. Google's official documentation does not state this explicitly. Gemini features that search the web in real time draw on the Google Search index maintained by Googlebot, but because the token also covers grounding use, the effect on Gemini answers should be treated as unconfirmed.

References

Google official Common Crawlers documentation: https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers (updated April 2026, verified June 2026)