
Google-Extended Complete Guide — A Policy Token, Not a Bot


What Is Google-Extended?

Google-Extended is a concept many people misunderstand. It is not a bot. It has no independent User-Agent string and no separate crawler. Google-Extended is a policy token (control token) declared in robots.txt that controls only whether content already collected by Googlebot may be used for training Google's AI products.

This distinction matters: blocking Google-Extended does not stop Googlebot from crawling. It also does not affect Google Search rankings or visibility.


TL;DR

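Google-Extended = a robots.txt token that controls consent for Gemini model training + Vertex AI grounding use. Because it is not a bot, blocking by IP is meaningless. Blocking it does not affect Google Search inclusion or rankings. Google has not officially addressed AI Overviews, but because AI Overviews draws on the Search index, a direct effect is unlikely.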


Bot vs Token — The Decisive Difference

| Item | Standard AI bot (GPTBot, etc.) | Google-Extended |
|---|---|---|
| Type | Independent crawler | robots.txt policy token |
| Own User-Agent | ✅ Yes | ❌ No (reuses Googlebot UA) |
| IP range | Published per bot | Same as Googlebot IP range |
| Blocking method | robots.txt or IP block | robots.txt token setting only |
| Blocking effect | Blocks crawling itself | Crawling continues; only specific use is restricted |

What Google-Extended Controls

According to Google's official documentation (developers.google.com/search/docs/crawling-indexing/google-common-crawlers, updated April 2026), Google-Extended controls two uses:

  1. Gemini model training: Using site content collected by Google as training data for Gemini-family AI models
  2. Vertex AI grounding: Using site content as grounding evidence in Vertex AI–based services

Google's official documentation explicitly states:

"Google-Extended does not affect whether a site is included in Google Search, and it is not used as a ranking signal for Google Search."


Google Crawler Role Separation

Google operates multiple crawlers, each with a different purpose.

| Crawler / Token | Type | Primary use |
|---|---|---|
| Googlebot | Crawler | Google Search, Discover, Images, News, etc. |
| Googlebot-Image | Crawler | Google Image Search |
| Google-CloudVertexBot | Crawler | Vertex AI Agents crawling |
| Google-Extended | Token | Consent for Gemini training · Vertex AI grounding |

Even if you block Google-Extended, other crawlers such as Googlebot and Googlebot-Image continue to operate normally.


AI Overviews and Google-Extended

Many webmasters ask, "If I block Google-Extended, will I disappear from AI Overviews?" Google's official documentation does not provide an explicit answer. However, AI Overviews runs on the Google Search index, which Googlebot maintains. Because the Google-Extended token does not affect Googlebot crawling, it is difficult to argue that it directly affects AI Overviews exposure.

⚠️ Note Google has not officially clarified the relationship between Google-Extended and AI Overviews. The above is a reasonable interpretation within the scope of official documentation.


Three robots.txt Examples

Scenario A. Full allow (default — no action needed)

# No separate configuration required.
# Googlebot crawls as usual and data may also be used for Gemini training.

Scenario B. Block Gemini training and Vertex AI grounding (Google Search visibility maintained)

# Block Gemini training and Vertex AI grounding
User-agent: Google-Extended
Disallow: /

# Googlebot continues crawling → search visibility maintained

Scenario C. Block Gemini training for specific paths only

# Exclude /private/ from Gemini training only
User-agent: Google-Extended
Disallow: /private/

# Remaining paths remain allowed for Gemini training
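
The effect of Scenarios B and C can be checked offline with a minimal sketch like the one below. It uses Python's standard `urllib.robotparser`, which only approximates how Google interprets robots.txt, so treat the output as a sanity check rather than a statement of Google's behavior. The helper name `allowed` and the `example.com` URLs are hypothetical. For the Google-Extended token, "allowed" should be read as "may be used for Gemini training and grounding", not "may be crawled".

```python
from urllib.robotparser import RobotFileParser

# Rules copied from Scenarios B and C above.
SCENARIO_B = """\
User-agent: Google-Extended
Disallow: /
"""

SCENARIO_C = """\
User-agent: Google-Extended
Disallow: /private/
"""

# Hypothetical URLs used only for illustration.
URL_PUBLIC = "https://example.com/blog/post"
URL_PRIVATE = "https://example.com/private/report"


def allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given rules permit `agent` on `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


for name, rules in (("B", SCENARIO_B), ("C", SCENARIO_C)):
    for agent in ("Googlebot", "Google-Extended"):
        for url in (URL_PUBLIC, URL_PRIVATE):
            print(f"Scenario {name}: {agent:16} {url} -> {allowed(rules, agent, url)}")
```

In both scenarios the Googlebot rows stay permitted, because neither file contains a group that applies to Googlebot. Only the Google-Extended rows change.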

Recommended Scenarios

Most small and medium businesses: Scenario B is recommended. It opts your content out of Gemini training without affecting Google Search visibility.

When Google Search visibility is the top priority: Scenarios A and B have the same effect on search. Google-Extended does not affect search rankings.

Strategic AI training contribution: Scenario A. Industry observers suggest that contributing high-quality content to Gemini training may, over time, help a site be recognized as an authoritative source in AI answers.


Verification Methods

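Because Google-Extended has no separate User-Agent, it cannot be identified directly in server logs. What you can confirm is that the rule is published and parsed correctly. Google retired its standalone robots.txt testing tool in 2023, so use the robots.txt report in Google Search Console, which shows the file Google last fetched along with any parse warnings.

# Google Search Console → Settings → robots.txt report → confirm the fetched file contains the Google-Extended group
# Or open /robots.txt on your own domain and confirm the rule is present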
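
As a complement to the manual check, the sketch below inspects a robots.txt file for an explicit Google-Extended group and reports which probe paths the token is denied. The function names (`fetch_robots_txt`, `has_explicit_group`, `token_status`) are hypothetical helpers, not part of any Google tool, and the rule evaluation relies on Python's `urllib.robotparser`, which may differ from Google's own parser in edge cases.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser


def fetch_robots_txt(site: str, timeout: float = 10.0) -> str:
    """Download robots.txt from a site root such as 'https://example.com'."""
    with urlopen(f"{site.rstrip('/')}/robots.txt", timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def has_explicit_group(robots_txt: str, token: str) -> bool:
    """Return True if any 'User-agent:' line names the token exactly."""
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip().lower() == "user-agent" and value.strip().lower() == token.lower():
            return True
    return False


def token_status(robots_txt: str, probe_paths: list[str]) -> dict[str, bool]:
    """Map each probe path to whether Google-Extended is permitted on it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {path: parser.can_fetch("Google-Extended", path) for path in probe_paths}


# Example usage (performs a network request; the domain is a placeholder):
# text = fetch_robots_txt("https://example.com")
# print(has_explicit_group(text, "Google-Extended"))
# print(token_status(text, ["/", "/private/"]))
```

Checking for an explicit group matters because a file with no Google-Extended group falls back to whatever the `User-agent: *` group says, which may or may not be what you intended.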

Frequently Asked Questions

Q. Can I block Google-Extended by IP?
A. No. Google-Extended is a policy token, not a separate bot, so it has no IP range of its own. Blocking Googlebot IPs stops all Googlebot crawling, which will eventually drop your site from Google Search. There is no way to selectively block Google-Extended by IP.

Q. Does blocking Google-Extended affect AI Overviews exposure?
A. Google officially states it "does not affect inclusion in Google Search or rankings." Because AI Overviews is index-based, there is currently no evidence that blocking Google-Extended directly affects AI Overviews exposure.

Q. Are Google-Extended and Applebot-Extended the same mechanism?
A. Structurally, yes. Both tokens are robots.txt policy tokens—not crawlers—that control whether each company's AI models may use collected data for training. The configuration pattern is the same.
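
Since the configuration pattern is shared, the rules for several policy tokens can be generated from one list. The following is a minimal sketch. `policy_token_rules` is a hypothetical helper, and the output still needs to be merged by hand into your existing robots.txt.

```python
def policy_token_rules(tokens: list[str], disallow_paths: list[str]) -> str:
    """Build one robots.txt group per policy token with the same Disallow rules."""
    groups = []
    for token in tokens:
        lines = [f"User-agent: {token}"]
        lines += [f"Disallow: {path}" for path in disallow_paths]
        groups.append("\n".join(lines))
    # Separate groups with a blank line and end the file with a newline.
    return "\n\n".join(groups) + "\n"


print(policy_token_rules(["Google-Extended", "Applebot-Extended"], ["/"]))
```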

Q. If I block Google-Extended, will I disappear from Gemini (formerly Bard) answers?
A. Google's official documentation does not explicitly state this. This guide's reading is that Gemini's real-time web search features draw on the index Googlebot maintains rather than on this token, but Google has not confirmed that, so treat it as an interpretation.

