Noindex

Definition

noindex is an indexing control directive that instructs search engines not to add a web page to the search index. It is delivered through a robots meta tag or the X-Robots-Tag HTTP response header.

noindex separates crawling from indexing. Bots still visit the page (crawling) and read the noindex directive, and the search engine then leaves the page out of its search results. Blocking access entirely with robots.txt is different: the bot never fetches the page, so it never sees the noindex directive. See Crawling vs Indexing for details.

Summary

noindex essentials: ① Insert <meta name="robots" content="noindex"/> in <head> → ② Bots still crawl the page but exclude it from the index → ③ If robots.txt blocks the page, bots cannot read noindex, so it has no effect → ④ Suitable targets: thank-you pages, login, internal search results, parameter pages → ⑤ For permanent page deletion, a 410 response code is more reliable.

7 Suitable Targets for noindex

1. Thank-You and Confirmation Pages

Post-transaction pages such as payment confirmation and form submission completion. Showing them in search results gives searchers a meaningless landing experience and wastes crawl budget.

2. Login and Registration Pages

Login and registration pages for authentication-required services have no value to unauthenticated visitors. noindex focuses crawl budget on actual content pages.

3. Internal Search Result Pages

On-site search result pages in the form /search?q=keyword. Infinite URL combinations exhaust crawl budget, and Google may evaluate such pages as low-quality auto-generated pages.

4. URL Parameter Duplicate Pages

Parameter variant URLs that keep being crawled and indexed despite canonical tags can be switched to noindex. Avoid leaving both signals on the same URL, since canonical and noindex send conflicting instructions. See URL Parameters for details.

5. Thin Content Pages

Hundreds of thin listing pages created by category filters, tag archive pages, and similar templates. Applying noindex to pages with no unique value improves the overall quality of the site's indexed pages. See Content Pruning for details.

6. Staging and Test Environments

Apply noindex to test servers like staging.example.com before production deployment to prevent accidental Google indexing.

7. Personal and Internal Documents

Internal documents that should not be public but are accidentally crawlable. Authentication protection takes priority; noindex is a secondary measure.
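
The URL-based categories above can be expressed as a small rule set. The following is a minimal sketch, assuming hypothetical path prefixes, parameter names, and a staging hostname; thin content and internal documents cannot be detected from the URL alone and need a separate review.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical rules for illustration; adapt them to the site's own URL scheme.
NOINDEX_PATH_PREFIXES = ("/thank-you", "/login", "/register", "/search")
NOINDEX_QUERY_PARAMS = {"sort", "sessionid"}
NOINDEX_HOSTS = {"staging.example.com"}


def should_noindex(url: str) -> bool:
    """Return True if the URL matches one of the URL-based noindex categories."""
    parts = urlsplit(url)
    if parts.hostname in NOINDEX_HOSTS:
        return True  # staging and test environments
    if parts.path.startswith(NOINDEX_PATH_PREFIXES):
        return True  # thank-you, login, registration, internal search
    # Parameter variants that duplicate an already indexable page.
    return bool(NOINDEX_QUERY_PARAMS.intersection(parse_qs(parts.query)))
```

A template layer could call should_noindex(request_url) and emit the robots meta tag only when it returns True.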

noindex Implementation Methods

HTML Meta Tag (Most Common)

<head>
  <meta name="robots" content="noindex" />
</head>

To also stop bots from following the links on the page:

<meta name="robots" content="noindex, nofollow" />

Control specific bots only:

<meta name="googlebot" content="noindex" />
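
To confirm that a page really carries the tag, the HTML can be parsed and the robots meta directives collected. This is a rough sketch using only the Python standard library; the class and function names are made up for illustration, and it inspects only the meta tags shown above, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and bot-specific variants."""

    def __init__(self, bot_names=("robots", "googlebot")):
        super().__init__()
        self.bot_names = {name.lower() for name in bot_names}
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in self.bot_names:
            for token in attr.get("content", "").split(","):
                if token.strip():
                    self.directives.add(token.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if any matching meta tag contains the noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```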

HTTP Header (For Non-HTML Resources)

PDF, images, JavaScript files, etc.:

X-Robots-Tag: noindex
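
How the header gets attached depends on the server. As one possible illustration, the sketch below wraps a Python WSGI application and adds the header for a hypothetical list of file extensions; the extension list and function names are assumptions, and most production setups would configure this in the web server instead.

```python
# Hypothetical list of extensions that should stay out of the index.
NOINDEX_EXTENSIONS = (".pdf", ".png", ".jpg", ".js")


def noindex_headers(path: str):
    """Return the extra response headers for the given request path."""
    if path.lower().endswith(NOINDEX_EXTENSIONS):
        return [("X-Robots-Tag", "noindex")]
    return []


def add_x_robots_tag(app):
    """WSGI middleware that appends X-Robots-Tag to matching responses."""

    def middleware(environ, start_response):
        extra = noindex_headers(environ.get("PATH_INFO", ""))

        def patched_start_response(status, headers, exc_info=None):
            return start_response(status, list(headers) + extra, exc_info)

        return app(environ, patched_start_response)

    return middleware
```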

noindex vs nofollow Difference

noindex: Excludes this page from the index. Links on the page are still followed.
nofollow: Does not follow links on this page. The page itself may still be indexed.
noindex, nofollow: Excludes from indexing and does not follow links simultaneously.
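
The three combinations above reduce to two independent flags. A minimal sketch of that interpretation follows; RobotsPolicy and interpret_robots are illustrative names, and the handling of "none" relies on it being the documented shorthand for "noindex, nofollow".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotsPolicy:
    index: bool   # may the page appear in the index?
    follow: bool  # may links on the page be followed?


def interpret_robots(content: str) -> RobotsPolicy:
    """Translate a robots meta content value into index/follow flags."""
    tokens = {token.strip().lower() for token in content.split(",")}
    blocked_all = "none" in tokens  # shorthand for "noindex, nofollow"
    return RobotsPolicy(
        index=not (blocked_all or "noindex" in tokens),
        follow=not (blocked_all or "nofollow" in tokens),
    )
```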

noindex vs robots.txt Difference

noindex (meta/header)

Bot visit: ✅ Allowed
Indexing: ❌ Excluded
Link following: Configurable separately
PageRank: Can pass (if no nofollow)
Suitable situation: Allow access but exclude from search

robots.txt Disallow

Bot visit: ❌ Blocked
Indexing: Not guaranteed (a blocked URL can still be indexed without its content if other pages link to it, and blocking prevents reading noindex)
Link following: ❌ Blocked
PageRank: Does not pass
Suitable situation: Crawl budget protection, keeping bots away from resources that do not need crawling (sensitive content still needs authentication)

Important: Bots cannot read noindex directives on pages blocked by robots.txt. To exclude a page from indexing only, use a noindex meta tag or header and leave crawling allowed.
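
The conflict described in the note above can be checked before relying on noindex. The sketch below uses Python's standard urllib.robotparser; the function name and the sample rules are assumptions, and Python's parser may resolve overlapping Allow/Disallow rules differently from Google's, so treat the result as a first check only.

```python
from urllib.robotparser import RobotFileParser


def noindex_is_readable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the bot may crawl the URL and can therefore see its noindex."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for illustration.
SAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
```

A page under /private/ would be reported as unreadable here, which means a noindex tag placed on it would never be seen.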

See robots.txt and AI Bots for details.

Re-indexing After Removing noindex

After removing noindex, it takes time for Google to re-index the page. For faster processing:

Use "URL Inspection → Request Indexing" in Google Search Console
Include the URL in the XML sitemap and submit
Verify internal links point to the page
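
For the sitemap step, a minimal sitemap file can be generated with the standard library. This is only a sketch with an illustrative function name; it emits the required loc element and nothing else, and it needs Python 3.8 or later for the xml_declaration argument.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls) -> bytes:
    """Return UTF-8 encoded sitemap XML listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Special characters such as & are escaped automatically.
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```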

See Indexing Coverage for details.

Application in the Korean Market

Naver Search noindex Support

Naver search bot (Yeti) supports <meta name="robots" content="noindex"/>. However, a more reliable method to control Naver search exposure is using URL blocking in Naver Search Advisor.

noindex Cases in Korean E-commerce

Common noindex applications on Korean e-commerce sites:

Sort filter URLs (?sort=price, ?sort=latest)
Cart and order completion pages
Member-only account pages (My Page)
Out-of-stock product temporary pages (noindex if restocking expected; 410 if permanently discontinued)
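
The out-of-stock rule in the last item can be written as a small decision function. The sketch below is illustrative only: the ProductState values are hypothetical, and it delivers noindex through the X-Robots-Tag header, although a meta tag in the page template would work equally well.

```python
from enum import Enum


class ProductState(Enum):
    IN_STOCK = "in_stock"
    OUT_OF_STOCK = "out_of_stock"    # restocking expected
    DISCONTINUED = "discontinued"    # permanently removed


def response_policy(state: ProductState):
    """Return (HTTP status, extra headers) for a product page in the given state."""
    if state is ProductState.DISCONTINUED:
        return 410, {}  # page is gone for good
    if state is ProductState.OUT_OF_STOCK:
        return 200, {"X-Robots-Tag": "noindex"}  # keep the page, hide it from search
    return 200, {}
```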

Implementation by CMS

In WordPress, set page-level noindex through the Yoast SEO or RankMath plugin ("Search appearance" settings). In Next.js (App Router), return robots: { index: false } from generateMetadata() or set it in the static metadata export.

Frequently Asked Questions

Q. Does a noindex page disappear from search immediately?
A. No. Google removes the page from the index only after it recrawls the page and reads the noindex directive, which can take days to weeks. For faster removal, use the Removals tool in Google Search Console as a temporary measure, and keep noindex or a 410 response in place as the permanent fix.

Q. What happens if I accidentally set noindex on an important page?
A. Google removes it from the index on the next crawl. Remove noindex as soon as you notice and request re-indexing in Google Search Console. Recovering previous rankings can take weeks. A pre-deployment QA checklist that confirms staging noindex settings are not carried into production helps prevent this mistake.

Q. Are links from noindex pages to other pages also ignored?
A. With noindex alone, links are still followed (PageRank can pass). To block link following as well, use noindex, nofollow together. However, most noindex target pages (thank-you pages, login pages) contain few outbound links that matter, so noindex alone is typical.

Q. Is there a better method than noindex to remove a page entirely?
A. For permanent page deletion, 410 (Gone) HTTP status code is most reliable. Google recognizes 410 and quickly removes the URL from the index. Use noindex when the page exists but should not appear in search; use 410 when deleting the page itself.

Q. Can I use canonical tags and noindex on the same page?
A. Not recommended. Canonical requests "treat this URL as representative" for indexing; noindex requests "do not index." They contradict each other. Google tends to prioritize noindex in such cases, but confusion can cause unexpected results. Use only one per page.
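
Pages that send both signals can be found with a simple audit. The sketch below, using only the standard library and illustrative names, flags any HTML document that declares a canonical link and a noindex (or none) robots directive at the same time.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Record the canonical URL and whether a robots meta tag blocks indexing."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = {key: (value or "") for key, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            tokens = {t.strip().lower() for t in attr.get("content", "").split(",")}
            if tokens & {"noindex", "none"}:
                self.noindex = True


def has_conflicting_signals(html: str) -> bool:
    """Return True if the page combines a canonical link with noindex."""
    signals = HeadSignals()
    signals.feed(html)
    return signals.noindex and signals.canonical is not None
```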
