Noindex
Definition
noindex is an indexing directive that instructs search engines not to add a web page to their search index. It is delivered via a robots meta tag or an X-Robots-Tag HTTP response header.
noindex separates crawling from indexing. Bots still visit the page (crawling), read the noindex directive, and then keep the page out of search results. If robots.txt blocks the page instead, bots never fetch it and never see the noindex directive, so the outcome differs. See Crawling vs Indexing for details.
Summary
noindex essentials: ①<meta name="robots" content="noindex"/> → insert in <head> → ②Bots crawl but exclude from indexing → ③Blocking with robots.txt prevents reading noindex, so it has no effect → ④Suitable targets: thank-you pages, login, internal search results, parameter pages → ⑤For complete page deletion, 410 response code is more reliable.
7 Suitable Targets for noindex
1. Thank-You and Confirmation Pages
Post-transaction pages such as payment confirmation and form submission completion pages. They are useless to a searcher who lands on them directly, so indexing them adds no value.
2. Login and Registration Pages
Login and registration pages for authentication-required services have no value to unauthenticated visitors arriving from search. Applying noindex keeps them out of search results so that indexed pages are limited to actual content.
3. Internal Search Result Pages
On-site search result pages in the form /search?q=keyword. Infinite URL combinations exhaust crawl budget, and Google may evaluate such pages as low-quality auto-generated pages.
4. URL Parameter Duplicate Pages
Parameter variant URLs that are still crawled and indexed despite canonical tags can be switched to noindex. Because canonical and noindex send conflicting signals, avoid leaving both on the same URL. See URL Parameters for details.
5. Thin Content Pages
Thin listing pages generated in the hundreds by category filters, tag archives, and similar features. Apply noindex to pages with no unique value to improve the overall quality of the site's indexed pages. See Content Pruning for details.
6. Staging and Test Environments
Apply noindex to test servers like staging.example.com before production deployment to prevent accidental Google indexing.
7. Personal and Internal Documents
Internal documents that should not be public but are accidentally crawlable. Authentication protection takes priority; noindex is a secondary measure.
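Several of the targets above can be recognized from the URL alone. The following is a minimal sketch of such a rule set, not a definitive implementation. The path prefixes, parameter names, host name, and the function name robots_content_for are all assumptions chosen for illustration and need to be adapted to the real site.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical rules for illustration only; adapt them to the real site.
NOINDEX_PATHS = ("/thank-you", "/login", "/register", "/search")
NOINDEX_PARAMS = {"sort", "sessionid"}
NOINDEX_HOSTS = {"staging.example.com"}


def robots_content_for(url: str) -> str:
    """Return the robots meta content to emit for a URL."""
    parts = urlsplit(url)

    # Staging and test hosts: everything is kept out of the index.
    if parts.hostname in NOINDEX_HOSTS:
        return "noindex"

    # Thank-you, login, registration and internal search pages.
    path = parts.path.rstrip("/") or "/"
    if any(path == p or path.startswith(p + "/") for p in NOINDEX_PATHS):
        return "noindex"

    # Parameter variants; keep_blank_values so "?sort=" still counts.
    query_keys = set(parse_qs(parts.query, keep_blank_values=True))
    if query_keys & NOINDEX_PARAMS:
        return "noindex"

    return "index, follow"
```

A template could call robots_content_for(request_url) and write the result into the content attribute of the robots meta tag.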
noindex Implementation Methods
HTML Meta Tag (Most Common)
<head>
<meta name="robots" content="noindex" />
</head>
To also stop bots from following the links on the page (this does not block crawling of the page itself):
<meta name="robots" content="noindex, nofollow" />
Control specific bots only:
<meta name="googlebot" content="noindex" />
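To verify that a page actually carries the tag, the HTML can be parsed and the robots meta tags inspected. The sketch below uses only the Python standard library. The class and function names are hypothetical, and it assumes that a bot honors both the generic robots tag and the tag addressed to it by name.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directive tokens from robots meta tags relevant to one bot."""

    def __init__(self, bot: str):
        super().__init__()
        # The generic tag applies to every bot; the named tag only to `bot`.
        self.names = {"robots", bot.lower()}
        self.tokens = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in self.names:
            content = attr.get("content") or ""
            self.tokens.update(
                t.strip().lower() for t in content.split(",") if t.strip()
            )


def has_noindex(html: str, bot: str = "googlebot") -> bool:
    """Return True if the HTML asks `bot` not to index the page."""
    parser = RobotsMetaParser(bot)
    parser.feed(html)
    parser.close()
    return "noindex" in parser.tokens
```

For example, has_noindex(page_html, bot="yeti") ignores a tag addressed only to googlebot but still honors the generic robots tag.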
HTTP Header (For Non-HTML Resources)
For PDFs, images, JavaScript files, and other resources that cannot carry a meta tag, send the directive as a response header:
X-Robots-Tag: noindex
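How the header is attached depends on the server. As one possible approach, a small WSGI middleware can append it for selected file types. This is a sketch under assumptions: the extension list and the name add_x_robots_tag are hypothetical, and many sites would set the header in the web server configuration instead.

```python
# Hypothetical selection of file types that should stay out of the index.
NOINDEX_EXTENSIONS = (".pdf", ".png", ".jpg")


def add_x_robots_tag(app):
    """Wrap a WSGI app so matching responses carry 'X-Robots-Tag: noindex'."""

    def middleware(environ, start_response):
        path = environ.get("PATH_INFO", "").lower()

        def patched_start_response(status, headers, exc_info=None):
            # Only touch responses for the selected file types.
            if path.endswith(NOINDEX_EXTENSIONS):
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)

        return app(environ, patched_start_response)

    return middleware
```

Wrapping an existing application is then a single call, for example application = add_x_robots_tag(application).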
noindex vs nofollow Difference
- noindex: Excludes this page from the index. Links on the page are still followed.
- nofollow: Does not follow links on this page. The page itself may still be indexed.
- noindex, nofollow: Excludes from indexing and does not follow links simultaneously.
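The three combinations above amount to two independent switches. A minimal sketch that makes this explicit follows; RobotsPolicy and interpret are hypothetical names, and only the two directives discussed here are handled.

```python
from typing import NamedTuple


class RobotsPolicy(NamedTuple):
    index: bool   # may the page itself appear in the index?
    follow: bool  # may links on the page be followed?


def interpret(content: str) -> RobotsPolicy:
    """Translate a robots content string into index/follow flags."""
    tokens = {t.strip().lower() for t in content.split(",")}
    return RobotsPolicy(
        index="noindex" not in tokens,
        follow="nofollow" not in tokens,
    )
```

Calling interpret("noindex") yields index=False with follow=True, which is the case where the page is excluded but its links are still followed.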
noindex vs robots.txt Difference
noindex (meta/header)
- Bot visit: ✅ Allowed
- Indexing: ❌ Excluded
- Link following: Configurable separately
- PageRank: Can pass (if no nofollow)
- Suitable situation: Allow access but exclude from search
robots.txt Disallow
- Bot visit: ❌ Blocked
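- Indexing: Not guaranteed to be excluded (the block prevents bots from reading noindex, and a blocked URL can still be indexed without its content if other pages link to it)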
- Link following: ❌ Blocked
- PageRank: Does not pass
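- Suitable situation: Crawl budget protection, keeping bots away from resources that never need crawling (sensitive content still needs authentication, as robots.txt is not access control)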
Important: Bots cannot read a noindex directive on a page that robots.txt blocks. To exclude a page from the index, use noindex and leave the page crawlable.
See robots.txt and AI Bots for details.
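The interaction between the two mechanisms can be sketched with Python's standard urllib.robotparser module. This is an illustration only: the function name effective_outcome and its return strings are hypothetical, and the standard-library parser does not necessarily match every rule-matching detail of a particular search engine.

```python
from urllib.robotparser import RobotFileParser


def effective_outcome(robots_txt: str, url: str, page_has_noindex: bool,
                      agent: str = "Googlebot") -> str:
    """Describe what a bot can do with a URL given robots.txt and noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    if not rules.can_fetch(agent, url):
        # The page is never fetched, so a noindex on it goes unread.
        return "blocked: noindex unreadable"
    if page_has_noindex:
        return "crawled: excluded from index"
    return "crawled: indexable"
```

With a robots.txt containing "Disallow: /private/", a noindex tag on a page under /private/ yields "blocked: noindex unreadable", which is the misconfiguration described above.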
Re-indexing After Removing noindex
After removing noindex, it takes time for Google to re-index the page. For faster processing:
- Use "URL Inspection → Request Indexing" in Google Search Console
- Include the URL in the XML sitemap and submit
- Verify internal links point to the page
See Indexing Coverage for details.
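For the sitemap step, a minimal sketch of generating a sitemap that includes the re-enabled URL is shown below, using the standard sitemap namespace. The function name build_sitemap is hypothetical, and a production sitemap would usually be served as a UTF-8 file and may include further fields such as lastmod.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls) -> str:
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

The resulting string can be written to sitemap.xml and submitted through the search engine's webmaster tools.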
Application in the Korean Market
Naver Search noindex Support
Naver search bot (Yeti) supports <meta name="robots" content="noindex"/>. However, a more reliable method to control Naver search exposure is using URL blocking in Naver Search Advisor.
noindex Cases in Korean E-commerce
Common noindex applications on Korean e-commerce sites:
- Sort filter URLs (?sort=price, ?sort=latest)
- Cart and order completion pages
- Member-only account pages (My Page)
- Out-of-stock product temporary pages (noindex if restocking expected; 410 if permanently discontinued)
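The out-of-stock rule in the last item can be sketched as a small decision function. The Stock enum and the name response_for are hypothetical, and the sketch sends noindex as an X-Robots-Tag header, although a robots meta tag in the page would serve the same purpose.

```python
from enum import Enum


class Stock(Enum):
    AVAILABLE = "available"
    RESTOCK_EXPECTED = "restock_expected"
    DISCONTINUED = "discontinued"


def response_for(stock: Stock):
    """Return (HTTP status code, extra headers) for a product page."""
    if stock is Stock.DISCONTINUED:
        # Permanently gone: tell bots the URL no longer exists.
        return 410, {}
    if stock is Stock.RESTOCK_EXPECTED:
        # Page stays reachable but is kept out of the index for now.
        return 200, {"X-Robots-Tag": "noindex"}
    return 200, {}
```

When the product is back in stock, response_for(Stock.AVAILABLE) drops the header again and the page becomes eligible for re-indexing on the next crawl.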
Implementation by CMS
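In WordPress, set page-level noindex in the per-page SEO settings of a plugin such as Yoast SEO or Rank Math (typically under the plugin's advanced options in the page editor). In the Next.js App Router, return robots: { index: false } from generateMetadata() or set it in the exported metadata object.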
Frequently Asked Questions
Q. Does a noindex page disappear from search immediately?
A. No. Google removes the page from the index only after it has crawled the page and read the noindex directive. This can take days to weeks. For faster removal, use the Removals tool in Google Search Console as a temporary measure, but keep the noindex directive or a 410 response in place as the lasting fix.
Q. What happens if I accidentally set noindex on an important page?
A. Google removes the page from the index after the next crawl. Remove noindex as soon as the mistake is discovered and request re-indexing in Google Search Console. Recovering previous rankings can take weeks. To prevent such mistakes, keep a pre-deployment QA checklist that checks important pages on staging for stray noindex directives.
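Part of such a checklist can be automated. The sketch below assumes that a crawl of the build has already produced one record per page with the robots meta content and the X-Robots-Tag header value. The record shape and the name find_noindex_mistakes are hypothetical.

```python
def find_noindex_mistakes(pages, must_index):
    """Return URLs that must stay indexed but carry a noindex directive.

    pages: iterable of dicts with 'url', 'meta_robots' and 'x_robots_tag'
    keys (hypothetical record shape, e.g. from a crawl of the build).
    must_index: set of URLs that are expected to remain in the index.
    """
    mistakes = []
    for page in pages:
        # Combine both delivery channels; either one can carry noindex.
        combined = ",".join(
            [page.get("meta_robots") or "", page.get("x_robots_tag") or ""]
        )
        tokens = {t.strip().lower() for t in combined.split(",")}
        if page["url"] in must_index and "noindex" in tokens:
            mistakes.append(page["url"])
    return mistakes
```

A deployment script could abort when the returned list is not empty.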
Q. Are links from noindex pages to other pages also ignored?
A. With noindex alone, links are still followed (PageRank can pass). To block link following as well, combine the two as noindex, nofollow. However, most noindex target pages (thank-you pages, login pages) contain few outgoing links that matter, so noindex alone is typical.
Q. Is there a better method than noindex to remove a page entirely?
A. For permanent page deletion, the 410 (Gone) HTTP status code is the most reliable option. Google recognizes 410 and removes the URL from the index quickly. Use noindex when the page should keep existing but should not appear in search; use 410 when deleting the page itself.
Q. Can I use canonical tags and noindex on the same page?
A. Not recommended. A canonical tag asks search engines to treat a URL as the representative version for indexing, while noindex asks them not to index the page, so the two signals contradict each other. Which signal Google follows in such cases is not guaranteed, and the confusion can cause unexpected results. Use only one per page.