
llms.txt Writing Guide


What is llms.txt

llms.txt is a markdown file placed at the site root (/llms.txt) that helps LLMs (large language models) efficiently understand a site's purpose and key content. It is an emerging standard proposed by Jeremy Howard (fast.ai co-founder) in September 2024.

It is not yet a finalized industry standard. The specification is published at llmstxt.org, and some AI systems and LLMs have begun recognizing this file, but there is no guarantee that every AI system reads it. Because the adoption cost is low and there is potential value, proactive implementation is worthwhile.

What problem llms.txt solves

When an AI crawler visits a site, it is hard for it to determine which of hundreds of pages are core. Homepages, terms of service, error pages, and internal links are all collected with equal weight. llms.txt tells LLMs directly: "here is this site's most important content."

Where sitemap.xml provides a list of pages, llms.txt explains each page's meaning and importance. Where structured data (JSON-LD) defines the meaning of an individual page, llms.txt provides site-wide context.

robots.txt · sitemap.xml · llms.txt comparison

| File | Role | Format | Location |
| --- | --- | --- | --- |
| robots.txt | Bot access allow/block | Custom text | /robots.txt |
| sitemap.xml | Page list provision | XML | /sitemap.xml |
| llms.txt | Content meaning and structure guide | Markdown | /llms.txt |

These three files are complementary, not competing. The ideal structure is: allow AI bot access via robots.txt, provide page lists via sitemap.xml, and guide core content via llms.txt.

llms.txt standard structure

The standard structure defined at llmstxt.org is as follows. Only the H1 title is required; the rest are recommendations.

# Site Name

> One-line site description (state core purpose in BLUF pattern)

## About

Site purpose, operator, primary audience, license, etc.

## Docs (or Index, Blog — section names are flexible)

- [Page Title](https://example.com/page-url): One-line description
- [Page Title](https://example.com/page-url): One-line description

## Optional

- [Supplementary page](https://example.com/misc): LLM may omit this section

Description of each element:

  • H1: Site or project name
  • Blockquote (>): Core purpose in one sentence
  • H2 sections: Content categories. Section names are flexible
  • Link list: URL + one-line description. Basis for LLM to understand each page's purpose
  • Optional section: Lower-priority supplementary content. LLM may omit when context is limited
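The elements above map naturally onto a small data structure. The following is a minimal sketch, assuming hypothetical type and function names (`SiteIndex`, `Section`, `LinkEntry`, `renderLlmsTxt`) that are not part of the llms.txt proposal; it renders the H1, the blockquote, and the H2 link lists, and leaves out a free-text About section for brevity.

```typescript
// Hypothetical types for illustration; the field names are assumptions,
// not something defined by the llms.txt proposal.
interface LinkEntry {
  title: string;
  url: string;
  description?: string; // one-line description shown after the link
}

interface Section {
  heading: string; // e.g. "Docs", "Blog", or "Optional"
  links: LinkEntry[];
}

interface SiteIndex {
  name: string; // H1, the only required element
  summary?: string; // blockquote stating the core purpose
  sections: Section[];
}

// Renders one link-list item: "- [Title](URL): description".
function renderLink(link: LinkEntry): string {
  const base = `- [${link.title}](${link.url})`;
  return link.description ? `${base}: ${link.description}` : base;
}

// Renders the whole file. Blocks are separated by one blank line
// and the file ends with a single newline.
function renderLlmsTxt(site: SiteIndex): string {
  const blocks: string[] = [`# ${site.name}`];
  if (site.summary) {
    blocks.push(`> ${site.summary}`);
  }
  for (const section of site.sections) {
    const lines = section.links.map(renderLink).join("\n");
    blocks.push(`## ${section.heading}\n\n${lines}`);
  }
  return blocks.join("\n\n") + "\n";
}

const example = renderLlmsTxt({
  name: "Site Name",
  summary: "One-line site description",
  sections: [
    {
      heading: "Docs",
      links: [
        {
          title: "Page Title",
          url: "https://example.com/page-url",
          description: "One-line description",
        },
      ],
    },
    {
      heading: "Optional",
      links: [{ title: "Supplementary page", url: "https://example.com/misc" }],
    },
  ],
});
console.log(example);
```

Keeping the index as data rather than as a hand-edited file makes it easier to keep the one-line descriptions consistent and to place lower-priority links under the Optional heading.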

ALLEO Wiki application example

An llms.txt example applicable to the actual ALLEO Wiki (alleo.wiki):

# ALLEO Search Optimization Wiki

> ALLEO is a search optimization wiki covering AEO (Answer Engine Optimization) and GEO (Generative Engine Optimization). It explains how to get content cited in AI answer engines such as ChatGPT, Perplexity, and Google AI Overviews.

## About

- Operator: ALLEO (kroffle.com)
- Audience: Marketers, SEO professionals, startup founders
- License: Source attribution recommended when citing content

## Core Concepts

- [What is AEO?](https://alleo.wiki/what-is-aeo): Definition and need for AI answer engine optimization
- [What is GEO?](https://alleo.wiki/what-is-geo): Definition of generative AI optimization and Princeton research-based strategies
- [GEO Master Guide](https://alleo.wiki/geo-master-guide): Five-area checklist

## Technical Setup

- [Allow AI bots in robots.txt](https://alleo.wiki/robots-txt-ai-bot): robots.txt setup to allow AI crawlers
- [llms.txt writing guide](https://alleo.wiki/llms-txt): This document
- [JSON-LD basics](https://alleo.wiki/json-ld-basics): How to insert structured data

## Optional

- [AI Visibility Score](https://alleo.wiki/ai-visibility-score): Metric for measuring AI answer exposure
- [AI Share of Voice](https://alleo.wiki/ai-share-of-voice): AI citation share vs competitors

5-step implementation

Step 1: Inventory core content

Select the 10–30 pages that LLMs must know about. Listing every page dilutes the signal and reduces effectiveness.

Step 2: Write llms.txt

Write in markdown following the standard structure above. Include a one-line description for each link.

Step 3: Place at site root

The file must be accessible at https://yourdomain.com/llms.txt. Serve it as a static file, or generate it through a route handler in a framework such as Next.js.

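// Next.js App Router: app/llms.txt/route.ts
// Route handler that serves the llms.txt body as plain text.
export async function GET() {
  const content = `# My Site\n\n> Site description...`
  return new Response(content, {
    // Declare UTF-8 so non-ASCII text (for example a bilingual file) decodes correctly.
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  })
}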

Step 4: Auto-generate at build (optional)

For sites with frequent content updates, configure the build pipeline to automatically refresh llms.txt.
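One possible shape for such a build step is sketched below. It assumes a hypothetical `PageRecord` type (for example loaded from a CMS or from markdown frontmatter) with assumed `section` and `priority` fields, keeps the top 30 pages by priority, groups them under H2 headings, and hands the result to an injected writer. The function names and the `public/llms.txt` output path are illustrative choices, not requirements.

```typescript
// Hypothetical page record; `section` and `priority` are assumed fields.
interface PageRecord {
  title: string;
  url: string;
  description: string;
  section: string; // H2 heading the page is listed under
  priority: number; // higher means more important
}

// Keeps the top `limit` pages by priority and groups them by section.
// Sections appear in the order of their highest-priority page.
function buildLlmsTxt(
  siteName: string,
  summary: string,
  pages: PageRecord[],
  limit = 30,
): string {
  const selected = pages
    .slice()
    .sort((a, b) => b.priority - a.priority)
    .slice(0, limit);

  const bySection = new Map<string, string[]>();
  for (const page of selected) {
    const links = bySection.get(page.section) ?? [];
    links.push(`- [${page.title}](${page.url}): ${page.description}`);
    bySection.set(page.section, links);
  }

  const parts: string[] = [`# ${siteName}`, `> ${summary}`];
  bySection.forEach((links, section) => {
    parts.push(`## ${section}`, links.join("\n"));
  });
  return parts.join("\n\n") + "\n";
}

// The writer is injected so this sketch has no Node-specific imports.
type WriteFile = (path: string, content: string) => void;

function generateAtBuild(pages: PageRecord[], write: WriteFile): void {
  write("public/llms.txt", buildLlmsTxt("My Site", "Site description...", pages));
}
```

In a Node build script the writer could be `(path, content) => fs.writeFileSync(path, content)`, run before the framework's own build command so the generated file is picked up as a static asset.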

Step 5: Add llms-full.txt (optional)

You can also create a separate llms-full.txt that contains the body of the core content in markdown. LLMs can then understand the content without visiting each URL, at the cost of a larger file.
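As a rough illustration, llms-full.txt could be assembled by concatenating the markdown bodies of the core pages. The layout below (one H2 per page, followed by a `Source:` line and the body) is an assumption made for this sketch, as are the `FullPage` type and the `renderLlmsFullTxt` name; the source does not prescribe a layout.

```typescript
// Hypothetical shape: a core page together with its markdown body.
interface FullPage {
  title: string;
  url: string;
  markdown: string; // page body, already converted to markdown
}

// Concatenates page bodies into one document. Each page becomes an H2
// section that records its source URL before the body text.
function renderLlmsFullTxt(
  siteName: string,
  summary: string,
  pages: FullPage[],
): string {
  const parts: string[] = [`# ${siteName}`, `> ${summary}`];
  for (const page of pages) {
    parts.push(`## ${page.title}`, `Source: ${page.url}`, page.markdown.trim());
  }
  return parts.join("\n\n") + "\n";
}
```

If the page bodies contain their own H1 or H2 headings, they would need to be demoted before concatenation so the sections stay distinguishable; that step is left out here.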

llms.txt vs llms-full.txt

| | llms.txt | llms-full.txt |
| --- | --- | --- |
| Content | Index (title + URL + description) | Core content body included directly |
| File size | Small (few KB) | Large (hundreds of KB to MB) |
| LLM processing | Understand after visiting URL | Understand immediately within file |
| Updates | Easy | Regeneration needed when content changes |

Verification

After deployment, verify using:

  1. Visit https://yourdomain.com/llms.txt directly in a browser to confirm content
  2. Ask Claude or ChatGPT: "Read [domain]'s llms.txt following llmstxt.org guidelines and explain the site's purpose" to test LLM recognition
  3. Confirm robots.txt does not block /llms.txt
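Checks 1 and 3 can also be partly automated. The sketch below uses two hypothetical helpers: `checkLlmsTxt` verifies that the first non-empty line is an H1 and that link lines follow the `- [Title](URL): description` pattern, and `isPathBlocked` applies a deliberately simplified robots.txt match (prefix rules only, no `*` or `$` wildcards, longest matching rule wins). Both operate on text that has already been downloaded, so how the files are fetched is left to the caller.

```typescript
// Returns the problems found in an llms.txt body; an empty list means
// these basic checks passed.
function checkLlmsTxt(text: string): string[] {
  const issues: string[] = [];
  const lines = text.split(/\r?\n/);
  const firstContent = lines.find((line) => line.trim() !== "");
  if (firstContent === undefined || !/^# \S/.test(firstContent)) {
    issues.push("first non-empty line must be an H1 title");
  }
  const linkPattern = /^- \[[^\]]+\]\(https?:\/\/[^)\s]+\)(: .+)?$/;
  for (const line of lines) {
    if (/^- \[/.test(line) && !linkPattern.test(line)) {
      issues.push(`malformed link line: ${line}`);
    }
  }
  return issues;
}

interface Rule {
  allow: boolean;
  prefix: string;
}

interface Group {
  agents: string[];
  rules: Rule[];
}

// Splits robots.txt into groups of user-agents and their rules.
function parseGroups(robotsTxt: string): Group[] {
  const groups: Group[] = [];
  let current: Group | null = null;
  let lastWasAgent = false;
  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep < 0) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      // Consecutive User-agent lines share one group.
      if (current === null || !lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === "allow" || field === "disallow") && current !== null) {
      // An empty Disallow value blocks nothing, so it is skipped.
      if (value !== "") current.rules.push({ allow: field === "allow", prefix: value });
      lastWasAgent = false;
    }
  }
  return groups;
}

// Simplified check: uses the groups naming `userAgent`, else the "*" groups.
function isPathBlocked(robotsTxt: string, path: string, userAgent = "*"): boolean {
  const groups = parseGroups(robotsTxt);
  const ua = userAgent.toLowerCase();
  let matching = groups.filter((g) => g.agents.indexOf(ua) >= 0);
  if (matching.length === 0) {
    matching = groups.filter((g) => g.agents.indexOf("*") >= 0);
  }
  let best: Rule | null = null;
  for (const group of matching) {
    for (const rule of group.rules) {
      if (path.indexOf(rule.prefix) !== 0) continue;
      const longer = best === null || rule.prefix.length > best.prefix.length;
      const tieAllow = best !== null && rule.prefix.length === best.prefix.length && rule.allow;
      if (longer || tieAllow) best = rule;
    }
  }
  return best !== null && !best.allow;
}
```

A typical use would be to download https://yourdomain.com/llms.txt and https://yourdomain.com/robots.txt, then confirm that `checkLlmsTxt` returns an empty list and that `isPathBlocked(robotsTxt, "/llms.txt")` is false.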

Market adoption

llms.txt adoption rates are still very low as of 2026, so a first-mover advantage is possible. Practical notes by language and platform:

  • Language: Can be written in any language, but a bilingual format (English plus the local language) is recommended so that global AI systems can read it
  • Next.js/Vercel: Dynamic generation via app/llms.txt/route.ts or static file at public/llms.txt
  • WordPress: Upload file to root without a plugin, or create virtual path in functions.php
  • Hosted platforms: Upload llms.txt to the root directory via FTP or file manager

2026 status: where llms.txt stands

llms.txt is an unofficial standard proposed by Jeremy Howard (fast.ai) in September 2024. Nearly two years later, as of 2026, none of OpenAI, Anthropic, or Google have officially stated they crawl or prioritize llms.txt. Current LLM citations happen through standard search indexes and proprietary training data; there is no confirmed evidence that llms.txt presence directly affects AI answer exposure.

Google's May 2026 AI search exposure guide also centers standard SEO and Schema.org structured data as primary recommendations. Separate manifests like llms.txt are not included in official guidance.

That does not mean llms.txt is meaningless. It has internal management value as a content inventory and site manifest, and some developer tools like Claude Desktop and Cursor AI do read this file. There is also option value if AI bots adopt it as a standard in the future. With near-zero operational cost to place one file at the root, sites with an SEO foundation already in place can maintain it without burden.

Current recommended priority: ① Specify AI bot policy in robots.txt (GPTBot, ClaudeBot, Google-Extended, etc.). ② Maintain XML sitemap and standard structured data (JSON-LD). ③ Add llms.txt as a supplementary measure afterward.
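For priority ①, a minimal sketch of generating an explicit allow policy is shown below. The `renderAiBotRobotsTxt` helper and the example.com sitemap URL are hypothetical; the user-agent tokens are the ones named above, and a real policy may need per-bot Disallow rules instead of a blanket allow.

```typescript
// Renders one "allow everything" group per AI bot, plus an optional
// Sitemap line. Groups are separated by a blank line.
function renderAiBotRobotsTxt(bots: string[], sitemapUrl?: string): string {
  const groups = bots.map((bot) => `User-agent: ${bot}\nAllow: /`);
  if (sitemapUrl) {
    groups.push(`Sitemap: ${sitemapUrl}`);
  }
  return groups.join("\n\n") + "\n";
}

const robots = renderAiBotRobotsTxt(
  ["GPTBot", "ClaudeBot", "Google-Extended"],
  "https://example.com/sitemap.xml",
);
console.log(robots);
```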

For comprehensive comparison of AI bot robots.txt policies and copy-ready templates by scenario, see AI Bot robots.txt Matrix.

Frequently asked questions

Do all LLMs read llms.txt?
No. As of 2026, only some AI systems recognize llms.txt. It is an emerging convention proposed at llmstxt.org and not yet finalized as an industry standard. SE Ranking's survey of 300,000 domains found no statistical correlation between llms.txt adoption (~10%) and LLM citation rates. However, placement cost is low, and some developer tools like Claude Desktop and Cursor AI are confirmed to read it, so adoption is recommended.

How is it different from sitemap.xml?
sitemap.xml provides a list of page URLs in XML format to help search engines discover all pages. llms.txt selects core pages and explains them in markdown so that LLMs can quickly grasp the site's context. Their roles differ, so maintaining both is best.

Should small blogs create one too?
It helps, but priority is lower. Apply robots.txt AI bot allow rules and structured data (JSON-LD) first; add llms.txt in a later stage.

Can I create llms.txt first if my site has no robots.txt?
Without robots.txt, all bot access is allowed by default. You can place llms.txt without robots.txt. However, if you want to explicitly allow AI bots, creating robots.txt together is recommended.


