llms.txt Writing Guide
What is llms.txt
llms.txt is a markdown file placed at the site root (/llms.txt) that helps LLMs (large language models) efficiently understand a site's purpose and key content. It is an emerging standard proposed by Jeremy Howard (fast.ai founder) in September 2024.
It is not yet a finalized industry standard. The specification is published at llmstxt.org, and some AI systems have begun recognizing the file, but there is no guarantee that any given AI reads it. Because adoption cost is low and there is potential value, proactive implementation is worthwhile.
What problem llms.txt solves
When an AI crawler visits a site, it is hard for it to tell which of hundreds of pages are core. Homepages, terms of service, error pages, and internal links are all collected with equal weight. llms.txt directly tells LLMs "here is this site's most important content."
If sitemap.xml provides a page list, llms.txt explains each page's meaning and importance. If structured data (JSON-LD) defines individual page meaning, llms.txt provides site-wide context.
robots.txt · sitemap.xml · llms.txt comparison
| File | Role | Format | Location |
|---|---|---|---|
| robots.txt | Bot access allow/block | Custom text | /robots.txt |
| sitemap.xml | Page list provision | XML | /sitemap.xml |
| llms.txt | Content meaning and structure guide | Markdown | /llms.txt |
These three files are complementary, not competing. The ideal structure is: allow AI bot access via robots.txt, provide page lists via sitemap.xml, and guide core content via llms.txt.
llms.txt standard structure
The standard structure defined at llmstxt.org is as follows. Only the H1 title is required; the rest are recommendations.
# Site Name
> One-line site description (state core purpose in BLUF pattern)
## About
Site purpose, operator, primary audience, license, etc.
## Docs (or Index, Blog — section names are flexible)
- [Page Title](https://example.com/page-url): One-line description
- [Page Title](https://example.com/page-url): One-line description
## Optional
- [Supplementary page](https://example.com/misc): LLM may omit this section
Description of each element:
- H1: Site or project name
- Blockquote (>): Core purpose in one sentence, stated up front (BLUF: bottom line up front)
- H2 sections: Content categories. Section names are flexible
- Link list: URL + one-line description. Basis for LLM to understand each page's purpose
- Optional section: Lower-priority supplementary content. LLM may omit when context is limited
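The elements above map naturally onto a small data structure. The following is a minimal sketch, not part of the llmstxt.org specification: the type names `LlmsLink`, `LlmsSection`, `LlmsSite` and the function `renderLlmsTxt` are hypothetical and exist only for illustration.

```typescript
// Hypothetical types for illustration; the format itself is plain markdown.
interface LlmsLink {
  title: string;
  url: string;
  description: string;
}

interface LlmsSection {
  heading: string; // H2 name, e.g. "Docs" or "Optional"
  links: LlmsLink[];
}

interface LlmsSite {
  name: string; // H1, the only required element
  summary?: string; // blockquote line
  about?: string; // free text placed under "## About"
  sections?: LlmsSection[];
}

function renderLlmsTxt(site: LlmsSite): string {
  const blocks: string[] = [`# ${site.name}`];
  if (site.summary) blocks.push(`> ${site.summary}`);
  if (site.about) blocks.push(`## About\n${site.about}`);
  for (const section of site.sections ?? []) {
    const items = section.links.map(
      (link) => `- [${link.title}](${link.url}): ${link.description}`,
    );
    blocks.push([`## ${section.heading}`, ...items].join("\n"));
  }
  // One blank line between blocks, one trailing newline.
  return blocks.join("\n\n") + "\n";
}

const exampleLlmsTxt = renderLlmsTxt({
  name: "Site Name",
  summary: "One-line site description",
  sections: [
    {
      heading: "Docs",
      links: [
        {
          title: "Page Title",
          url: "https://example.com/page-url",
          description: "One-line description",
        },
      ],
    },
  ],
});
console.log(exampleLlmsTxt);
```

Keeping the link data separate from the rendering makes it easier to reuse the same inventory for other outputs later.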
ALLEO Wiki application example
An llms.txt example applicable to the actual ALLEO Wiki (alleo.wiki):
# ALLEO Search Optimization Wiki
> ALLEO is a search optimization wiki covering AEO (Answer Engine Optimization) and GEO (Generative Engine Optimization). It explains how to get content cited in AI answer engines such as ChatGPT, Perplexity, and Google AI Overviews.
## About
- Operator: ALLEO (kroffle.com)
- Audience: Marketers, SEO professionals, startup founders
- License: Source attribution recommended when citing content
## Core Concepts
- [What is AEO?](https://alleo.wiki/what-is-aeo): Definition and need for AI answer engine optimization
- [What is GEO?](https://alleo.wiki/what-is-geo): Definition of generative AI optimization and Princeton research-based strategies
- [GEO Master Guide](https://alleo.wiki/geo-master-guide): Five-area checklist
## Technical Setup
- [Allow AI bots in robots.txt](https://alleo.wiki/robots-txt-ai-bot): robots.txt setup to allow AI crawlers
- [llms.txt writing guide](https://alleo.wiki/llms-txt): This document
- [JSON-LD basics](https://alleo.wiki/json-ld-basics): How to insert structured data
## Optional
- [AI Visibility Score](https://alleo.wiki/ai-visibility-score): Metric for measuring AI answer exposure
- [AI Share of Voice](https://alleo.wiki/ai-share-of-voice): AI citation share vs competitors
5-step implementation
Step 1: Inventory core content
Select the 10–30 pages LLMs must know about. Listing every page reduces effectiveness, because the file then no longer signals which pages matter most.
Step 2: Write llms.txt
Write in markdown following the standard structure above. Include a one-line description for each link.
Step 3: Place at site root
It must be accessible at https://yourdomain.com/llms.txt. Serve as a static file or generate via a route in frameworks like Next.js.
// Next.js App Router: app/llms.txt/route.ts
// Serves the llms.txt body as UTF-8 plain text at /llms.txt.
export async function GET() {
  const content = `# My Site\n\n> Site description...`
  return new Response(content, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  })
}
Step 4: Auto-generate at build (optional)
For sites with frequent content updates, configure the build pipeline to automatically refresh llms.txt.
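One possible shape for such a build step is sketched below. It assumes a content inventory where editors flag core pages; the `PageMeta` type, the `inLlmsTxt` flag, the `generateAtBuild` function, and the `public/llms.txt` output path are all assumptions for illustration. The file writer is passed in as a parameter so the sketch stays runtime-agnostic; in a Node build script you might pass `fs.writeFileSync`.

```typescript
// Hypothetical inventory entry; adapt to your CMS or front matter.
interface PageMeta {
  title: string;
  url: string;
  description: string;
  section: string; // H2 heading the page belongs under
  inLlmsTxt: boolean; // editor-controlled flag marking a core page
}

type WriteFile = (path: string, data: string) => void;

function buildLlmsTxt(siteName: string, summary: string, pages: PageMeta[]): string {
  // Group flagged pages by section, keeping first-seen section order.
  const sections: { heading: string; lines: string[] }[] = [];
  for (const page of pages) {
    if (!page.inLlmsTxt) continue;
    let section = sections.filter((s) => s.heading === page.section)[0];
    if (!section) {
      section = { heading: page.section, lines: [] };
      sections.push(section);
    }
    section.lines.push(`- [${page.title}](${page.url}): ${page.description}`);
  }
  const blocks = [`# ${siteName}`, `> ${summary}`];
  for (const section of sections) {
    blocks.push([`## ${section.heading}`, ...section.lines].join("\n"));
  }
  return blocks.join("\n\n") + "\n";
}

function generateAtBuild(pages: PageMeta[], write: WriteFile) {
  write("public/llms.txt", buildLlmsTxt("My Site", "Site description", pages));
  const linkCount = pages.filter((p) => p.inLlmsTxt).length;
  // Step 1 suggests keeping the list to roughly 10 to 30 pages.
  return { linkCount, withinRecommendedRange: linkCount >= 10 && linkCount <= 30 };
}

const samplePages: PageMeta[] = [
  { title: "What is AEO?", url: "https://example.com/what-is-aeo", description: "Definition of AEO", section: "Core Concepts", inLlmsTxt: true },
  { title: "Terms of Service", url: "https://example.com/terms", description: "Legal terms", section: "Legal", inLlmsTxt: false },
  { title: "JSON-LD basics", url: "https://example.com/json-ld-basics", description: "How to insert structured data", section: "Technical Setup", inLlmsTxt: true },
];

// In-memory writer for demonstration.
const writtenFiles: Record<string, string> = {};
const buildReport = generateAtBuild(samplePages, (path, data) => {
  writtenFiles[path] = data;
});
console.log(buildReport, writtenFiles["public/llms.txt"]);
```

Running this as part of the build means the file is regenerated whenever the inventory changes, and the returned report can be used to fail or warn when the list drifts outside the suggested size.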
Step 5: Add llms-full.txt (optional)
You can also create a separate llms-full.txt that includes the body of your core content in markdown. LLMs can then understand the content without visiting each URL, at the cost of a larger file.
llms.txt vs llms-full.txt
| | llms.txt | llms-full.txt |
|---|---|---|
| Content | Index (title + URL + description) | Core content body included directly |
| File size | Small (few KB) | Large (hundreds of KB to MB) |
| LLM processing | Understand after visiting URL | Understand immediately within file |
| Updates | Easy | Regeneration needed when content changes |
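As a rough illustration of the difference, the sketch below assembles an llms-full.txt from page bodies and reports its size. The layout (one H2 per page with a `Source:` line) is an assumption, not something the source or llmstxt.org prescribes, and `FullPage`, `renderLlmsFullTxt`, and `utf8Bytes` are hypothetical names.

```typescript
// Hypothetical shape: a core page with its markdown body already loaded.
interface FullPage {
  title: string;
  url: string;
  markdown: string;
}

function renderLlmsFullTxt(siteName: string, summary: string, pages: FullPage[]): string {
  const parts = [`# ${siteName}`, `> ${summary}`];
  for (const page of pages) {
    // Assumed layout: each page becomes an H2 section that keeps its source URL.
    parts.push(`## ${page.title}\nSource: ${page.url}\n\n${page.markdown.trim()}`);
  }
  return parts.join("\n\n") + "\n";
}

// File size in bytes when saved as UTF-8.
function utf8Bytes(text: string): number {
  return new TextEncoder().encode(text).length;
}

const fullText = renderLlmsFullTxt("My Site", "Site description", [
  { title: "Page A", url: "https://example.com/a", markdown: "Body of page A.\n" },
  { title: "Page B", url: "https://example.com/b", markdown: "Body of page B.\n" },
]);
console.log(`${utf8Bytes(fullText)} bytes`);
```

Because the bodies are copied in, this file has to be regenerated whenever any included page changes, which is the update cost noted in the table.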
Verification
After deployment, verify using:
- Visit https://yourdomain.com/llms.txt directly in a browser to confirm content
- Ask Claude or ChatGPT: "Read [domain]'s llms.txt following llmstxt.org guidelines and explain the site's purpose" to test LLM recognition
- Confirm robots.txt does not block /llms.txt
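The manual checks above can be complemented with a small structural lint. This is a sketch under stated assumptions: `checkLlmsTxt` and `LlmsTxtReport` are hypothetical names, and the rules only cover the structure described in this guide (H1 first, a blockquote summary, and `- [Title](URL): description` link items), not every detail of the specification.

```typescript
interface LlmsTxtReport {
  hasH1: boolean;
  hasSummary: boolean;
  linkCount: number;
  problems: string[];
}

function checkLlmsTxt(text: string): LlmsTxtReport {
  const nonEmpty = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  const problems: string[] = [];

  // The H1 title is the only required element and should come first.
  const firstLine = nonEmpty.length > 0 ? nonEmpty[0] : "";
  const hasH1 = /^# \S/.test(firstLine);
  if (!hasH1) problems.push("First non-empty line is not an H1 title");

  // The blockquote summary is recommended rather than required.
  const hasSummary = nonEmpty.some((line) => /^> \S/.test(line));
  if (!hasSummary) problems.push("No blockquote summary (recommended)");

  // Link items should look like: - [Title](https://...): description
  // The description part is treated as optional here.
  const linkPattern = /^- \[[^\]]+\]\(https?:\/\/[^)\s]+\)(: .+)?$/;
  let linkCount = 0;
  for (const line of nonEmpty) {
    if (!/^- \[/.test(line)) continue; // plain list items are ignored
    if (linkPattern.test(line)) linkCount++;
    else problems.push(`Malformed link item: ${line}`);
  }
  return { hasH1, hasSummary, linkCount, problems };
}

const lintReport = checkLlmsTxt(
  "# Site Name\n\n> One-line site description\n\n## Docs\n- [Page Title](https://example.com/page-url): One-line description\n",
);
console.log(lintReport);

// Against a live site (needs a runtime with fetch, e.g. Node 18+):
// const res = await fetch("https://yourdomain.com/llms.txt");
// console.log(res.status, checkLlmsTxt(await res.text()));
```

A lint like this only confirms the file is well-formed; it says nothing about whether any AI system actually reads it.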
Market adoption
llms.txt adoption is still low as of 2026, so a first-mover advantage is possible. Practical notes by language and platform:
- Language: Can be written in any language, but English + local language bilingual format is recommended for global AI systems
- Next.js/Vercel: Dynamic generation via app/llms.txt/route.ts or static file at public/llms.txt
- WordPress: Upload file to root without a plugin, or create virtual path in functions.php
- Hosted platforms: Upload llms.txt to the root directory via FTP or file manager
2026 status: where llms.txt stands
Nearly two years after Jeremy Howard (fast.ai) proposed it in September 2024, llms.txt remains an unofficial standard. As of 2026, neither OpenAI, Anthropic, nor Google has officially stated that it crawls or prioritizes llms.txt. Current LLM citations happen through standard search indexes and proprietary training data; there is no confirmed evidence that the presence of llms.txt directly affects AI answer exposure.
Google's May 2026 AI search exposure guide also centers standard SEO and Schema.org structured data as primary recommendations. Separate manifests like llms.txt are not included in official guidance.
That does not mean llms.txt is meaningless. It has internal management value as a content inventory and site manifest, and some developer tools like Claude Desktop and Cursor AI do read this file. There is also option value if AI bots adopt it as a standard in the future. With near-zero operational cost to place one file at the root, sites with an SEO foundation already in place can maintain it without burden.
Current recommended priority: ① Specify AI bot policy in robots.txt (GPTBot, ClaudeBot, Google-Extended, etc.). ② Maintain XML sitemap and standard structured data (JSON-LD). ③ Add llms.txt as a supplementary measure afterward.
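For the first priority, a robots.txt with explicit AI bot groups could be generated along the following lines. This is a sketch only: `buildRobotsTxt` is a hypothetical helper, the user-agent list contains just the three tokens named above, and current token names should be checked against each vendor's documentation before use.

```typescript
// User-agent tokens named in the text above; verify against vendor docs.
const AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "Google-Extended"];

function buildRobotsTxt(sitemapUrl: string, allowAi: boolean): string {
  const rule = allowAi ? "Allow: /" : "Disallow: /";
  // One explicit group per AI user agent so the policy is visible at a glance.
  const groups = AI_USER_AGENTS.map((agent) => `User-agent: ${agent}\n${rule}`);
  // Default group for every other crawler: allow everything.
  groups.push("User-agent: *\nAllow: /");
  return groups.join("\n\n") + `\n\nSitemap: ${sitemapUrl}\n`;
}

console.log(buildRobotsTxt("https://yourdomain.com/sitemap.xml", true));
```

Passing `false` for `allowAi` produces the opposite policy for the listed agents while leaving other crawlers unaffected.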
For comprehensive comparison of AI bot robots.txt policies and copy-ready templates by scenario, see AI Bot robots.txt Matrix.
Frequently asked questions
Do all LLMs read llms.txt?
No. As of 2026, only some AI systems recognize llms.txt. It is an emerging convention proposed at llmstxt.org and not yet finalized as an industry standard. SE Ranking's survey of 300,000 domains found no statistical correlation between llms.txt adoption (~10%) and LLM citation rates. However, placement cost is low, and some developer tools like Claude Desktop and Cursor AI are confirmed to read it, so adoption is recommended.
How is it different from sitemap.xml?
sitemap.xml provides a list of page URLs in XML format to help search engines discover all pages. llms.txt selects core pages and explains them in markdown so LLMs quickly grasp site context. Roles differ, so operating both is best.
Should small blogs create one too?
It helps, but priority is lower. Apply robots.txt AI bot allow rules and structured data (JSON-LD) first; add llms.txt in a later stage.
Can I create llms.txt first if my site has no robots.txt?
Without robots.txt, all bot access is allowed by default. You can place llms.txt without robots.txt. However, if you want to explicitly allow AI bots, creating robots.txt together is recommended.
Related sources
- llms.txt official specification: https://llmstxt.org/
- Jeremy Howard original proposal (September 2024): https://llmstxt.org/