Table of Contents
- What Is llms.txt?
- Who Created llms.txt and Why?
- llms.txt vs. robots.txt vs. sitemap.xml
- The llms.txt File Specification
- Understanding llms-full.txt
- How to Create an llms.txt File (Step-by-Step)
- Real-World Examples from Major Companies
- Adoption Statistics and Current State
- Impact on SEO and Generative Engine Optimization
- Benefits and Limitations
- Common Mistakes to Avoid
- Tools, Plugins, and Resources
- Future Outlook: What Comes Next?
- Should You Implement llms.txt?
- Frequently Asked Questions
As artificial intelligence reshapes the way users discover and consume information online, website owners face a new challenge: making their content accessible not just to humans and search engine crawlers, but to large language models (LLMs) as well. Enter llms.txt — a proposed web standard designed to bridge the gap between traditional websites and AI systems.
In this comprehensive guide, we cover everything you need to know about llms.txt: what it is, how it works, how to implement it, who is using it, and whether it actually delivers measurable results.
What Is llms.txt?
llms.txt is a plain-text file written in Markdown format, placed at the root of a website (e.g., yoursite.com/llms.txt). Its purpose is to provide large language models with a structured, concise overview of a website’s most important content — making it easier for AI systems to understand, retrieve, and reference that information during inference.
Think of it as a curated table of contents specifically designed for AI consumption. While a traditional sitemap tells search engines which pages exist, and robots.txt tells crawlers which pages to avoid, llms.txt tells AI models which pages matter most and provides context about what each page contains.
The key distinction is that llms.txt is not about controlling access. It doesn’t block or allow anything. Instead, it curates and guides — pointing AI systems toward high-value, well-organized content.
Who Created llms.txt and Why?
The llms.txt proposal was introduced by Jeremy Howard, co-founder of Answer.AI and former president of fast.ai, on September 3, 2024. Howard identified a critical problem that LLMs face when trying to use website information:
- Context window limitations — Even the largest models cannot process an entire website in a single prompt. A model needs to quickly identify the most relevant pages.
- HTML complexity — Converting pages with navigation menus, advertisements, JavaScript widgets, and sidebars into clean text is both difficult and error-prone.
- Information overload — Without guidance, an LLM may parse dozens of irrelevant pages before finding what it actually needs.
Howard’s solution was elegant in its simplicity: a single Markdown file that gives AI models a structured roadmap of a website’s key resources, using a format that LLMs are already highly proficient at parsing.
llms.txt vs. robots.txt vs. sitemap.xml
One of the most common misconceptions about llms.txt is that it is “robots.txt for AI.” This comparison, while intuitive, is fundamentally incorrect. Each file serves a distinct purpose in the web ecosystem:
| Feature | robots.txt | sitemap.xml | llms.txt |
|---|---|---|---|
| Purpose | Controls crawler access | Helps crawlers discover pages | Curates content for AI models |
| Format | Plain text (custom syntax) | XML | Markdown |
| Introduced | 1994 | 2005 | 2024 |
| Audience | Search engine bots | Search engine bots | Large language models |
| Approach | Exclusion-based | Discovery-based | Curation-based |
| Industry Adoption | Universal | Universal | Emerging (~844K sites) |
| Enforced by AI Providers | Yes (by most crawlers) | Yes (by search engines) | Not officially yet |
| Training vs. Inference | Both | Both | Inference only |
As Search Engine Land aptly described it, “llms.txt isn’t robots.txt — it’s a treasure map for AI.”
The llms.txt File Specification
The official specification defines a precise Markdown structure that enables both human readability and programmatic parsing. Here is the required format:
Required and Optional Sections
| Section | Markdown Element | Required? | Description |
|---|---|---|---|
| Project Name | `#` H1 heading | Yes | The name of the project or website |
| Summary | `>` blockquote | No | A brief description of the project |
| Details | Paragraphs, lists | No | Additional context (no headings allowed here) |
| Content Sections | `##` H2 headings | No | Named groups of resource links |
| Optional Section | `## Optional` | No | Secondary resources that can be skipped in shorter contexts |
Example llms.txt File Structure
```markdown
# Acme Corporation

> Acme Corporation provides cloud-based project management
> tools for enterprise teams.

Acme was founded in 2018 and serves over 50,000 businesses worldwide. Our platform integrates with Slack, GitHub, and Jira.

## Core Documentation

- [Getting Started](https://acme.com/docs/start): Quick setup guide for new users
- [API Reference](https://acme.com/docs/api): Complete REST API documentation
- [Authentication](https://acme.com/docs/auth): OAuth2 and API key management

## Tutorials

- [Build Your First Project](https://acme.com/tutorials/first-project): Step-by-step walkthrough
- [Integrations Guide](https://acme.com/tutorials/integrations): Connect third-party tools

## Optional

- [Changelog](https://acme.com/changelog): Version history and release notes
- [Community Forum](https://acme.com/community): User discussions and support
```
Key Format Rules
- The file must use Markdown formatting — not HTML, XML, or JSON
- Only one H1 heading is allowed (the project name)
- Content sections use H2 (`##`) headings only
- Each resource link follows the format: `- [Name](URL): Optional description`
- The special `## Optional` section signals content that can be skipped when context windows are limited
- Websites should also provide Markdown versions of individual pages by appending `.md` to URLs
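These structural rules can be checked mechanically. The following is a minimal sketch in Python (the function name and checks are illustrative, not part of any official tooling) that validates the basic structure of an llms.txt file:

```python
import re

def validate_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in an llms.txt document."""
    problems = []
    lines = text.splitlines()

    # Rule: exactly one H1 heading (the project name).
    h1_count = sum(1 for l in lines if re.match(r"^# [^#]", l))
    if h1_count != 1:
        problems.append(f"expected exactly one H1 heading, found {h1_count}")

    # Rule: content sections use H2 headings only (no H3 or deeper).
    if any(re.match(r"^###", l) for l in lines):
        problems.append("headings deeper than H2 are not part of the spec")

    # Rule: resource links follow "- [Name](URL): optional description".
    link_re = re.compile(r"^- \[[^\]]+\]\([^)]+\)(: .*)?$")
    for l in lines:
        if l.startswith("- [") and not link_re.match(l):
            problems.append(f"malformed link line: {l!r}")

    return problems

sample = "# Acme\n\n> Project tools.\n\n## Docs\n- [Start](https://acme.com/start): Setup guide\n"
print(validate_llms_txt(sample))  # an empty list means no problems were found
```

A check like this is easy to wire into a CI step so the file cannot drift out of spec as it is edited.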
Understanding llms-full.txt
Alongside llms.txt, the specification also defines llms-full.txt — a companion file that contains the complete content of all documentation in a single Markdown file.
| Aspect | llms.txt | llms-full.txt |
|---|---|---|
| Content | Navigation structure with links | Full page content inline |
| File Size | Small (typically a few KB) | Large (can be hundreds of KB) |
| Use Case | Quick overview and navigation | Deep, comprehensive context |
| Best For | AI agents needing a starting point | AI tools that need everything at once |
The llms-full.txt concept was originally developed by Mintlify in collaboration with Anthropic, which needed a way to feed its entire documentation into LLMs without having to parse HTML. For example, Anthropic’s own llms-full.txt file at docs.claude.com/llms-full.txt contains over 481,000 tokens of content.
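Producing an llms-full.txt from an existing set of Markdown pages is simple to sketch. The function below is a hypothetical illustration (not Mintlify's or Anthropic's actual tooling) that concatenates page bodies under H2 headings, mirroring the table above: same project, full content inline rather than links:

```python
def build_llms_full(project: str, pages: list[tuple[str, str]]) -> str:
    """Concatenate (title, markdown_body) pairs into one llms-full.txt string."""
    parts = [f"# {project}"]
    for title, body in pages:
        parts.append(f"\n## {title}\n\n{body.strip()}")
    return "\n".join(parts) + "\n"

pages = [
    ("Getting Started", "Install the CLI, then run the setup command."),
    ("API Reference", "All endpoints accept JSON over HTTPS."),
]
print(build_llms_full("Acme Corporation", pages))
```

In practice the page bodies would come from the same Markdown sources that back the rendered documentation, which is why docs platforms can generate both files automatically.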
How to Create an llms.txt File (Step-by-Step)
Step 1: Inventory Your Content
Before writing a single line, audit your website and identify the pages that provide the most value. Prioritize documentation, guides, API references, product descriptions, and high-quality editorial content. Exclude pages like privacy policies, cookie notices, and administrative pages.
Step 2: Write the File
Create a plain text file named llms.txt using any text editor. Start with the required H1 heading, add an optional blockquote summary, and organize your links under H2 sections.
Step 3: Follow Formatting Best Practices
- Keep descriptions concise — aim for one sentence per link
- Use consistent formatting throughout the file
- Prioritize your most important resources in the first sections
- Use the `## Optional` section for supplementary content
- Avoid jargon or ambiguous terms without context
- Use absolute URLs rather than relative paths
Step 4: Upload and Validate
Place the file in your website’s root directory — the same location as robots.txt and index.html. Verify that:
- The file is accessible at `yoursite.com/llms.txt`
- It returns an HTTP 200 status code
- It is served with the `text/plain` or `text/markdown` MIME type
- It uses UTF-8 character encoding
- It requires no authentication to access
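This checklist is easy to automate. The sketch below (function names are illustrative) separates the validation logic from the network call so the logic can be tested offline; the fetch helper uses only the Python standard library:

```python
from urllib.request import urlopen

def check_response(status: int, content_type: str, body: bytes) -> list[str]:
    """Validate the HTTP-level requirements for a served llms.txt."""
    issues = []
    if status != 200:
        issues.append(f"expected HTTP 200, got {status}")
    # Content-Type may carry a charset parameter, e.g. "text/plain; charset=utf-8".
    mime = content_type.split(";")[0].strip()
    if mime not in ("text/plain", "text/markdown"):
        issues.append(f"unexpected MIME type: {mime}")
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        issues.append("body is not valid UTF-8")
    return issues

def check_live(url: str) -> list[str]:
    """Fetch a live llms.txt URL and run the checks above."""
    with urlopen(url) as resp:
        return check_response(resp.status, resp.headers.get("Content-Type", ""), resp.read())

# Offline example:
print(check_response(200, "text/plain; charset=utf-8", b"# Acme\n"))  # []
```

Running `check_live("https://yoursite.com/llms.txt")` against your own domain covers every item on the list except the authentication check, which you can verify by fetching from a logged-out session.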
Step 5: Keep It Updated
Unlike robots.txt, which rarely changes, your llms.txt file should evolve with your content. When you publish new documentation, deprecate old pages, or restructure your site, update llms.txt accordingly.
Real-World Examples from Major Companies
Several industry-leading companies have implemented llms.txt, providing useful reference implementations:
Anthropic (Claude AI)
Anthropic publishes both docs.claude.com/llms.txt and docs.claude.com/llms-full.txt. Their implementation covers the entire Claude API documentation, organized by topic areas like prompt engineering, tool use, and model capabilities. The llms.txt file is approximately 8,364 tokens, while the full version exceeds 481,000 tokens.
Cloudflare
Cloudflare’s implementation organizes content by product (Workers, AI Gateway, R2, etc.), making it straightforward for AI systems to fetch documentation about a specific service. They also provide per-product full files, such as developers.cloudflare.com/agents/llms-full.txt.
Stripe
Stripe’s llms.txt at docs.stripe.com/llms.txt structures content by product categories — from core payment processing to specialized tools like Stripe Climate. It demonstrates how companies with diverse product lines can organize a single file effectively.
Others
The llms-txt-hub on GitHub tracks over 784 documented implementations, including Vercel, Zapier, Coinbase, ElevenLabs, and many more.
Adoption Statistics and Current State
| Metric | Data |
|---|---|
| Websites with llms.txt (BuiltWith) | ~844,000+ |
| Documented implementations (GitHub) | 784+ |
| Adoption rate among surveyed domains | ~10.13% of 300K domains |
| Dominant adopter category | Developer tools, SaaS, AI companies |
| Major AI providers officially using it | None confirmed officially |
| Google’s stance | Compared it to deprecated keywords meta tag |
| Proposal date | September 3, 2024 |
The adoption pattern tells a clear story: llms.txt has achieved strong penetration in developer-facing and AI-adjacent industries while remaining largely absent from the mainstream web. Mid-tier sites adopt it at slightly higher rates than the largest websites, suggesting that nimble, forward-thinking teams are leading the charge.
It is worth noting that Google’s John Mueller has compared llms.txt to the deprecated keywords meta tag, raising concerns about potential abuse through cloaking — showing AI systems different content than what human users see.
Impact on SEO and Generative Engine Optimization (GEO)
One of the most debated aspects of llms.txt is whether it actually helps websites appear more frequently in AI-generated answers. The evidence is mixed.
What the Data Shows
OtterlyAI conducted a 90-day experiment tracking AI crawler behavior after implementing llms.txt. Their conclusion: implementing llms.txt had no measurable short-term impact on AI search visibility.
Server log analysis from multiple sources showed that major AI crawlers (GPTBot, ClaudeBot, PerplexityBot) did not visit llms.txt files with any notable frequency — if at all.
What Actually Drives AI Visibility
| Factor | Impact on AI Visibility | Priority |
|---|---|---|
| Content quality and depth | High | Critical |
| E-E-A-T signals | High | Critical |
| Traditional SEO fundamentals | High | Critical |
| Structured data / Schema.org | Medium-High | Important |
| Server-side rendering | Medium | Important |
| llms.txt implementation | Low (currently) | Nice to have |
The consensus among GEO experts is clear: llms.txt can be a useful component of an AI optimization strategy, but it should never replace the fundamentals. Content quality, authority, structured data, and traditional SEO remain the primary drivers of visibility in AI-generated responses.
Benefits and Limitations
Benefits
- Low implementation cost — Creating and maintaining an llms.txt file requires minimal time and no technical infrastructure changes
- Future-proofing — If AI providers adopt the standard, early implementers will already be prepared
- Improved developer experience — AI coding assistants can provide better suggestions when they have clean access to documentation
- Content curation signal — Forces website owners to think critically about which content is most important
- Agentic web readiness — As AI agents become more prevalent, having structured machine-readable content will become increasingly valuable
Limitations
- No official adoption — No major AI provider has publicly committed to reading llms.txt files
- No measurable SEO impact — Current data shows no correlation between llms.txt and improved AI search visibility
- Abuse potential — The file could be used for cloaking, showing AI models different content than what users see
- Maintenance overhead — The file must be updated whenever content changes, or it risks becoming stale and misleading
- False sense of security — Some organizations may over-invest in llms.txt while neglecting core content quality
Common Mistakes to Avoid
- Stuffing keywords into descriptions — AI models understand semantics; keyword stuffing provides no benefit and reduces readability
- Including every page — The whole point is curation. Only include your most valuable resources
- Using relative URLs — Always use absolute URLs to avoid ambiguity
- Forgetting to update — A stale llms.txt with broken links is worse than having no file at all
- Blocking llms.txt in robots.txt — Ensure your robots.txt does not disallow access to `/llms.txt` or any pages it references
- Treating it as an SEO silver bullet — llms.txt is one small piece of a much larger AI optimization strategy
- Using HTML instead of Markdown — The specification requires Markdown format
- Adding more than one H1 heading — Only one H1 is allowed per the specification
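The relative-URL mistake in particular is easy to catch automatically. This short sketch (names are illustrative) flags any Markdown link whose URL lacks a scheme and host:

```python
import re
from urllib.parse import urlparse

def find_relative_urls(text: str) -> list[str]:
    """Return URLs from Markdown links in llms.txt that are not absolute."""
    urls = re.findall(r"\[[^\]]+\]\(([^)]+)\)", text)
    return [u for u in urls if not (urlparse(u).scheme and urlparse(u).netloc)]

doc = "- [Docs](https://acme.com/docs): Guides\n- [Blog](/blog): Posts\n"
print(find_relative_urls(doc))  # ['/blog']
```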
Tools, Plugins, and Resources
| Tool | Type | Description |
|---|---|---|
| llmstxt.org | Specification | Official proposal and documentation |
| llms-txt-hub (GitHub) | Directory | Largest collection of llms.txt implementations |
| llms_txt2ctx | CLI / Python | Parse and process llms.txt files programmatically |
| Mintlify | Platform | Auto-generates and updates llms.txt for hosted docs |
| GitBook | Platform | Automatically generates llms.txt files |
| vitepress-plugin-llms | Plugin | VitePress plugin for llms.txt generation |
| LLMTEXT by Parallel AI | Open source toolkit | Generate llms.txt from existing website content |
Future Outlook: What Comes Next?
The future of llms.txt hinges on one key question: will a major AI provider officially adopt it?
Several scenarios are plausible:
- Mainstream adoption (2026–2027) — A major platform (likely Anthropic or Microsoft) announces official support, triggering a cascade of adoption similar to how Open Graph Protocol became standard after Facebook championed it
- Integration with MCP — llms.txt may merge with or become a component of the Model Context Protocol (MCP), becoming part of a broader AI interoperability framework
- Google’s A2A connection — Google has already included llms.txt in their Agent2Agent (A2A) protocol, signaling at least experimental interest
- Gradual irrelevance — AI models become sophisticated enough to parse any website format, making the structured file unnecessary
Vercel offers a compelling data point for optimism: the company claims 10% of their signups now come from ChatGPT, attributing this partially to their GEO efforts, which include llms.txt implementation.
Should You Implement llms.txt?
| Website Type | Recommendation | Priority |
|---|---|---|
| Developer tools / API products | Strongly recommended | High |
| Documentation-heavy SaaS | Strongly recommended | High |
| AI and tech companies | Recommended | Medium-High |
| Content publishers / blogs | Worth experimenting | Medium |
| E-commerce | Low priority | Low |
| Small business / local sites | Not necessary yet | Low |
The bottom line: if implementation takes less than an hour and your site contains valuable, structured content — especially technical documentation — there is little reason not to implement llms.txt. The downside risk is negligible, while the potential upside grows as AI systems continue to evolve.
Frequently Asked Questions
Does llms.txt affect my Google rankings?
No. There is no evidence that llms.txt influences traditional search rankings. Google has not adopted the standard and has expressed skepticism about its value.
Is llms.txt the same as robots.txt for AI?
No. robots.txt controls access (blocking or allowing crawlers), while llms.txt curates content (guiding AI models to the most useful pages). They serve fundamentally different purposes.
Can llms.txt prevent AI from training on my content?
No. llms.txt is designed for inference-time use, not training. To restrict training, use robots.txt directives or opt-out mechanisms provided by specific AI companies.
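For example, a robots.txt that blocks OpenAI's training crawler while still allowing its search crawler might look like the following (these user-agent tokens are the ones OpenAI publicly documents; check each vendor's current documentation before relying on them):

```
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
```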
Do any AI models actually read llms.txt?
As of early 2026, no major AI provider has officially confirmed using llms.txt during crawling or inference. However, Anthropic collaborated on the llms-full.txt concept, and Google included llms.txt in their A2A protocol, suggesting growing interest.
How often should I update my llms.txt file?
Update it whenever you publish significant new content, deprecate pages, or restructure your site. A quarterly review is a reasonable minimum cadence for most websites.
Can I have llms.txt on a subpath instead of the root?
Yes. The specification allows placement in subpaths (e.g., yoursite.com/docs/llms.txt), though root placement is standard.
What is the difference between llms.txt and llms-full.txt?
llms.txt provides a concise navigation structure with links, while llms-full.txt contains the full content of all referenced pages in a single file. Think of llms.txt as the table of contents and llms-full.txt as the entire book.