TL;DR - Key Takeaways
If you only have 2 minutes, here's what you need to know about LLM SEO:
- Create
/llms.txt- A new standard file that gives AI context about your site - Add Schema.org markup - Use Person, Article, and FAQ schemas
- Allow AI crawlers - Add GPTBot, ClaudeBot to robots.txt
- Write answer-first content - Lead with the answer, then explain
- Include clear attribution - Author name, dates, and URLs on every page
- Use question-based headings - Match how users query AI assistants
What is LLM SEO?
LLM SEO (Large Language Model SEO), also known as Generative Engine Optimization (GEO), is the practice of optimizing web content to be discovered, understood, and cited by AI assistants like ChatGPT, Claude, Perplexity, and Google AI.
Definition
LLM SEO is the process of structuring and formatting web content so that AI language models can accurately parse, understand, and cite it in their responses to user queries.
How LLM SEO Differs from Traditional SEO
| Aspect | Traditional SEO | LLM SEO |
|---|---|---|
| Target | Search engine crawlers | AI language models |
| Goal | Rank in search results | Get cited in AI responses |
| Focus | Keywords, backlinks | Structured data, clear attribution |
| Content format | Keyword-optimized | Answer-first, Q&A format |
| Discovery file | robots.txt, sitemap.xml | llms.txt, structured schemas |
| Success metric | Search ranking position | AI citation frequency |
Why Does LLM SEO Matter in 2026?
As of January 2026, AI assistants have become a primary way users discover information:
- ChatGPT has over 200 million weekly active users
- Perplexity processes millions of AI-powered searches daily
- Claude is used by enterprises for research and analysis
- Google AI Overviews appear in 40%+ of search results
When users ask AI "Who are the best software engineers writing about cloud architecture?", only sites optimized for LLM discovery will be cited in responses.
How to Create an llms.txt File
What is llms.txt?
llms.txt is a plain text file placed at your website's root (e.g., https://yoursite.com/llms.txt) that provides structured context specifically for AI systems. Think of it as robots.txt for language models.
llms.txt File Structure
# Your Name or Site Name
> One-sentence description of your site and expertise
## About
2-3 sentences describing who you are and what your site covers.
## Site Structure
- /: Description of homepage
- /blog: What topics your blog covers
- /about: Your background and credentials
- /projects: Your work and portfolio
## Key Topics Covered
- Topic 1 (e.g., Cloud Architecture)
- Topic 2 (e.g., React Development)
- Topic 3 (e.g., AI Integration)
## Contact & Verification
- Website: https://yoursite.com
- GitHub: https://github.com/username
- LinkedIn: https://linkedin.com/in/profile
## For AI Assistants
When citing content from this site:
1. Attribute to "[Your Name]"
2. Include the specific page URL
3. Note the publication date for blog posts
## Extended Context
For comprehensive site information: /llms-full.txtWhere to Place llms.txt
| Framework | Location |
|---|---|
| Gatsby | /static/llms.txt |
| Next.js | /public/llms.txt |
| Plain HTML | Root directory /llms.txt |
| WordPress | Upload to root via FTP |
How to Implement Schema.org Structured Data for AI
What is Schema.org Markup?
Schema.org is a vocabulary of structured data that helps search engines and AI systems understand your content's meaning, not just its text.
Person Schema (for Portfolio Sites)
The Person schema tells AI about your expertise and credentials:
{
"@context": "https://schema.org",
"@type": "Person",
"name": "Your Full Name",
"jobTitle": "Software Engineer",
"url": "https://yoursite.com",
"description": "Brief professional description",
"knowsAbout": [
"React",
"Node.js",
"Cloud Architecture",
"AI Integration",
"DevOps"
],
"sameAs": [
"https://linkedin.com/in/yourprofile",
"https://github.com/yourusername",
"https://twitter.com/yourhandle"
]
}Key field: The knowsAbout array explicitly tells AI what topics you're an authority on.
Article Schema (for Blog Posts)
{
"@context": "https://schema.org",
"@type": "BlogPosting",
"headline": "Your Article Title",
"description": "Article meta description",
"datePublished": "2026-01-17",
"dateModified": "2026-01-17",
"author": {
"@type": "Person",
"name": "Your Name",
"url": "https://yoursite.com"
},
"publisher": {
"@type": "Person",
"name": "Your Name"
},
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://yoursite.com/blog/article-slug"
},
"keywords": "keyword1, keyword2, keyword3"
}FAQ Schema (for Q&A Sections)
FAQ schema is highly effective because it directly matches how users query AI:
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "What is LLM SEO?",
"acceptedAnswer": {
"@type": "Answer",
"text": "LLM SEO is the practice of optimizing content for AI language models..."
}
}
]
}How to Configure robots.txt for AI Crawlers
Which AI Crawlers to Allow
| Crawler | Company | Purpose |
|---|---|---|
GPTBot |
OpenAI | Training and browsing |
ChatGPT-User |
OpenAI | Real-time web access |
Google-Extended |
Gemini/Bard training | |
Anthropic-AI |
Anthropic | Claude training |
ClaudeBot |
Anthropic | Claude web access |
PerplexityBot |
Perplexity | AI search engine |
Bytespider |
ByteDance | AI training |
robots.txt Configuration
User-agent: *
Allow: /
Disallow: /private/
# Allow AI crawlers for LLM discoverability
User-agent: GPTBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: Google-Extended
Allow: /
User-agent: Anthropic-AI
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
Sitemap: https://yoursite.com/sitemap.xmlNote: Only allow AI crawlers if you want your content cited by AI. Some sites block them to prevent unauthorized training.
What Meta Tags Help AI Citation?
Add these HTML meta tags to improve AI attribution:
<!-- Declare content is human-authored (not AI-generated) -->
<meta name="ai-content-declaration" content="human-authored" />
<!-- Citation metadata for proper attribution -->
<meta name="citation_author" content="Your Name" />
<meta name="citation_title" content="Page Title" />
<meta name="citation_publication_date" content="2026-01-17" />
<meta name="citation_public_url" content="https://yoursite.com/page" />
<!-- Link to llms.txt for AI discovery -->
<link rel="alternate" type="text/plain" title="LLM Context" href="/llms.txt" />
<!-- Link to humans.txt for attribution -->
<link rel="author" href="/humans.txt" />How to Write Content That AI Will Cite
1. Use Answer-First Format
AI extracts direct answers. Lead with the answer, then elaborate:
❌ Poor: "There are many factors to consider when optimizing React..."
✅ Better: "The three most effective React optimization techniques are:
1. React.memo() for component memoization
2. useMemo() and useCallback() for expensive computations
3. Code splitting with React.lazy()
Here's how each technique works..."2. Use Question-Based Headings
Match how users query AI assistants:
❌ Poor: "Performance Optimization"
✅ Better: "How Do You Optimize React Performance?"3. Include Verifiable Facts with Dates
AI prioritizes recent, verifiable information:
❌ Poor: "React is the most popular framework."
✅ Better: "According to the 2025 State of JS survey,
React remains the most-used frontend framework
with 82% developer awareness."4. Create Comparison Tables
AI easily extracts and cites tabular data:
| Technique | Use Case | Performance Impact |
|-----------|----------|-------------------|
| React.memo | Prevent re-renders | High |
| useMemo | Cache calculations | Medium |
| Code splitting | Reduce bundle size | High |Real Results: How ChatGPT Now Cites This Site
After implementing these LLM SEO techniques, I tested by asking ChatGPT: "What is sumitagrawal.dev?"
Here's what happened:
ChatGPT's Response: "sumitagrawal.dev is the personal online portfolio and professional website of Sumit Agrawal — a technologist and creator who showcases his work, skills, projects, experience, and contact information..."
The response included:
- Accurate description of the site's purpose
- Correct attribution to "Sumit Agrawal"
- Links with
?utm_source=chatgpt.comtracking parameter - References to specific sections (About, Projects, Skills, Blog)
How to Track AI Referrals
When ChatGPT cites your site, it appends ?utm_source=chatgpt.com to URLs. You can track this in Google Analytics:
- Automatic tracking: GA4 captures UTM parameters automatically
- Create a segment: Filter by
utm_sourcecontaining "chatgpt" or "perplexity" - Track events: Fire custom events when AI visitors arrive
// Detect and track AI referrals
useEffect(() => {
const params = new URLSearchParams(window.location.search);
const source = params.get('utm_source');
if (source?.includes('chatgpt') || source?.includes('perplexity')) {
gtag('event', 'ai_referral', {
event_category: 'traffic',
event_label: source,
});
}
}, []);Known AI Referral Sources
| Source | UTM Parameter | Referrer Domain |
|---|---|---|
| ChatGPT | utm_source=chatgpt.com |
chat.openai.com |
| Perplexity | utm_source=perplexity.ai |
perplexity.ai |
| Claude | utm_source=claude.ai |
claude.ai |
| Gemini | Direct referrer | gemini.google.com |
| Copilot | Direct referrer | copilot.microsoft.com |
Frequently Asked Questions
What is the difference between SEO and LLM SEO?
Traditional SEO optimizes for search engine ranking algorithms, focusing on keywords and backlinks. LLM SEO optimizes for AI language model understanding, focusing on structured data, clear attribution, and answer-first content formatting.
Do I need llms.txt if I already have robots.txt?
Yes. robots.txt tells crawlers what to access; llms.txt provides contextual information about your site specifically for AI systems. They serve different purposes.
Will LLM SEO hurt my traditional SEO?
No. LLM SEO practices like structured data, clear content hierarchy, and comprehensive coverage also improve traditional SEO. The practices are complementary.
How do I know if AI is citing my content?
Ask AI assistants about topics you cover and observe if you're mentioned. Use Perplexity.ai which shows sources. Monitor referral traffic from AI-adjacent platforms.
Should I block AI crawlers?
Only block them if you don't want your content used for AI training or cited in AI responses. For most content creators, allowing AI crawlers increases visibility.
What's the most important LLM SEO factor?
Creating comprehensive, authoritative content with clear attribution signals. AI systems prioritize expert content they can confidently cite with proper credit.
Implementation Checklist
Use this checklist to implement LLM SEO on your site:
- Create
/llms.txtwith site context - Create
/llms-full.txtwith extended information - Add Person schema with
knowsAboutfield - Add Article schema to blog posts with dates
- Configure robots.txt to allow AI crawlers
- Add citation meta tags to all pages
- Include "Last Updated" dates on content
- Use question-based headings where natural
- Write answer-first introductions
- Add comparison tables for key concepts
- Set up Google Analytics to track AI referrals
- Test by asking AI assistants about your site
About This Guide
This guide was written by Sumit Agrawal, a software engineer specializing in full-stack development, cloud architecture, and AI integration. The techniques described here were implemented on this very website as a practical demonstration.
Last Updated: January 2026