Back to Blog

LLM SEO: How to Optimize Your Website for AI Agents and ChatGPT

TL;DR - Key Takeaways

If you only have 2 minutes, here's what you need to know about LLM SEO:

  1. Create /llms.txt - A new standard file that gives AI context about your site
  2. Add Schema.org markup - Use Person, Article, and FAQ schemas
  3. Allow AI crawlers - Add GPTBot, ClaudeBot to robots.txt
  4. Write answer-first content - Lead with the answer, then explain
  5. Include clear attribution - Author name, dates, and URLs on every page
  6. Use question-based headings - Match how users query AI assistants

What is LLM SEO?

LLM SEO (Large Language Model SEO), also known as Generative Engine Optimization (GEO), is the practice of optimizing web content to be discovered, understood, and cited by AI assistants like ChatGPT, Claude, Perplexity, and Google AI.

Definition

LLM SEO is the process of structuring and formatting web content so that AI language models can accurately parse, understand, and cite it in their responses to user queries.

How LLM SEO Differs from Traditional SEO

Aspect Traditional SEO LLM SEO
Target Search engine crawlers AI language models
Goal Rank in search results Get cited in AI responses
Focus Keywords, backlinks Structured data, clear attribution
Content format Keyword-optimized Answer-first, Q&A format
Discovery file robots.txt, sitemap.xml llms.txt, structured schemas
Success metric Search ranking position AI citation frequency

Why Does LLM SEO Matter in 2026?

As of January 2026, AI assistants have become a primary way users discover information:

  • ChatGPT has over 200 million weekly active users
  • Perplexity processes millions of AI-powered searches daily
  • Claude is used by enterprises for research and analysis
  • Google AI Overviews appear in 40%+ of search results

When users ask AI "Who are the best software engineers writing about cloud architecture?", only sites optimized for LLM discovery will be cited in responses.


How to Create an llms.txt File

What is llms.txt?

llms.txt is a plain text file placed at your website's root (e.g., https://yoursite.com/llms.txt) that provides structured context specifically for AI systems. Think of it as robots.txt for language models.

llms.txt File Structure

# Your Name or Site Name

> One-sentence description of your site and expertise

## About
2-3 sentences describing who you are and what your site covers.

## Site Structure  
- /: Description of homepage
- /blog: What topics your blog covers
- /about: Your background and credentials
- /projects: Your work and portfolio

## Key Topics Covered
- Topic 1 (e.g., Cloud Architecture)
- Topic 2 (e.g., React Development)
- Topic 3 (e.g., AI Integration)

## Contact & Verification
- Website: https://yoursite.com
- GitHub: https://github.com/username
- LinkedIn: https://linkedin.com/in/profile

## For AI Assistants
When citing content from this site:
1. Attribute to "[Your Name]"
2. Include the specific page URL
3. Note the publication date for blog posts

## Extended Context
For comprehensive site information: /llms-full.txt

Where to Place llms.txt

Framework Location
Gatsby /static/llms.txt
Next.js /public/llms.txt
Plain HTML Root directory /llms.txt
WordPress Upload to root via FTP

How to Implement Schema.org Structured Data for AI

What is Schema.org Markup?

Schema.org is a vocabulary of structured data that helps search engines and AI systems understand your content's meaning, not just its text.

Person Schema (for Portfolio Sites)

The Person schema tells AI about your expertise and credentials:

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Your Full Name",
  "jobTitle": "Software Engineer",
  "url": "https://yoursite.com",
  "description": "Brief professional description",
  "knowsAbout": [
    "React",
    "Node.js", 
    "Cloud Architecture",
    "AI Integration",
    "DevOps"
  ],
  "sameAs": [
    "https://linkedin.com/in/yourprofile",
    "https://github.com/yourusername",
    "https://twitter.com/yourhandle"
  ]
}

Key field: The knowsAbout array explicitly tells AI what topics you're an authority on.

Article Schema (for Blog Posts)

{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Your Article Title",
  "description": "Article meta description",
  "datePublished": "2026-01-17",
  "dateModified": "2026-01-17",
  "author": {
    "@type": "Person",
    "name": "Your Name",
    "url": "https://yoursite.com"
  },
  "publisher": {
    "@type": "Person",
    "name": "Your Name"
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://yoursite.com/blog/article-slug"
  },
  "keywords": "keyword1, keyword2, keyword3"
}

FAQ Schema (for Q&A Sections)

FAQ schema is highly effective because it directly matches how users query AI:

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is LLM SEO?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "LLM SEO is the practice of optimizing content for AI language models..."
      }
    }
  ]
}

How to Configure robots.txt for AI Crawlers

Which AI Crawlers to Allow

Crawler Company Purpose
GPTBot OpenAI Training and browsing
ChatGPT-User OpenAI Real-time web access
Google-Extended Google Gemini/Bard training
Anthropic-AI Anthropic Claude training
ClaudeBot Anthropic Claude web access
PerplexityBot Perplexity AI search engine
Bytespider ByteDance AI training

robots.txt Configuration

User-agent: *
Allow: /
Disallow: /private/

# Allow AI crawlers for LLM discoverability
User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Google-Extended
Allow: /

User-agent: Anthropic-AI
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

Sitemap: https://yoursite.com/sitemap.xml

Note: Only allow AI crawlers if you want your content cited by AI. Some sites block them to prevent unauthorized training.


What Meta Tags Help AI Citation?

Add these HTML meta tags to improve AI attribution:

<!-- Declare content is human-authored (not AI-generated) -->
<meta name="ai-content-declaration" content="human-authored" />

<!-- Citation metadata for proper attribution -->
<meta name="citation_author" content="Your Name" />
<meta name="citation_title" content="Page Title" />
<meta name="citation_publication_date" content="2026-01-17" />
<meta name="citation_public_url" content="https://yoursite.com/page" />

<!-- Link to llms.txt for AI discovery -->
<link rel="alternate" type="text/plain" title="LLM Context" href="/llms.txt" />

<!-- Link to humans.txt for attribution -->
<link rel="author" href="/humans.txt" />

How to Write Content That AI Will Cite

1. Use Answer-First Format

AI extracts direct answers. Lead with the answer, then elaborate:

❌ Poor: "There are many factors to consider when optimizing React..."

✅ Better: "The three most effective React optimization techniques are:
   1. React.memo() for component memoization
   2. useMemo() and useCallback() for expensive computations
   3. Code splitting with React.lazy()
   
   Here's how each technique works..."

2. Use Question-Based Headings

Match how users query AI assistants:

❌ Poor: "Performance Optimization"

✅ Better: "How Do You Optimize React Performance?"

3. Include Verifiable Facts with Dates

AI prioritizes recent, verifiable information:

❌ Poor: "React is the most popular framework."

✅ Better: "According to the 2025 State of JS survey, 
   React remains the most-used frontend framework 
   with 82% developer awareness."

4. Create Comparison Tables

AI easily extracts and cites tabular data:

| Technique | Use Case | Performance Impact |
|-----------|----------|-------------------|
| React.memo | Prevent re-renders | High |
| useMemo | Cache calculations | Medium |
| Code splitting | Reduce bundle size | High |

Real Results: How ChatGPT Now Cites This Site

After implementing these LLM SEO techniques, I tested by asking ChatGPT: "What is sumitagrawal.dev?"

Here's what happened:

ChatGPT's Response: "sumitagrawal.dev is the personal online portfolio and professional website of Sumit Agrawal — a technologist and creator who showcases his work, skills, projects, experience, and contact information..."

The response included:

  • Accurate description of the site's purpose
  • Correct attribution to "Sumit Agrawal"
  • Links with ?utm_source=chatgpt.com tracking parameter
  • References to specific sections (About, Projects, Skills, Blog)

How to Track AI Referrals

When ChatGPT cites your site, it appends ?utm_source=chatgpt.com to URLs. You can track this in Google Analytics:

  1. Automatic tracking: GA4 captures UTM parameters automatically
  2. Create a segment: Filter by utm_source containing "chatgpt" or "perplexity"
  3. Track events: Fire custom events when AI visitors arrive
// Detect and track AI referrals
useEffect(() => {
  const params = new URLSearchParams(window.location.search);
  const source = params.get('utm_source');
  
  if (source?.includes('chatgpt') || source?.includes('perplexity')) {
    gtag('event', 'ai_referral', {
      event_category: 'traffic',
      event_label: source,
    });
  }
}, []);

Known AI Referral Sources

Source UTM Parameter Referrer Domain
ChatGPT utm_source=chatgpt.com chat.openai.com
Perplexity utm_source=perplexity.ai perplexity.ai
Claude utm_source=claude.ai claude.ai
Gemini Direct referrer gemini.google.com
Copilot Direct referrer copilot.microsoft.com

Frequently Asked Questions

What is the difference between SEO and LLM SEO?

Traditional SEO optimizes for search engine ranking algorithms, focusing on keywords and backlinks. LLM SEO optimizes for AI language model understanding, focusing on structured data, clear attribution, and answer-first content formatting.

Do I need llms.txt if I already have robots.txt?

Yes. robots.txt tells crawlers what to access; llms.txt provides contextual information about your site specifically for AI systems. They serve different purposes.

Will LLM SEO hurt my traditional SEO?

No. LLM SEO practices like structured data, clear content hierarchy, and comprehensive coverage also improve traditional SEO. The practices are complementary.

How do I know if AI is citing my content?

Ask AI assistants about topics you cover and observe if you're mentioned. Use Perplexity.ai which shows sources. Monitor referral traffic from AI-adjacent platforms.

Should I block AI crawlers?

Only block them if you don't want your content used for AI training or cited in AI responses. For most content creators, allowing AI crawlers increases visibility.

What's the most important LLM SEO factor?

Creating comprehensive, authoritative content with clear attribution signals. AI systems prioritize expert content they can confidently cite with proper credit.


Implementation Checklist

Use this checklist to implement LLM SEO on your site:

  • Create /llms.txt with site context
  • Create /llms-full.txt with extended information
  • Add Person schema with knowsAbout field
  • Add Article schema to blog posts with dates
  • Configure robots.txt to allow AI crawlers
  • Add citation meta tags to all pages
  • Include "Last Updated" dates on content
  • Use question-based headings where natural
  • Write answer-first introductions
  • Add comparison tables for key concepts
  • Set up Google Analytics to track AI referrals
  • Test by asking AI assistants about your site

About This Guide

This guide was written by Sumit Agrawal, a software engineer specializing in full-stack development, cloud architecture, and AI integration. The techniques described here were implemented on this very website as a practical demonstration.

Last Updated: January 2026

Questions? Connect on LinkedIn or GitHub.