How Claude retrieves and cites sources
Getting cited by Claude starts with content that Anthropic's web search can fetch, parse into a clean factual claim, and attribute back to your URL. Claude is not itself a live search index; when a user asks about something current, Claude issues a web search, reads the top results, and quotes the passages that most directly answer the question. Get the answer-first structure right and your page can become the snippet Claude shows with a clickable citation.
Claude's retrieval pipeline runs in three stages. First, ClaudeBot crawls and caches public web pages to inform training and, increasingly, grounding. Second, when web search is enabled, Claude fetches live pages through Claude-User (the on-demand fetch agent) at query time. Third, Claude synthesizes an answer and renders inline citations linking to the source pages it relied on.
The practical takeaway: two different agents touch your site. The training/caching crawler and the real-time fetcher each respect their own robots.txt directives. Block either one and you remove yourself from the candidate pool. Allow both, and you become eligible to be quoted.
Citations are awarded to the page that states the answer most clearly and most factually — not the page with the most backlinks.
Step 1: Allow ClaudeBot and Claude-User to crawl
Allowing ClaudeBot to crawl your site is the non-negotiable first step to getting cited by Claude. If your robots.txt disallows Anthropic's agents, your page can never enter Claude's retrieval set, no matter how good the content is. Many sites block AI crawlers by default through a CDN setting or a blanket User-agent: * rule and never realize they have opted out of AI search entirely.
Anthropic uses three distinct user agents you should explicitly allow:
- ClaudeBot: the crawler that collects public web pages to inform training.
- Claude-User: the real-time fetcher triggered when a user's prompt needs live web data.
- Claude-SearchBot: indexes pages to power Claude's search results.
Add this to your robots.txt to explicitly welcome all three:
User-agent: ClaudeBot
Allow: /
User-agent: Claude-User
Allow: /
User-agent: Claude-SearchBot
Allow: /

Then confirm nothing upstream is silently blocking them. Cloudflare, Vercel, and Fastly all ship managed bot-blocking rules that catch AI crawlers. For a full agent-by-agent reference, see our AI crawler allowlist guide, and run a free SEO + GEO audit to detect blocked AI bots automatically.
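One way to sanity-check the rules before deploying them is to feed the robots.txt text to Python's standard-library `urllib.robotparser` and ask whether each agent may fetch a page. This is a minimal sketch, not Anthropic's own logic: the helper name `allowed_agents`, the sample files, and the `https://example.com/` URL are all assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The three Anthropic user agents named in this guide.
ANTHROPIC_AGENTS = ("ClaudeBot", "Claude-User", "Claude-SearchBot")


def allowed_agents(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Map each Anthropic agent to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in ANTHROPIC_AGENTS}


# Hypothetical file with a blanket rule that opts out of every crawler.
BLOCKING = """\
User-agent: *
Disallow: /
"""

# Hypothetical file that welcomes Anthropic's agents explicitly
# while still blocking everything else.
WELCOMING = """\
User-agent: ClaudeBot
Allow: /

User-agent: Claude-User
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: *
Disallow: /
"""

print(allowed_agents(BLOCKING))
print(allowed_agents(WELCOMING))
```

A check like this only covers robots.txt. A CDN or firewall rule can still reject the request before robots.txt is ever consulted, so it complements a look at the server logs rather than replacing one.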
- Allow the crawlers: robots.txt explicitly allows ClaudeBot, Claude-User, and Claude-SearchBot.
- Claude fetches at query time: A user's prompt triggers a web search and Claude-User fetches your live page.
- Claude parses passages: Claude scores each paragraph on how directly and independently it answers the prompt.
- Trust check: E-E-A-T signals (author, dates, publisher) decide whether the page is quotable.
- Answer-first paragraph wins: The clearest standalone factual paragraph becomes the quoted passage.
- Inline citation rendered: Claude links the answer back to your URL as a clickable source.
Step 2: Write answer-first, standalone content
Answer-first content gets cited by Claude because Claude lifts the single paragraph that resolves a question without needing surrounding context. When Claude reads your page during a web search, it scores each passage on how completely and independently it answers the user's prompt. A paragraph that opens with the direct answer, names its subject explicitly, and avoids pronouns like "this" or "it" is far more quotable than a paragraph that builds toward a conclusion.
Structure every key section so the first sentence is the answer and the rest is support. Lead with the entity name, state the fact, then qualify. This is the island test: each paragraph should make sense if it were the only thing Claude pasted into its response. We break down the technique in the island test for GEO.
Three rules make a paragraph standalone:
- Name the subject explicitly instead of relying on pronouns like "this" or "it."
- State one verifiable fact per paragraph, with a number or named entity where possible.
- Avoid forward references: no "as we'll see below" or "the following."
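The rules above, together with the earlier advice to avoid pronouns such as "this" or "it," can be turned into a rough lint pass over draft paragraphs. The sketch below is a crude heuristic written for illustration; it says nothing about how Claude actually scores passages. The function name `island_test_issues`, the pronoun list, and the capitalized-word check for named entities are assumptions.

```python
import re

# Phrases the rules above call out as forward references.
FORWARD_REFERENCES = ("as we'll see below", "the following")
# Pronouns that leave the subject unnamed when they open a paragraph (assumed list).
VAGUE_OPENERS = {"this", "it", "that", "these", "those", "they"}


def island_test_issues(paragraph: str) -> list:
    """Return heuristic reasons a paragraph may not stand alone."""
    text = paragraph.strip()
    words = text.split()
    issues = []

    # Rule: name the subject instead of opening with a pronoun.
    first_word = re.match(r"[A-Za-z']+", text)
    if first_word and first_word.group(0).lower() in VAGUE_OPENERS:
        issues.append("opens with a pronoun instead of naming the subject")

    # Rule: no forward references.
    for phrase in FORWARD_REFERENCES:
        if phrase in text.lower():
            issues.append(f"contains a forward reference: {phrase!r}")

    # Rule: include a number or a named entity. A capitalized word that
    # does not start a sentence is used as a rough stand-in for an entity.
    has_number = any(ch.isdigit() for ch in text)
    has_entity = any(
        word[0].isupper() and not prev.endswith((".", "!", "?"))
        for prev, word in zip(words, words[1:])
    )
    if not (has_number or has_entity):
        issues.append("no number or capitalized name found")

    return issues


print(island_test_issues("ClaudeBot is the Anthropic crawler that collects public web pages."))
print(island_test_issues("It works as we'll see below."))
```

Treat the output as a prompt for a human edit. A phrase like "the following" is sometimes legitimate, so a flag is a reason to reread the paragraph, not proof that it fails.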
Add structured data so Claude can disambiguate your claims. Valid Article and FAQPage JSON-LD, with the author marked up as a Person, tells the model who wrote the page and when. Our JSON-LD required fields guide covers the schema Claude and other engines actually parse.
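As an illustration of what that markup can look like, the sketch below builds a small schema.org Article object and wraps it in the script tag that carries JSON-LD. It assumes the author is expressed as a schema.org Person and the publisher as an Organization; the function name `article_jsonld` and every sample value (names, dates, URL) are placeholders, not recommendations from any engine.

```python
import json


def article_jsonld(headline, url, author_name, publisher_name,
                   date_published, date_modified):
    """Build a minimal schema.org Article object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "datePublished": date_published,  # ISO 8601 date string
        "dateModified": date_modified,    # ISO 8601 date string
        "author": {"@type": "Person", "name": author_name},
        "publisher": {"@type": "Organization", "name": publisher_name},
    }


# Placeholder values for illustration only.
markup = article_jsonld(
    headline="How Claude retrieves and cites sources",
    url="https://example.com/claude-citations",
    author_name="Jane Example",
    publisher_name="Example Publishing",
    date_published="2025-01-15",
    date_modified="2025-03-01",
)

OPEN_TAG = '<script type="application/ld+json">'
CLOSE_TAG = "</script>"
script_tag = OPEN_TAG + json.dumps(markup, indent=2) + CLOSE_TAG
print(script_tag)
```

Generating the block from the same data that renders the visible byline and dates keeps the machine-readable and human-visible versions from drifting apart.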
Step 3: Prove E-E-A-T so Claude trusts the source
Claude weighs E-E-A-T signals when deciding which sources to cite, favoring pages with a named author, real publisher identity, and verifiable expertise. Experience, Expertise, Authoritativeness, and Trust are not just Google ranking concepts — Anthropic trains Claude to prefer trustworthy, attributable sources and to avoid quoting anonymous or low-quality pages. A page with a visible byline, an author bio, and a clear publish date outcompetes an identical page with none of those.
Make trust signals machine-readable and human-visible at the same time:
- Dates: visible datePublished and dateModified so Claude can judge freshness.
- Citations: link out to primary sources; Claude rewards pages that themselves cite evidence.
- Publisher identity: an Organization block with a logo and contact path.
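A quick presence check can catch pages where those signals never made it into the markup. The sketch below takes an already-parsed Article JSON-LD object and lists which fields are absent. It is a hypothetical helper, not a validator: it assumes a single author object rather than a list, and it treats either `contactPoint` or `url` on the publisher as the "contact path," which is an interpretation rather than a rule.

```python
def missing_trust_signals(article: dict) -> list:
    """List the trust-related JSON-LD fields that are absent or empty."""
    missing = []

    # Named author (assumes a single author object, not a list).
    author = article.get("author") or {}
    if not author.get("name"):
        missing.append("author.name")

    # Publication and modification dates.
    for field in ("datePublished", "dateModified"):
        if not article.get(field):
            missing.append(field)

    # Publisher identity: Organization type, logo, and a contact path.
    publisher = article.get("publisher") or {}
    if publisher.get("@type") != "Organization":
        missing.append("publisher.@type")
    if not publisher.get("logo"):
        missing.append("publisher.logo")
    if not (publisher.get("contactPoint") or publisher.get("url")):
        missing.append("publisher.contactPoint or publisher.url")

    return missing


# Placeholder page data: everything is present except the publisher logo.
page = {
    "@type": "Article",
    "author": {"@type": "Person", "name": "Jane Example"},
    "datePublished": "2025-01-15",
    "dateModified": "2025-03-01",
    "publisher": {
        "@type": "Organization",
        "name": "Example Publishing",
        "url": "https://example.com/contact",
    },
}

print(missing_trust_signals(page))
```

The check covers only the machine-readable half. The byline, dates, and publisher details still need to be visible on the rendered page.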
Freshness matters more for Claude than for static search. Because Claude prefers current information when answering time-sensitive queries, keep an honest dateModified and update facts when they change. Read what E-E-A-T means in SEO for the full checklist, and check your pages with our E-E-A-T author check.
Is optimizing for Claude different from ChatGPT and Perplexity?
Optimizing for Claude shares a core with ChatGPT and Perplexity — answer-first, standalone, well-structured content wins everywhere — but the crawler names and citation behavior differ. Each engine uses its own user agents, so an allowlist that opens the door to Claude does nothing for OpenAI's GPTBot or Perplexity's PerplexityBot. The content principles transfer; the technical allowlist does not.
| Engine | Crawler / fetcher | Cites sources? | Best lever |
|---|---|---|---|
| Claude | ClaudeBot, Claude-User, Claude-SearchBot | Yes, with web search on | Answer-first + E-E-A-T |
| ChatGPT | GPTBot, OAI-SearchBot | Yes, in search mode | Structured, current content |
| Perplexity | PerplexityBot, Perplexity-User | Yes, always | Concise factual passages |
The biggest practical difference is the crawler matrix. Claude uses ClaudeBot, Claude-User, and Claude-SearchBot; ChatGPT uses GPTBot and OAI-SearchBot; Perplexity uses PerplexityBot and Perplexity-User. Allow all of them explicitly in robots.txt rather than relying on a wildcard. For engine-specific playbooks, see how to rank in ChatGPT and how to get cited by Perplexity.
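Because the allowlist has to name each agent separately, it can be easier to generate the robots.txt groups from one table than to maintain them by hand. The sketch below does that for the agents listed in the comparison table. The names `AI_AGENTS` and `build_allow_rules` are hypothetical, and the agent list is only as current as this article.

```python
# Agents taken from the comparison table above, grouped by engine.
AI_AGENTS = {
    "Claude": ["ClaudeBot", "Claude-User", "Claude-SearchBot"],
    "ChatGPT": ["GPTBot", "OAI-SearchBot"],
    "Perplexity": ["PerplexityBot", "Perplexity-User"],
}


def build_allow_rules(agents_by_engine: dict) -> str:
    """Return robots.txt text with one explicit Allow group per agent."""
    groups = []
    for engine, agents in agents_by_engine.items():
        for agent in agents:
            # The leading '#' line is a robots.txt comment naming the engine.
            groups.append(f"# {engine}\nUser-agent: {agent}\nAllow: /")
    return "\n\n".join(groups) + "\n"


rules = build_allow_rules(AI_AGENTS)
print(rules)
```

The generated text is meant to be appended to an existing robots.txt, not to replace it; any Disallow rules for private paths still belong in the file.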
Zoom out and the unifying discipline is generative engine optimization — structuring content so any LLM can retrieve and quote it. Our generative engine optimization pillar ties the tactics together across Claude, ChatGPT, Perplexity, and Google AI Overviews.