SEO & AI
GPTBot, ClaudeBot, PerplexityBot: should you block them?
Blocking AI bots in robots.txt is tempting. But they don't all do the same thing: some train models, others fetch your page to cite it in real time. Confusing them means making yourself invisible in AI answers for nothing.
A trend is sweeping SEO forums: adding a list of User-agent lines to your robots.txt to "block AI". The intent is understandable: why let models train for free on your content? But applied indiscriminately, it has a perverse effect: it also erases you from the answers people actually read.
The distinction that changes everything: training vs citation
Most AI publishers run two families of bots that serve unrelated purposes:
- Training bots ingest content to feed future models. Blocking them protects your content from training — without changing your presence in current answers.
- Citation bots fetch your page at the moment a user asks a question, to summarise and cite it as a source. Blocking them means disappearing from answers.
That's where all the confusion comes from: GPTBot (OpenAI training) and OAI-SearchBot (citation in ChatGPT Search) both belong to OpenAI and are easy to mix up, but blocking the first doesn't affect the second.
The AI user-agent table (2026)
| User-agent | Publisher | Role | Blocking it means… |
|---|---|---|---|
| GPTBot | OpenAI | Training | Refusing training, without leaving ChatGPT Search |
| OAI-SearchBot | OpenAI | ChatGPT Search index | Becoming invisible in ChatGPT Search |
| ChatGPT-User | OpenAI | On-demand fetch | Stopping ChatGPT from opening your link |
| ClaudeBot | Anthropic | Training | Refusing training |
| Claude-User | Anthropic | On-demand fetch | Stopping Claude from citing you |
| PerplexityBot | Perplexity | Index | Leaving the Perplexity index |
| Perplexity-User | Perplexity | On-demand fetch | Stopping Perplexity from citing you |
| Google-Extended | Google | Gemini training | Refusing training, with no effect on SEO |
| Googlebot | Google | SEO + AI Overviews | Disappearing from Google (never do this) |
| Applebot-Extended | Apple | Training | Refusing Apple Intelligence training |
| Bytespider | ByteDance | Training (aggressive) | Refusing training (often wanted) |
Roles are based on the publishers' public documentation; names and behaviours change — recheck before freezing your file.
So, block or not?
It's not a technical question, but an editorial one. Two consistent stances:
- You want AI visibility → let the citation bots through (OAI-SearchBot, ChatGPT-User, Perplexity*, Claude-User) and Googlebot. Blocking the training bots is still possible without losing citability.
- You refuse free training → block GPTBot, Google-Extended, ClaudeBot, Applebot-Extended, CCBot, Bytespider. It has no impact on being cited.
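The two stances can be expressed as data plus a small generator. The following is a minimal Python sketch, not a tool Snorklee publishes: `BOT_ROLES` and `build_robots_txt` are hypothetical names, the roles are copied from the table above, and CCBot is classed as a training bot only because the "refuse training" stance lists it. Googlebot is deliberately absent, so it is never blocked.

```python
# Roles copied from the article's table (assumption: CCBot counts as a
# training bot because the "refuse training" stance lists it).
BOT_ROLES = {
    "GPTBot": "training",
    "Google-Extended": "training",
    "ClaudeBot": "training",
    "Applebot-Extended": "training",
    "CCBot": "training",
    "Bytespider": "training",
    "OAI-SearchBot": "citation",
    "ChatGPT-User": "citation",
    "PerplexityBot": "citation",
    "Perplexity-User": "citation",
    "Claude-User": "citation",
}


def build_robots_txt(block_training: bool) -> str:
    """Return a robots.txt body for one of the two stances (hypothetical helper)."""
    training = [bot for bot, role in BOT_ROLES.items() if role == "training"]
    citation = [bot for bot, role in BOT_ROLES.items() if role == "citation"]

    lines = []
    if block_training:
        # One group: several User-agent lines sharing a single Disallow rule.
        lines.append("# Refuse training")
        lines += [f"User-agent: {bot}" for bot in training]
        lines += ["Disallow: /", ""]

    # Citation bots are always let through in both stances.
    lines.append("# Let citation bots through")
    lines += [f"User-agent: {bot}" for bot in citation]
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_robots_txt(block_training=True))
```

Keeping the roles in one dictionary means a renamed or newly documented bot is a one-line change, which matters given how often these names move.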
At Snorklee, we welcome all AI bots: our whole job is to measure AI visibility, not to flee it.
A robots.txt example
    # Refuse training, keep citation
    User-agent: GPTBot
    User-agent: Google-Extended
    User-agent: CCBot
    User-agent: Bytespider
    Disallow: /

    # Let real-time citation through
    User-agent: OAI-SearchBot
    User-agent: ChatGPT-User
    User-agent: PerplexityBot
    User-agent: Perplexity-User
    User-agent: Claude-User
    Allow: /
Consecutive User-agent lines followed by a rule form one group, and the rule applies to every bot in that group. Googlebot is never listed here: we don't block it.

Blocking a training bot ≠ blocking a citation bot. Before adding a line to your robots.txt, ask: "is this bot for training a model, or for citing me now?" And never forget: robots.txt is honoured by honest publishers, it's not a wall.
Check that it works
Once your robots.txt is in place, check the result: are your pages actually accessible to the right bots? Our free AI-visibility checker tests crawler access and the signals AI weighs. And if you're still unsure about llms.txt, we wrote why it barely matters.
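For a quick local check, Python's standard `urllib.robotparser` can read a robots.txt and report which user-agents it allows. The sketch below assumes the example file from this article and a placeholder URL on `example.com`; `check_access` is a hypothetical helper name. It only shows how this parser reads the rules. Each crawler uses its own parser, and a firewall or CDN can still block a bot that robots.txt allows.

```python
from urllib.robotparser import RobotFileParser

# The article's example file, inlined so the check runs offline.
ROBOTS_TXT = """\
# Refuse training, keep citation
User-agent: GPTBot
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
Disallow: /

# Let real-time citation through
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: PerplexityBot
User-agent: Perplexity-User
User-agent: Claude-User
Allow: /
"""


def check_access(robots_txt: str, bots: list[str], url: str) -> dict[str, bool]:
    """Map each bot name to True (allowed) or False (blocked) for the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in bots}


if __name__ == "__main__":
    url = "https://example.com/some-article"  # placeholder URL
    bots = ["GPTBot", "OAI-SearchBot", "Googlebot"]
    for bot, allowed in check_access(ROBOTS_TXT, bots, url).items():
        print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Googlebot comes out as allowed because no group names it and the file has no `User-agent: *` fallback, which is the behaviour the example relies on.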
Does blocking GPTBot stop me appearing in ChatGPT?
No. GPTBot is for model training. Citations in ChatGPT Search go through OAI-SearchBot and ChatGPT-User. You can block GPTBot (refuse training) and stay citable, as long as you let OAI-SearchBot through.
Does blocking Google-Extended hurt my SEO?
No. Google-Extended only controls the use of your content to train Gemini. Classic ranking and AI Overviews depend on Googlebot, which you must never block.
Does robots.txt really block AI bots?
robots.txt is declarative: serious publishers (OpenAI, Anthropic, Google, Perplexity) honour it, but it's not a technical barrier. A malicious bot can ignore it; for a hard block you need a server rule.
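To illustrate what a server rule means, here is a minimal sketch written as Python WSGI middleware rather than as a web-server or CDN rule, where such blocks usually live. `BlockAgents`, `BLOCKED_AGENTS`, `site` and `status_for` are hypothetical names, and the block list is only an example. It matches on the User-Agent header, which a client can spoof, so this raises the bar without making it a wall either.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical block list: lowercase substrings matched against User-Agent.
BLOCKED_AGENTS = ("bytespider",)


class BlockAgents:
    """WSGI middleware that answers 403 to matching user agents (illustrative)."""

    def __init__(self, app, blocked=BLOCKED_AGENTS):
        self.app = app
        self.blocked = tuple(token.lower() for token in blocked)

    def __call__(self, environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in user_agent for token in self.blocked):
            # Refuse at the server, whatever robots.txt says.
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return self.app(environ, start_response)


def site(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


def status_for(user_agent: str) -> str:
    """Call the wrapped app in-process and return the HTTP status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    seen = []
    BlockAgents(site)(environ, lambda status, headers: seen.append(status))
    return seen[0]


if __name__ == "__main__":
    print(status_for("Mozilla/5.0 (compatible; Bytespider)"))
    print(status_for("Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"))
```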
Which bots to allow for AI visibility?
The citation bots, both index and on-demand: OAI-SearchBot and ChatGPT-User (OpenAI), PerplexityBot and Perplexity-User (Perplexity), Claude-User (Anthropic), plus Googlebot. A blocked bot can't cite you.
Published June 2026. AI bot names and behaviours change fast; check the publishers' documentation before freezing a configuration. General information, not individualised advice.