AI Trainer · OpenAI

GPTBot: IP ranges, verification & how to handle it

Last verified: 2026-07-28 · maintained by Unsourced

GPTBot is OpenAI's crawler for gathering training data used to improve future GPT models. It honours robots.txt. Blocking it can keep your content out of what newer models learn — which may reduce how often they reference you.

What GPTBot does, and how it differs

GPTBot's fetches feed the pre-training corpus behind OpenAI's next-generation models, so the content it collects shapes what those models know rather than what they quote back today. It crawls broadly on its own schedule, declares itself plainly in the User-Agent, and obeys a GPTBot disallow line in robots.txt. Allowing it is a bet on future recall: pages it never reads simply aren't part of what later GPT models can draw on.

How to verify GPTBot

GPTBot can be checked two independent ways, and a genuine request passes both. Its source IP has to fall inside one of the 21 ranges OpenAI publishes, and it has to pass forward-confirmed reverse DNS: the IP's reverse (PTR) lookup returns a hostname under openai.com, and that hostname resolves back to the same IP. Anything carrying the GPTBot user-agent that fails either test is an impostor, whatever the header says.
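The two checks described above can be sketched in Python. This is a minimal illustration, not OpenAI's reference method: the function names (`ip_in_ranges`, `forward_confirmed`, `is_genuine_gptbot`) are hypothetical, the CIDR list is assumed to have been loaded already from the published IP feed, and the `openai.com` suffix is taken from the description above. The DNS resolvers are passed in as parameters so the logic can be exercised without network access.

```python
import ipaddress
import socket


def ip_in_ranges(ip, cidrs):
    """Return True if ip falls inside any CIDR block in cidrs."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(c, strict=False) for c in cidrs)


def _reverse_lookup(ip):
    # PTR lookup: IP address -> primary hostname.
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host):
    # A/AAAA lookup: hostname -> list of IP address strings.
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def forward_confirmed(ip, suffix="openai.com",
                      reverse_lookup=_reverse_lookup,
                      forward_lookup=_forward_lookup):
    """Forward-confirmed reverse DNS: IP -> hostname -> same IP.

    The suffix default is an assumption taken from the text above.
    """
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record, or the lookup failed
    # Require the exact domain or a true subdomain ("evil-openai.com" fails).
    if host != suffix and not host.endswith("." + suffix):
        return False
    try:
        addresses = forward_lookup(host)
    except OSError:
        return False
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(a) == target for a in addresses)


def is_genuine_gptbot(ip, cidrs, **resolvers):
    """A request counts as genuine only if it passes both checks."""
    return ip_in_ranges(ip, cidrs) and forward_confirmed(ip, **resolvers)
```

In use, `cidrs` would be refreshed periodically from the published feed, and `is_genuine_gptbot(remote_ip, cidrs)` would be called only for requests whose User-Agent claims to be GPTBot. Caching the DNS results is worth considering, since two lookups per request add latency.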

Should you allow or block GPTBot?

Recommended: allow. GPTBot's reach is into future models, not today's traffic. Allow it and your pages can shape what the next GPT generation knows; disallow it and you opt out of that recall, with no effect on live search.

If you do choose to act in robots.txt (which is advisory: compliant crawlers honour it, but nothing enforces it):

# GPTBot: recommended to ALLOW; blocking can cost you AI visibility
# An empty Disallow value permits everything. To block, use "Disallow: /".
User-agent: GPTBot
Disallow:
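To sanity-check what a rule like this does before deploying it, the group can be fed to Python's standard-library `urllib.robotparser`. The sketch below is only illustrative: the helper name `gptbot_allowed` and the example URL are made up, and Python's parser is one interpretation of robots.txt, not necessarily identical to how GPTBot itself parses the file.

```python
from urllib.robotparser import RobotFileParser

# The rule shown above, and its blocking counterpart.
ALLOW_RULE = "User-agent: GPTBot\nDisallow:\n"
BLOCK_RULE = "User-agent: GPTBot\nDisallow: /\n"


def gptbot_allowed(robots_txt, url="https://example.com/any-page"):
    """Return True if the given robots.txt text lets GPTBot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", url)


if __name__ == "__main__":
    print("empty Disallow allows GPTBot:", gptbot_allowed(ALLOW_RULE))
    print("'Disallow: /' allows GPTBot:", gptbot_allowed(BLOCK_RULE))
```

The difference between the two rules is a single slash, so a quick check like this can catch an accidental block.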

Official sources

Published IP feed: https://openai.com/gptbot.json
OpenAI crawler info: https://platform.openai.com/docs/bots

Common questions about GPTBot

Will blocking GPTBot remove me from ChatGPT's answers?

No. ChatGPT's live answers come from its search crawlers, not GPTBot. Blocking GPTBot only keeps your pages out of the training data behind future OpenAI models, which over time can weaken how well those models recall you.

Does GPTBot respect robots.txt?

Yes. A Disallow rule under User-agent: GPTBot is honoured, so a single robots.txt rule opts you out cleanly, with no firewall required. To confirm a request really is GPTBot, match its IP against OpenAI's published ranges and check that its reverse DNS forward-confirms to a hostname under openai.com.

Is GPTBot the same as OAI-SearchBot?

No. GPTBot collects training data; OAI-SearchBot indexes pages for ChatGPT Search and can cite you with a link. You can allow OAI-SearchBot while disallowing GPTBot to stay citable without feeding training.
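The split policy described above (OAI-SearchBot allowed, GPTBot disallowed) can be generated and checked with a short Python sketch. The helpers `build_robots_txt` and `allowed_agents` are hypothetical names introduced for illustration, and the check relies on the standard library's `urllib.robotparser`, whose matching may differ in detail from the crawlers' own parsers.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(policy):
    """Render one robots.txt group per crawler.

    policy maps a crawler token to True (allow all) or False (block all).
    """
    groups = []
    for agent, allowed in policy.items():
        rule = "Disallow:" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}\n")
    # A blank line separates groups.
    return "\n".join(groups)


def allowed_agents(robots_txt, agents, url="https://example.com/article"):
    """Report, per crawler token, whether url may be fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    text = build_robots_txt({"OAI-SearchBot": True, "GPTBot": False})
    print(text)
    print(allowed_agents(text, ["OAI-SearchBot", "GPTBot"]))
```

Keeping each crawler in its own group makes the intent explicit and avoids relying on how a wildcard group would interact with the named ones.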
