AI Trainer · OpenAI
Last verified: 2026-06-13 · maintained by Unsourced
GPTBot is OpenAI's crawler for gathering training data used to improve future GPT models. It honours robots.txt. Blocking it can keep your content out of what newer models learn — which may reduce how often they reference you.
OpenAI publishes the IP ranges GPTBot crawls from in a live feed. A request claiming to be GPTBot from an IP outside these ranges is suspect.
A User-Agent string is just a claim — anyone can send GPTBot in a header. Confirm identity two ways:
forward-confirmed reverse DNS (the IP resolves to OpenAI, and that host resolves back to the IP), and, where published,
an IP inside the operator's official ranges. If neither holds, it's an impostor wearing the badge — not GPTBot.
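The two checks above can be sketched in Python using only the standard library. This is a minimal illustration, not OpenAI's or Unsourced's method: the CIDR range and hostname suffix below are placeholders (documentation-reserved values), and the function names are hypothetical. Substitute the ranges from the operator's published feed and whatever hostname suffix you have confirmed for the operator.

```python
import ipaddress
import socket
from typing import Callable, Iterable, List

# Placeholder values for illustration only; NOT real GPTBot data.
EXAMPLE_RANGES = ["192.0.2.0/24"]
EXAMPLE_SUFFIX = ".crawler.example"


def ip_in_ranges(ip: str, cidrs: Iterable[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address to hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """Forward lookup: hostname to every address it resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def fcrdns_ok(
    ip: str,
    suffix: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS: the IP's hostname must end with the
    expected suffix, and that hostname must resolve back to the same IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(suffix):
            return False
        target = ipaddress.ip_address(ip)
        # Drop any IPv6 scope id ("%eth0") before comparing addresses.
        return any(
            ipaddress.ip_address(a.split("%")[0]) == target
            for a in forward(host)
        )
    except (OSError, ValueError):
        # DNS failure or unparsable address: treat as unconfirmed.
        return False


def is_impostor(ip: str, cidrs: Iterable[str], suffix: str, **resolvers) -> bool:
    """Per the rule above: an impostor is a request where neither check holds."""
    return not (fcrdns_ok(ip, suffix, **resolvers) or ip_in_ranges(ip, cidrs))
```

In practice you would call `is_impostor(request_ip, ranges_from_feed, confirmed_suffix)` only for requests whose User-Agent claims to be GPTBot. The `reverse` and `forward` parameters exist so the DNS lookups can be swapped out in tests.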
Recommended: keep. Feeds model training; blocking it can quietly remove you from future AI answers.
If you do choose to act in robots.txt (which well-behaved crawlers honour, but which nothing enforces):
# GPTBot: recommended to ALLOW; blocking can cost you AI visibility
User-agent: GPTBot
Disallow:
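An empty Disallow line means "nothing is disallowed", which is easy to misread as a block. As a rough check, the sketch below feeds the allow rules, and a blocking variant for comparison, to Python's standard `urllib.robotparser`. The `is_allowed` helper and the example.com URL are illustrative assumptions, and other parsers may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The allow rules from this page: an empty Disallow permits everything.
ALLOW_RULES = """\
# GPTBot: recommended to ALLOW
User-agent: GPTBot
Disallow:
"""

# A blocking variant, shown only for comparison.
BLOCK_RULES = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and ask whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # hypothetical URL
    print("allow rules:", is_allowed(ALLOW_RULES, "GPTBot", page))
    print("block rules:", is_allowed(BLOCK_RULES, "GPTBot", page))
```

This only shows how a compliant parser interprets the file. A crawler that ignores robots.txt, or an impostor using the GPTBot name, is unaffected by either variant.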
Is GPTBot really from OpenAI?
GPTBot is OpenAI's crawler, but a User-Agent header can be spoofed, so the claim alone isn't proof. Confirm it with forward-confirmed reverse DNS and, where published, a match against OpenAI's official IP ranges.
What are GPTBot's IP ranges?
OpenAI publishes the IP ranges GPTBot crawls from in a live feed. This page lists the current 21 ranges. A request claiming to be GPTBot from an IP outside those ranges is suspect.
Should I block GPTBot?
Our recommendation: keep. Feeds model training; blocking it can quietly remove you from future AI answers.
Unsourced verifies every AI crawler against published ranges and reverse DNS, and shows which AI assistants cite you.