Scraper · Diffbot
Last verified: 2026-06-13 · maintained by Unsourced
Diffbot extracts structured data from pages for its knowledge graph, sold on to third parties. It offers no direct citation or referral benefit to the sites it crawls.
There is no published IP-range feed or documented reverse-DNS footprint for Diffbot, so its identity can't be verified by IP. Treat the User-Agent as an unverified claim.
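A User-Agent string is just a claim: anyone can send "Diffbot" in a header. In general, a crawler's identity is confirmed two ways:
forward-confirmed reverse DNS (the IP resolves to a hostname owned by the operator, and that hostname resolves back to the same IP), and, where published,
an IP inside the operator's official ranges.
For operators that support these checks, a request that fails both is an impostor wearing the badge. Diffbot supports neither, so a request claiming to be Diffbot can't be confirmed either way.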
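The two general checks described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a Diffbot-specific verifier: the function names are hypothetical, and the hostname suffix `bot.example` and range `192.0.2.0/24` in the usage note are placeholders, because no official values are published for Diffbot.

```python
import ipaddress
import socket


def _reverse_lookup(ip):
    """PTR lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(host):
    """A/AAAA lookup: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def forward_confirmed_rdns(ip, allowed_suffixes,
                           reverse_lookup=_reverse_lookup,
                           forward_lookup=_forward_lookup):
    """Return True only if ip -> hostname -> ip round-trips and the
    hostname falls under one of the operator's (assumed) domain suffixes."""
    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:  # no PTR record, or the lookup failed
        return False
    # Match whole labels so "evilbot.example" does not pass for "bot.example".
    if not any(host == s or host.endswith("." + s) for s in allowed_suffixes):
        return False
    try:
        # Plain string comparison; a stricter version could normalise
        # both sides with the ipaddress module first.
        return ip in forward_lookup(host)
    except OSError:  # hostname does not resolve
        return False


def in_published_ranges(ip, cidrs):
    """Return True if ip falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidrs)
```

For an operator that does publish its footprint, a call might look like `forward_confirmed_rdns(client_ip, ["bot.example"]) or in_published_ranges(client_ip, ["192.0.2.0/24"])`. The lookup functions are passed as parameters so the logic can be exercised without live DNS.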
Recommended: block. Diffbot performs bulk extraction and returns no citation or referral benefit, so blocking is reasonable.
If you choose to act in robots.txt (which is advisory: compliant crawlers honour it, but nothing enforces it):
User-agent: Diffbot
Disallow: /
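To sanity-check a rule like this before deploying it, Python's standard `urllib.robotparser` can evaluate it locally. The sketch below is illustrative only: the helper name `is_allowed` and the `example.com` URL are assumptions, and it shows how one parser reads the rule, not how Diffbot itself interprets it.

```python
from urllib.robotparser import RobotFileParser

# The rule from above, written one directive per line.
ROBOTS_TXT = """\
User-agent: Diffbot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if this robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Diffbot", "Googlebot"):
        verdict = is_allowed(ROBOTS_TXT, agent, "https://example.com/page")
        print(agent, "allowed" if verdict else "disallowed")
```

Pass the product token (`Diffbot`) rather than a full User-Agent header; the standard-library parser matches on the part of the string before the first `/`.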
Is Diffbot really from Diffbot?
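Diffbot is the crawler operated by Diffbot, but a User-Agent header can be spoofed, so the claim alone isn't proof. The usual confirmation is forward-confirmed reverse DNS plus, where published, a match against the operator's official IP ranges. Diffbot has no documented reverse-DNS footprint or published ranges, so the claim can't currently be confirmed.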
What are Diffbot's IP ranges?
There's no published IP-range feed or documented reverse-DNS footprint for Diffbot, so it can't be verified by IP. Treat its User-Agent as an unverified claim.
Should I block Diffbot?
Our recommendation: block. Diffbot performs bulk extraction and returns no citation or referral benefit, so blocking is reasonable.
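Because robots.txt is only advisory, a block is usually enforced at the server. The following is a minimal WSGI sketch, assuming the crawler's User-Agent header contains the token `diffbot` (inferred from the robots.txt token, not from a published header format); the names `should_block` and `block_scrapers` are hypothetical. Matching on the header alone also rejects impostors using the name, which is harmless when the goal is to block.

```python
def should_block(user_agent_header, blocked_tokens=("diffbot",)):
    """Return True if the User-Agent header contains a blocked token."""
    ua = (user_agent_header or "").lower()
    return any(token in ua for token in blocked_tokens)


def block_scrapers(app, blocked_tokens=("diffbot",)):
    """Wrap a WSGI app so matching requests receive 403 Forbidden."""
    def middleware(environ, start_response):
        if should_block(environ.get("HTTP_USER_AGENT"), blocked_tokens):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else passes through to the wrapped application.
        return app(environ, start_response)
    return middleware
```

Wrapping an existing application is then a single line, for example `application = block_scrapers(application)`.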