AI Website Scan

See how a site treats AI crawlers like GPTBot, ClaudeBot and PerplexityBot. We read its robots.txt and llms.txt and show you exactly which AI engines can — and can't — reach it.

Why AI crawler access matters

Generative AI engines such as ChatGPT, Claude, Perplexity and Google's AI Overviews rely on crawlers to read the web, and most of them use a dedicated bot. When a user asks one of them a question, your pages can only be quoted or cited if the relevant bot was allowed to read them. Blocking these bots can keep your content out of training sets, but it also keeps you out of the answers that more and more people start their research with.

How the scan works

We fetch your site's robots.txt and resolve the effective rule for every well-known AI crawler: a group that names the bot explicitly takes precedence over the site-wide * group, as RFC 9309 specifies. We also look for an llms.txt, the emerging convention for handing AI models a curated map of the content you most want them to read. Finally, we request the homepage with an Accept: text/markdown header to see whether the site serves AI agents a clean markdown rendering instead of full HTML.
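
The rule resolution described above can be sketched roughly as follows. This is a simplified illustration, not the scanner's actual code: the function names (parse_groups, is_allowed, effective_access) and the short AI_BOTS list are hypothetical, user-agent tokens are compared by exact case-insensitive match, and the * and $ path wildcards from RFC 9309 are deliberately left out.

```python
from typing import Dict, List, Tuple

# Hypothetical, illustrative list of AI crawler tokens (not exhaustive).
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

Rules = List[Tuple[str, str]]  # (directive, path prefix)


def parse_groups(robots_txt: str) -> Dict[str, Rules]:
    """Map each lowercased user-agent token to its allow/disallow rules."""
    groups: Dict[str, Rules] = {}
    agents: List[str] = []  # user-agents of the group currently being read
    in_rules = False        # True once the current group has seen a rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                groups[agent].append((field, value))
    return groups


def is_allowed(rules: Rules, path: str = "/") -> bool:
    """Longest matching prefix wins; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # an empty pattern matches nothing
        n = len(prefix)
        if n > best_len or (n == best_len and directive == "allow"):
            best_len, allowed = n, directive == "allow"
    return allowed


def effective_access(robots_txt: str, bots=AI_BOTS, path: str = "/") -> Dict[str, bool]:
    """Use a bot's own group if one exists, otherwise fall back to '*'."""
    groups = parse_groups(robots_txt)
    return {
        bot: is_allowed(groups.get(bot.lower(), groups.get("*", [])), path)
        for bot in bots
    }
```

With a robots.txt that disallows everything for *, allows / for GPTBot, and disallows only /private/ for ClaudeBot, this sketch reports GPTBot and ClaudeBot as allowed on the homepage and PerplexityBot as blocked, because PerplexityBot has no group of its own and inherits the site-wide rule.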