robots.txt Tester

Fetch a site's robots.txt and check whether a specific URL is allowed or blocked for a crawler. Leave the path blank to just view the rules.

What is robots.txt?

robots.txt is a plain-text file at the root of a site (example.com/robots.txt) that tells crawlers which parts of the site they may or may not request. Its rules are organized into groups: each group starts with one or more User-agent lines, followed by Allow and Disallow rules.
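
To make the group structure concrete, here is a minimal Python sketch that splits robots.txt text into groups. The function name parse_robots, the sample file, and the bot names in it are hypothetical, chosen only for illustration. The sketch assumes that consecutive User-agent lines share one group and that other fields (such as Sitemap) can simply be ignored, so it is a simplification rather than a complete parser.

```python
def parse_robots(text):
    """Split robots.txt text into groups.

    Each group is a dict: {"agents": [...], "rules": [(kind, pattern), ...]}
    where kind is "allow" or "disallow".
    """
    groups = []
    current = None
    last_was_agent = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            # Consecutive User-agent lines belong to the same group;
            # a User-agent line after rules starts a new group.
            if not last_was_agent:
                current = {"agents": [], "rules": []}
                groups.append(current)
            current["agents"].append(value.lower())
            last_was_agent = True
        elif field in ("allow", "disallow"):
            if current is not None:  # rules before any User-agent are skipped
                current["rules"].append((field, value))
            last_was_agent = False
        # Assumption: any other field is ignored in this sketch.
    return groups


# Hypothetical sample file, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /private/
Allow: /private/public-note.html

User-agent: examplebot
User-agent: otherbot
Disallow: /
"""

for group in parse_robots(SAMPLE):
    print(group["agents"], group["rules"])
```

The sample yields two groups: one for * with two rules, and one shared by examplebot and otherbot with a single Disallow rule.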

How matching works

A crawler uses the most specific group whose User-agent line matches its name, falling back to the * group if none does. Within that group the longest matching rule wins, and an Allow beats a Disallow of equal length. If no rule matches, the URL is allowed. Patterns may use * to match any sequence of characters and $ to anchor the end of the URL. Note that robots.txt is advisory: well-behaved crawlers obey it, but it does not enforce access control.
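
The matching rules described here can be sketched in Python as follows. This is an illustration under stated assumptions, not the code behind this tool: the function names, the group layout (a list of dicts with "agents" and "rules"), and the bot names are hypothetical; "most specific" is approximated as the longest User-agent token contained in the crawler's name; and rule length is measured as the character count of the pattern.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence, a trailing '$' anchors the end;
    every other character is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def select_group(groups, crawler):
    """Pick the group with the longest matching agent token, else the '*' group."""
    crawler = crawler.lower()
    best, best_len = None, -1
    for group in groups:
        for agent in group["agents"]:
            # Assumption: a token matches if it appears in the crawler name.
            if agent != "*" and agent in crawler and len(agent) > best_len:
                best, best_len = group, len(agent)
    if best is not None:
        return best
    for group in groups:
        if "*" in group["agents"]:
            return group
    return None


def is_allowed(groups, crawler, path):
    """Longest matching rule wins; Allow beats Disallow on a tie."""
    group = select_group(groups, crawler)
    if group is None:
        return True  # no applicable group: nothing is blocked
    verdict, best_len = True, -1  # allowed unless a rule says otherwise
    for kind, pattern in group["rules"]:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):  # match() anchors at the start
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                verdict, best_len = allow, len(pattern)
    return verdict


# Hypothetical rules, for illustration only.
groups = [
    {"agents": ["*"], "rules": [
        ("disallow", "/private/"),
        ("allow", "/private/public-note.html"),
        ("disallow", "/*.pdf$"),
    ]},
    {"agents": ["examplebot"], "rules": [("disallow", "/")]},
]

print(is_allowed(groups, "AnyBot", "/private/secret.html"))       # False
print(is_allowed(groups, "AnyBot", "/private/public-note.html"))  # True
print(is_allowed(groups, "AnyBot", "/docs/file.pdf"))             # False
print(is_allowed(groups, "ExampleBot/1.0", "/index.html"))        # False
```

In the second call the Allow rule wins because its pattern is longer than /private/, even though both match. The third call is blocked only because the path ends in .pdf; a path such as /docs/file.pdf?x=1 would not match the $-anchored pattern.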