Robots.txt

Also called the Robots Exclusion Protocol or Robots Exclusion Standard.

Short definition

A robots.txt file is a plain-text file at the root of a host (for example, `https://example.com/robots.txt`) that tells search engine crawlers which URL paths they may and may not crawl.

What it means

Robots.txt is the first thing a well-behaved search engine crawler checks when it visits a site. The file groups `Allow` and `Disallow` rules under `User-agent` lines to communicate crawl permissions, and each rule matches URL paths by prefix. Under `User-agent: *`, a single `Disallow: /` blocks all compliant crawlers from the entire site, while `Disallow: /admin/` blocks only the admin directory. Compliance is voluntary: crawlers that ignore robots.txt, including some AI scrapers, are not 'well-behaved' by this standard.
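To make the rule syntax concrete, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the `ExampleBot` user agent, and the `example.com` URLs are all made up for illustration, and Python's parser may not match Google's own matching behavior in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Only paths under /admin/ are blocked; everything else stays crawlable.
print(parser.can_fetch("ExampleBot", "https://example.com/admin/login"))  # False
print(parser.can_fetch("ExampleBot", "https://example.com/blog/post"))    # True

# A site-wide block under "User-agent: *" stops every compliant crawler.
block_all = build_parser("User-agent: *\nDisallow: /\n")
print(block_all.can_fetch("ExampleBot", "https://example.com/blog/post"))  # False
```

This only models what a compliant crawler would do. It says nothing about bots that never read the file.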

The most common misunderstanding about robots.txt concerns what it does and doesn't control. It governs crawling, not indexing. A page blocked by robots.txt can still appear in search results if other pages link to it, because Google can index a URL it has never crawled based on link signals alone. To prevent indexation, use a noindex directive: either a robots meta tag on the page itself or an `X-Robots-Tag` HTTP response header. The crawler only sees that directive if it can fetch the page, so the URL must not be blocked in robots.txt.
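The interaction between a crawl block and noindex can be sketched in a few lines of Python. This is an illustrative model, not how any search engine is implemented: `RobotsMetaFinder` and `noindex_is_visible` are hypothetical helper names, the HTML and URLs are invented, and only the meta tag (not the HTTP header) is checked.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag contains 'noindex'."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def noindex_is_visible(robots_txt: str, url: str, html: str,
                       user_agent: str = "Googlebot") -> bool:
    """True only if a compliant crawler may fetch the page AND it carries noindex."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        # The page is never fetched, so the noindex tag is never read.
        return False
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


PAGE = '<html><head><meta name="robots" content="noindex"></head></html>'
URL = "https://example.com/private/report"

blocked = "User-agent: *\nDisallow: /private/\n"
crawlable = "User-agent: *\nDisallow:\n"

print(noindex_is_visible(blocked, URL, PAGE))    # False: crawl block hides the tag
print(noindex_is_visible(crawlable, URL, PAGE))  # True: tag can be read
```

The first case is the trap: the page is both blocked and tagged noindex, yet the tag has no effect because it is never read.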

The most damaging robots.txt mistake is accidentally blocking the wrong things. Mis-formatting a Disallow rule, missing a trailing slash (`Disallow: /admin` also matches `/administrator`, whereas `Disallow: /admin/` does not), or deploying a staging `Disallow: /` rule to production has caused measurable traffic losses even at large sites. Robots.txt should be monitored as part of routine technical SEO, especially around deployment events.
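One way to catch these mistakes before they ship is a pre-deployment check that runs a list of must-crawl URLs against the candidate file. The sketch below assumes Python's `urllib.robotparser` as a stand-in for a real crawler; `blocked_urls`, `MUST_CRAWL`, and the sample rules are hypothetical names and data chosen for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: list[str],
                 user_agent: str = "Googlebot") -> list[str]:
    """Return the URLs a compliant crawler would be barred from fetching."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return [url for url in urls if not rules.can_fetch(user_agent, url)]


# Hypothetical pages that must always stay crawlable.
MUST_CRAWL = [
    "https://example.com/",
    "https://example.com/products/widget",
    "https://example.com/administrator-guide",
]

staging_rules = "User-agent: *\nDisallow: /\n"        # staging file leaked to production
no_slash_rules = "User-agent: *\nDisallow: /admin\n"   # missing trailing slash
slash_rules = "User-agent: *\nDisallow: /admin/\n"     # intended rule

print(len(blocked_urls(staging_rules, MUST_CRAWL)))  # 3: the whole site is blocked
print(blocked_urls(no_slash_rules, MUST_CRAWL))      # ['https://example.com/administrator-guide']
print(blocked_urls(slash_rules, MUST_CRAWL))         # []
```

A check like this could run in a deployment pipeline and fail the release whenever the returned list is non-empty.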

Key takeaways

  • Robots.txt controls crawling, not indexation — a blocked page can still be indexed via links
  • Incorrect configuration (especially Disallow: /) has caused significant traffic losses at real sites
  • Compliance with robots.txt is voluntary: Google's crawlers obey it, but it is not an access control, and badly behaved bots can ignore it
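  • Test all robots.txt changes before deploying; Google has retired its standalone robots.txt Tester, and Search Console's robots.txt report now shows how Google fetched and parsed the file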
