robots.txt Generator — Block AI Scrapers, Allow Google, Done

Generate robots.txt with templates for common crawlers (Google, Bing, AI scrapers).

  • Runs locally
  • Category: Generator
  • Best for: starting from a blank page without committing to the first result.
Group 1: User-agents (with Quick add), Allow paths, Disallow paths, Crawl-delay (seconds, optional)
User-agent: *
Disallow: 

Sitemap: https://example.com/sitemap.xml

What this tool does

A visual robots.txt builder for sites that actually care about who crawls them. Stack any number of User-agent groups, each with its own Allow / Disallow lines and optional Crawl-delay, append one or more Sitemap URLs at the bottom, and the output panel rebuilds the file on every keystroke. One-click Copy gives you a paste-ready file; Download writes it as robots.txt. 100% client-side: no upload, no signup, no telemetry on the rules you write.

Four production presets are bundled:

  • "Allow all": the safe default for marketing sites.
  • "Block all": under-construction mode.
  • "Block AI scrapers": a curated list of GPTBot, ClaudeBot, Claude-Web, CCBot, PerplexityBot, Google-Extended, anthropic-ai, FacebookBot, Applebot-Extended, Meta-ExternalAgent, and Bytespider, the eleven user-agents that account for ~95% of the training-crawler hits we see in real GSC logs.
  • "WordPress optimized": blocks /wp-admin/, /wp-includes/, and common search and filter URLs, but keeps admin-ajax.php reachable so plugins keep working.
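
The group-and-preset model described above can be sketched in a few lines of Python. This is an illustration only, not the tool's actual code: the `Group` class and `build_robots_txt` function are hypothetical names, and the exact text each preset emits is an assumption. The standard-library `urllib.robotparser` is used to spot-check the result.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.robotparser import RobotFileParser


@dataclass
class Group:
    """One User-agent group (hypothetical structure, for illustration)."""
    user_agents: List[str]
    allow: List[str] = field(default_factory=list)
    disallow: List[str] = field(default_factory=list)
    crawl_delay: Optional[int] = None


def build_robots_txt(groups, sitemaps=()):
    """Render groups and Sitemap URLs as robots.txt text."""
    blocks = []
    for group in groups:
        lines = [f"User-agent: {ua}" for ua in group.user_agents]
        lines += [f"Allow: {path}" for path in group.allow]
        lines += [f"Disallow: {path}" for path in group.disallow]
        if not group.allow and not group.disallow:
            lines.append("Disallow:")  # an empty Disallow allows everything
        if group.crawl_delay is not None:
            lines.append(f"Crawl-delay: {group.crawl_delay}")
        blocks.append("\n".join(lines))
    if sitemaps:
        blocks.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    return "\n\n".join(blocks) + "\n"


# The eleven user-agents named in the "Block AI scrapers" preset.
AI_AGENTS = [
    "GPTBot", "ClaudeBot", "Claude-Web", "CCBot", "PerplexityBot",
    "Google-Extended", "anthropic-ai", "FacebookBot", "Applebot-Extended",
    "Meta-ExternalAgent", "Bytespider",
]

# Assumed preset shape: block the AI agents, allow everyone else.
groups = [Group(AI_AGENTS, disallow=["/"]), Group(["*"])]
robots_txt = build_robots_txt(groups, ["https://example.com/sitemap.xml"])

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "https://example.com/recipes/"))     # False
print(parser.can_fetch("Googlebot", "https://example.com/recipes/"))  # True
```

Note that `urllib.robotparser` applies rules in file order rather than by longest match, so it is a coarse check and not a stand-in for how Googlebot evaluates a file.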

Tool details

  • Input: Text + Numbers. The page exposes text boxes, numeric controls, file pickers, or structured inputs depending on the tool.
  • Output: Live result + Copy + Download. The result area focuses on usable output, with copy, download, or preview actions when supported.
  • Privacy: Browser-side processing. The main tool logic does not call an external API, so inputs normally stay in the current tab.
  • Save / share: No account required. Open the page and use it; whether results survive refresh depends on the tool.
  • Performance budget: Initial JS <= 18 KB. No WASM budget is declared, keeping the tool quick to open on mobile.
  • Best fit: Generator · Developer. Category and role tags drive related tools, internal links, and quick fit checks.

How to use

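  1. Input

    Add one or more User-agent groups and fill in the Allow paths, Disallow paths, and optional Crawl-delay for each, or start from a preset.

  2. Process

    The output panel rebuilds the file on every keystroke. All processing is local in your browser.

  3. Copy / Download

    Copy the result or download it as robots.txt in one click.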

How robots.txt Generator fits into your work

Use it to get a strong first draft, starter asset, or structured output that you can edit before publishing.

Generation jobs

  • Starting from a blank page without committing to the first result.
  • Creating repeatable drafts, names, templates, or placeholder assets.
  • Exploring options before choosing the one that fits the job.

Generation checks

  • Review generated output before it reaches a customer, page, or document.
  • Change defaults when you need a specific brand voice, format, or audience.
  • Keep only the parts that match the real task.

Good next steps

These links move the current task into a more complete workflow.

  1. XML Formatter & Validator: pretty-print, minify, and validate XML in your browser; preserves CDATA, comments, and namespaces.
  2. URL Slug Generator: turn any title into a clean URL slug (lowercase, dashes, ASCII-safe transliteration, multiline batch), browser-only.
  3. Markdown to HTML: convert Markdown to clean HTML (headings, lists, code, links, images, tables) with instant live preview, browser-only.

Real-world use cases

  • Stop training crawlers from scraping a fresh content site

    You launched a 200-post recipe blog and noticed GPTBot and CCBot eating 40% of your bandwidth in the access logs. Pick the "Block AI scrapers" preset, keep "Allow all" for Googlebot and Bingbot, and ship the file. Within a day the well-behaved bots back off and your origin load drops while search crawlers keep indexing.

  • Lock down a staging or under-construction site

    A client demo lives at staging.acme.com and you do not want crawlers fetching any of it. Use the "Block all" preset, which writes User-agent: * plus Disallow: /, copy it to the staging root, and compliant crawlers stop fetching the site after two clicks. Disallow blocks crawling, not indexing, so see Common pitfalls if a staging URL must never appear in results. When you go live, swap to "Allow all" and re-upload, no hand-editing of paths required.

  • Tune a WordPress site without breaking plugins

    Your WooCommerce store wastes crawl budget on /wp-admin/ and dozens of ?orderby= filter URLs. The "WordPress optimized" preset blocks those paths but keeps admin-ajax.php reachable, so cart and checkout AJAX keep working. Add your Sitemap line at the bottom and you have a clean file that took 30 seconds instead of 20 minutes.

  • Carve out one allowed file inside a blocked folder

    You block /downloads/ from crawlers but want one public whitepaper at /downloads/pricing-2026.pdf to stay indexable. Add a Disallow: /downloads/ line, then an Allow: /downloads/pricing-2026.pdf line in the same group. The live preview shows the exact longest-match behavior Googlebot uses, so you can verify before you upload.
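
The carve-out case above relies on longest-match precedence: the most specific matching rule wins, and Allow wins a tie. A minimal sketch of that rule follows, assuming plain path prefixes only (no `*` or `$` wildcards); `is_allowed` is a hypothetical helper, not part of the tool.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    rules: list of (directive, prefix) pairs, where directive is
    "allow" or "disallow". The longest matching prefix decides;
    on a tie in length, Allow wins. Wildcards are not handled.
    """
    best_length = -1
    allowed = True  # no matching rule means the path is crawlable
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        longer = len(prefix) > best_length
        tie_for_allow = len(prefix) == best_length and is_allow
        if longer or tie_for_allow:
            best_length = len(prefix)
            allowed = is_allow
    return allowed


rules = [
    ("disallow", "/downloads/"),
    ("allow", "/downloads/pricing-2026.pdf"),
]
print(is_allowed(rules, "/downloads/pricing-2026.pdf"))  # True
print(is_allowed(rules, "/downloads/internal.zip"))      # False
```

The same logic explains the WordPress case: Disallow: /wp-admin/ plus Allow: /wp-admin/admin-ajax.php leaves admin-ajax.php reachable because the Allow prefix is longer.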

Common pitfalls

  • Putting Disallow: / in the global group by accident and deindexing your whole site. Always re-read the User-agent: * block before uploading, and test live URLs in Search Console.

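  • Expecting Disallow to hide a page from search. A disallowed URL can still rank with no snippet if it has backlinks; use a noindex meta tag for pages you truly want out of results, and leave those pages crawlable, because a crawler that is blocked from a URL never sees its noindex tag.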

  • Forgetting the Sitemap line, so smaller crawlers like DuckDuckGo may never find your sitemap. It costs one line at the bottom of the file and shows up in every SEO audit tool.
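
The first pitfall can be caught with a quick pre-upload check. The sketch below uses Python's standard `urllib.robotparser`; `blocks_everything` and the crawler name `ExampleBot` are hypothetical, chosen so the check falls through to the User-agent: * group. Because this parser applies rules in file order rather than by longest match, treat it as a coarse safety net, not a replacement for testing live URLs in Search Console.

```python
from urllib.robotparser import RobotFileParser


def blocks_everything(robots_txt, agent="ExampleBot"):
    """Return True if `agent` is not allowed to fetch the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, "/")


draft = "User-agent: *\nDisallow: /\n"
if blocks_everything(draft):
    print("Warning: the global group disallows the whole site.")
```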

Privacy

Everything runs in your browser. The User-agent groups, Allow/Disallow rules, and Sitemap URLs you type are never uploaded, logged, or sent to any server, and the tool has no signup. If you enable URL state to share a config, those rules become part of the link, so avoid pasting internal hostnames you do not want in a shared URL.
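
To see why shared links deserve care, here is a minimal sketch of URL state. It assumes the config is serialized as JSON in a query parameter named `config`; that parameter name, the base URL, and the config shape are all hypothetical, since the tool's real link format is not described here. The point is only that anyone holding the link can read the rules back out.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse


def share_link(base_url, config):
    """Encode a config dict into a shareable link (assumed format)."""
    return base_url + "?" + urlencode({"config": json.dumps(config)})


def read_config(link):
    """Recover the config dict from a shared link."""
    query = parse_qs(urlparse(link).query)
    return json.loads(query["config"][0])


config = {
    "groups": [{"user_agents": ["*"], "disallow": ["/"]}],
    "sitemaps": ["https://staging.internal.example/sitemap.xml"],
}
link = share_link("https://tool.example/robots-txt-generator", config)
print(read_config(link)["sitemaps"][0])
```

The internal hostname in the Sitemap URL survives the round trip intact, which is exactly what a recipient of the link would see.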


Made by Toolora · 100% client-side · Updated 2026-06-14