Robots.txt Generator
A robots.txt file controls which pages on a site web crawlers may access. Rules are grouped by User-agent (* matches all bots; a specific name such as Googlebot matches only that crawler). Disallow blocks a path or file (Disallow: /private/, Disallow: /file.html), and Allow permits one (Allow: /public/). A Sitemap line such as Sitemap: https://example.com/sitemap.xml points crawlers to your XML sitemap. The file must live at the site root (example.com/robots.txt). Two wildcards are supported: * matches any sequence of characters and $ marks the end of a URL. Test your rules with the robots.txt testing tool in Google Search Console. Remember that robots.txt does not guarantee privacy; use authentication for sensitive content. It is still essential for SEO, keeping crawlers focused on your important pages and away from duplicate content.
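For example, the directives above combine into a complete file:

User-agent: *
Disallow: /admin/
Disallow: /cgi-bin/
Allow: /public/
Sitemap: https://example.com/sitemap.xml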
Generate robots.txt files visually. Add user-agent rules for Googlebot, Bingbot, AI crawlers, and custom bots. Set allow/disallow paths, crawl-delay, and sitemap URL. Includes presets for allow all, block all, block AI crawlers, and standard websites. Live preview with validation warnings.
Quick Presets
User-Agent Rules
Sitemap URL
Generated robots.txt
User-agent: *
Disallow:
Common Bot Reference
How to Use
- Pick a quick preset or add User-agent rules for the bots you want to target
- Set allow/disallow paths, an optional crawl-delay, and your sitemap URL
- Copy the generated robots.txt from the live preview and upload it to your site root
Frequently Asked Questions
- What is a robots.txt file?
- A robots.txt file is a plain text file placed at the root of a website (e.g., example.com/robots.txt) that tells search engine crawlers and other bots which pages or sections they are allowed or not allowed to access. It follows the Robots Exclusion Protocol standard. While well-behaved bots respect robots.txt, it is not a security mechanism: it is a suggestion, not enforcement.
- How do I block AI crawlers like GPTBot and CCBot?
- Add specific User-agent rules with "Disallow: /" for each AI crawler. Common AI bots to block: GPTBot (OpenAI), ChatGPT-User, Google-Extended (Google AI training), CCBot (Common Crawl), anthropic-ai, ClaudeBot (Anthropic), and Bytespider (ByteDance). Use the "Block AI Crawlers" preset in our generator for a quick setup.
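A block list covering the crawlers above looks like this (extend it as new bots appear):

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Bytespider
Disallow: /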
- What is the difference between Allow and Disallow in robots.txt?
- Disallow tells bots not to crawl a specific path. Allow overrides a Disallow for a more specific path. For example, "Disallow: /private/" blocks everything under /private/, but adding "Allow: /private/public-page" lets bots access that one page. An empty Disallow (Disallow:) means everything is allowed.
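For example:

User-agent: *
Disallow: /private/
Allow: /private/public-page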
- What does Crawl-delay do in robots.txt?
- Crawl-delay tells bots to wait a specified number of seconds between requests. For example, "Crawl-delay: 10" asks bots to wait 10 seconds between page fetches. This helps reduce server load. Note: Googlebot ignores Crawl-delay (use Google Search Console instead), but Bingbot and others respect it.
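For example, to ask Bingbot to wait 10 seconds between fetches:

User-agent: Bingbot
Crawl-delay: 10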
- Where do I put the robots.txt file?
- The robots.txt file must be placed at the root of your domain, accessible at https://yourdomain.com/robots.txt. It only applies to that specific domain and protocol — a robots.txt on example.com does not apply to subdomain.example.com. Each subdomain needs its own robots.txt file.
- Does robots.txt affect SEO?
- Yes, robots.txt directly impacts SEO. Blocking important pages prevents search engines from crawling and indexing them, making them invisible in search results. However, blocking low-value pages (admin panels, duplicate content, internal search results) can help search engines focus their crawl budget on your important content. Always ensure your key pages are accessible.
- Can I use robots.txt to remove pages from Google?
- Not directly. Blocking a page in robots.txt prevents crawling but does not remove it from search results if Google already knows about it. Google may still show the URL (without a snippet) in results. To remove a page, use the "noindex" meta tag or X-Robots-Tag header, then allow crawling so Google can see the noindex directive.
- What is the Sitemap directive in robots.txt?
- The Sitemap directive tells search engines where to find your XML sitemap. Add "Sitemap: https://example.com/sitemap.xml" at the end of your robots.txt. You can list multiple sitemaps. This helps search engines discover all your pages efficiently, especially for large sites.
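For example, at the end of your robots.txt (the second URL is only a placeholder for an additional sitemap):

Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap-images.xml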