
What AI Crawlers Are

The bots that decide if AI can read your site

What is it?

Think of AI crawlers as scouts. Before ChatGPT or Claude can recommend your business, they need to know you exist. That's where crawlers come in: automated bots that visit your website, read your content, and report back to the AI system.

The main players you should know about are GPTBot (OpenAI's crawler for ChatGPT) and ClaudeBot (Anthropic's crawler for Claude). There are others, but these two are the big ones driving AI recommendations right now.

Here's the catch: if you accidentally block these crawlers, AI systems literally can't see your website. It's like locking your front door and wondering why customers aren't coming in.

Why it matters for your business

We've seen businesses with great websites get zero AI visibility because someone checked a box in their site settings that said "block all bots." They were trying to prevent spam bots, but they also locked out the AI crawlers that would have put them on the map.

Real example: A marketing agency had amazing case studies and testimonials, but their AI citation rate was 0%. Not low - literally zero.

We checked their robots.txt file and found they were blocking all crawlers except Google. A one-line change to allow GPTBot and ClaudeBot, and within two weeks they started appearing in AI responses. Their citation rate jumped to 28%.

The flip side is also true: if you specifically allow these crawlers, you're giving AI permission to learn about your business and recommend you. It's a simple yes/no decision that has huge impact.

The technical details (for the curious)

Your website has a file called robots.txt that lives at the root of your domain (like yoursite.com/robots.txt). This file tells web crawlers what they're allowed to access.

Here's what a typical robots.txt file looks like:

User-agent: *
Disallow: /admin/
Disallow: /private/

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

Let's break that down:

  • User-agent: * means "for all bots"
  • Disallow: /admin/ asks bots to stay out of your admin pages (note: robots.txt is a polite request that well-behaved crawlers honor, not a security control)
  • User-agent: GPTBot specifically addresses OpenAI's crawler
  • Allow: / gives permission to crawl your whole site
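You can verify that breakdown yourself: Python's standard library ships a robots.txt parser, so the following sketch feeds it the example file above and asks what each bot may fetch (the example.com URLs are placeholders):

```python
from urllib import robotparser

# The same example robots.txt shown above
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /private/

User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# GPTBot has its own group, so it may crawl everything:
print(rp.can_fetch("GPTBot", "https://example.com/some-page"))    # True
# Bots without their own group fall back to the * rules,
# which block /admin/:
print(rp.can_fetch("SomeOtherBot", "https://example.com/admin/")) # False
```

Note that a bot follows only the most specific group that matches it, which is why GPTBot's `Allow: /` wins even though the `*` group has Disallow lines.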

If you want to block AI crawlers (maybe you have proprietary content), you'd add:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

But for most businesses trying to get discovered, you want to allow these crawlers, not block them.

Quick check

You can view your robots.txt file right now by going to yourdomain.com/robots.txt in your browser. If you see GPTBot or ClaudeBot listed with "Disallow: /", you're blocking AI. If they aren't mentioned at all, they follow whatever rules your "User-agent: *" section sets; as long as that section doesn't block everything, AI crawlers can read your site (which is good).
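If you'd rather script that check than eyeball the file, here's a small sketch; the `ai_crawlers_allowed` helper is our own illustrative name, not a standard tool:

```python
from urllib import robotparser

AI_BOTS = ("GPTBot", "ClaudeBot")

def ai_crawlers_allowed(robots_text: str,
                        url: str = "https://example.com/") -> dict:
    """Return {bot: True/False} for whether each AI crawler may fetch url."""
    rp = robotparser.RobotFileParser()
    rp.parse(robots_text.splitlines())
    return {bot: rp.can_fetch(bot, url) for bot in AI_BOTS}

# Example: a robots.txt that blocks everyone except Google,
# like the marketing agency's file described earlier
blocked = ai_crawlers_allowed(
    "User-agent: *\nDisallow: /\n\nUser-agent: Googlebot\nAllow: /\n"
)
print(blocked)  # {'GPTBot': False, 'ClaudeBot': False}
```

To run it against a live site, `RobotFileParser` can also fetch the file itself: call `rp.set_url("https://yourdomain.com/robots.txt")` followed by `rp.read()` instead of `rp.parse(...)`.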

Useful links

See how your business performs on this metric.

Check Your Visibility