NeuralCrawl

🇫🇷 Oncrawl

oncrawl.com · SEO & AI search · rank #34 · SEO crawler

AI crawler access (latest snapshot, 4h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot
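
The "inherited from the * wildcard group" note in the legend can be illustrated with a short Python sketch. It is not NeuralCrawl's implementation; `parse_groups` and `resolve_group` are hypothetical names, the sample text is a shortened excerpt in the style of the file shown on this page, and exact case-insensitive matching of the user-agent token is an assumption (some crawlers match more loosely).

```python
# Shortened excerpt for illustration only; the full file has more rules.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /search/
Disallow: /*hubs*
Allow: /tag/content-hubs/

User-agent: *
Disallow: /search/
Disallow: /?author=*

Sitemap: https://www.oncrawl.com/sitemap_index.xml
"""


def parse_groups(robots_txt):
    """Map each lowercased user-agent token to its list of (directive, path) rules."""
    groups = {}
    current_agents = []
    seen_rule = False
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a User-agent line after rules starts a new group
                current_agents = []
                seen_rule = False
            current_agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            seen_rule = True
            for agent in current_agents:
                groups[agent].append((field, value))
        # Other fields (e.g. Sitemap) are not part of any group and are skipped.
    return groups


def resolve_group(groups, agent):
    """Return (group_name, inherited); inherited is True when falling back to '*'."""
    token = agent.lower()
    if token in groups:
        return token, False
    return "*", True


groups = parse_groups(ROBOTS_TXT)
for bot in ("Googlebot", "GPTBot", "ClaudeBot"):
    name, inherited = resolve_group(groups, bot)
    print(f"{bot}: group {name!r}, inherited={inherited}")
```

With this excerpt, only Googlebot resolves to its own group; any agent without a named group falls back to the `*` rules.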

Current robots.txt · 744 bytes · sha256 2ee140d5bb36

User-agent: Googlebot
Disallow: /ads.txt
Disallow: /wpcb/wp-admin/
Disallow: /search/
Disallow: /*&sa=*
Disallow: /&sa=*
Disallow: /?s=*
Disallow: /.well-known*
Disallow: /wpcb/wp-json/contact-form*
Disallow: /wpcb/wp-includes/wlwmanifest.xml
Disallow: /web/app/modules/*
Disallow: /?author=*
Disallow: /*feed/
Disallow: /*hubs*
Allow: /tag/content-hubs/

User-agent: *
Disallow: /ads.txt
Disallow: /wpcb/wp-admin/
Disallow: /search/
Disallow: /*&sa=*
Disallow: /&sa=*
Disallow: /?s=*
Disallow: /.well-known*
Disallow: /wpcb/wp-json/contact-form*
Disallow: /wpcb/wp-includes/wlwmanifest.xml
Disallow: /web/app/modules/*
Disallow: /?author=*


Sitemap: https://www.oncrawl.com/sitemap_index.xml
Sitemap: https://fr.oncrawl.com/sitemap_index.xml
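
The Googlebot group above pairs `Disallow: /*hubs*` with `Allow: /tag/content-hubs/`. A minimal Python sketch of how such rules are commonly resolved follows, assuming the longest-match precedence described in RFC 9309 (the most specific pattern wins, and Allow wins a tie). The helper names `pattern_to_regex` and `is_allowed` and the test paths are hypothetical, and individual crawlers may deviate from this behavior.

```python
import re

# Three rules copied from the Googlebot group shown above.
GOOGLEBOT_RULES = [
    ("disallow", "/*feed/"),
    ("disallow", "/*hubs*"),
    ("allow", "/tag/content-hubs/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' matches any run of characters,
    and a trailing '$' anchors the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty Disallow value matches nothing
        if pattern_to_regex(pattern).match(path):  # match() anchors at the start
            length = len(pattern)
            if length > best_len or (length == best_len and directive == "allow"):
                best_len = length
                allowed = directive == "allow"
    return allowed


for path in ("/tag/content-hubs/", "/hubs/seo/", "/blog/feed/", "/blog/"):
    print(path, "allowed" if is_allowed(GOOGLEBOT_RULES, path) else "blocked")
```

Under these assumptions, `/tag/content-hubs/` stays crawlable because the Allow pattern is longer than `/*hubs*`, while other paths containing `hubs` remain blocked.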

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived