NeuralCrawl

🇺🇸 Similarweb

similarweb.com · SEO & AI search · rank #3 · Web intelligence

AI crawler access (latest snapshot, 4h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot
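
The "inherited from the * wildcard group" state in the legend can be sketched as a group lookup: a crawler with no User-agent group of its own falls back to the * group. This is a minimal sketch, not NeuralCrawl's actual logic. The names `parse_groups`, `resolve_group`, and `SAMPLE` are hypothetical, and exact case-insensitive token matching is a simplifying assumption.

```python
# Hypothetical excerpt in the shape of the robots.txt shown on this page.
SAMPLE = """\
User-agent: *
Allow: /corp/search*
Disallow: */search/*
Sitemap: https://www.similarweb.com/corp/sitemap_index.xml
"""


def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercased user-agent token to its (directive, pattern) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    agents: list[str] = []   # agents of the group currently being read
    in_rules = False         # True once the current group has seen a rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:     # a user-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                groups[agent].append((field, value))
        # other fields (e.g. Sitemap) are ignored in this sketch
    return groups


def resolve_group(groups, crawler: str):
    """Return (group token, inherited?) for a crawler name."""
    token = crawler.lower()
    if token in groups:
        return token, False          # the crawler has its own group
    if "*" in groups:
        return "*", True             # falls back to the wildcard group
    return None, False               # no applicable group at all


groups = parse_groups(SAMPLE)
for name in ("GPTBot", "ClaudeBot", "PerplexityBot"):
    print(name, resolve_group(groups, name))
```

Because the robots.txt on this page declares only a * group, every crawler listed above resolves to it under this sketch.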

Current robots.txt · 1026 bytes · sha256 f8b547fa4a96

User-agent: *
Allow: /corp/search*
Allow: /corp/*/search*
Disallow: */search/*
Disallow: */adult/*
Disallow: /corp/*.pdf$
Disallow: /corp/solution/
Disallow: /corp/lps/
Disallow: /corp/get-data/
Disallow: /corp/unlock-growth/
Disallow: /silent-login/
Disallow: /signin-oidc/
Disallow: /signout-oidc/
#LLMs-txt: https://www.similarweb.com/llms.txt
Sitemap: https://www.similarweb.com/corp/sitemap_index.xml
Sitemap: https://www.similarweb.com/blog/sitemap_index.xml
Sitemap: https://www.similarweb.com/sitemaps/sitemap_index.xml.gz
#
#                    sMMMMMMMMs
#                 MNdmMh+-``.:ohNM
#               MNy/   .sMd-   `/yNM
#             MNo`    sMo        .oNM
#            Md -    sMo           -dM
#           'MM+      -dMm+.        yMM'
#            MN`       `:yNMd/      .NM
#             MN-         `-hMy    :MM
#              MMN-        sMd`   -NM
#               Md:`     `sh``  .sMM
#                 Mdms+-.```-hmMds
#                    oMMMMMMMMo
#
#       OFFICIAL MEASURE OF THE DIGITAL WORLD
#
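
How the Allow and Disallow lines above interact can be sketched with the longest-match rule described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end of the path, the rule with the longest pattern wins, and Allow wins a tie. This is a minimal sketch under those assumptions. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, and individual crawlers may interpret the file differently.

```python
import re

# Rules copied from the * group of the robots.txt above.
RULES = [
    ("allow", "/corp/search*"),
    ("allow", "/corp/*/search*"),
    ("disallow", "*/search/*"),
    ("disallow", "*/adult/*"),
    ("disallow", "/corp/*.pdf$"),
    ("disallow", "/corp/solution/"),
    ("disallow", "/corp/lps/"),
    ("disallow", "/corp/get-data/"),
    ("disallow", "/corp/unlock-growth/"),
    ("disallow", "/silent-login/"),
    ("disallow", "/signin-oidc/"),
    ("disallow", "/signout-oidc/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a start-anchored regex."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" where "*" appeared.
    body = ".*".join(re.escape(piece) for piece in core.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Apply longest-match precedence; Allow wins when lengths tie."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if pattern and pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]  # no matching rule: allowed


for path in ("/corp/search/x", "/website/search/x", "/corp/report.pdf", "/blog/"):
    print(path, is_allowed(path))
```

Under this sketch `/corp/search/x` is allowed even though `*/search/*` also matches it, because `/corp/search*` is the longer pattern.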

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived
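
Detecting a change between archived snapshots can be sketched by comparing a size-plus-digest fingerprint like the "1026 bytes · sha256 f8b547fa4a96" shown above the file. This sketch assumes the displayed value is the first 12 hex characters of the SHA-256 digest of the raw file bytes. The function names are hypothetical and the page does not confirm how its fingerprint is computed.

```python
import hashlib


def fingerprint(body: bytes, prefix_len: int = 12) -> tuple[int, str]:
    """Return (size in bytes, truncated SHA-256 hex digest) for a snapshot."""
    digest = hashlib.sha256(body).hexdigest()
    return len(body), digest[:prefix_len]   # prefix length is an assumption


def has_changed(previous: bytes, current: bytes) -> bool:
    """A new history entry is only needed when the fingerprint differs."""
    return fingerprint(previous) != fingerprint(current)


old = b"User-agent: *\nDisallow: /silent-login/\n"
new = b"User-agent: *\nDisallow: /silent-login/\nDisallow: /signin-oidc/\n"
print(fingerprint(old), fingerprint(new), has_changed(old, new))
```

A truncated digest keeps the display short at the cost of a slightly higher collision chance, so comparing the full digest would be the safer choice for the stored record.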