NeuralCrawl

🇺🇸 Semantic Scholar

semanticscholar.org · Academic & open research · rank #6 · Academic search

AI crawler access (latest snapshot, 3h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot
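
The snapshot below contains a single `User-agent: *` group, so none of the crawlers listed above has a group of its own and each one falls back to the wildcard rules. The following is a minimal sketch of that check, not the method this page uses. The helper names `named_groups` and `inherits_wildcard` are hypothetical, the matching is simplified to an exact, case-insensitive comparison of product tokens, and `WITH_GPTBOT_GROUP` is an invented file included only for contrast.

```python
# Abbreviated copy of the snapshot: only the wildcard group is present.
SNAPSHOT = """\
# We are a non-profit research institute.
User-agent: *
Disallow: /search
Disallow: /api
Sitemap: https://www.semanticscholar.org/sitemap_paper_index.xml
"""

# Hypothetical file (NOT from the snapshot) with a dedicated GPTBot group.
WITH_GPTBOT_GROUP = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /search
"""


def named_groups(robots_txt: str) -> set[str]:
    """Return the lowercased product tokens that appear on User-agent lines."""
    tokens = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "user-agent":
            tokens.add(value.strip().lower())
    return tokens


def inherits_wildcard(agent: str, robots_txt: str) -> bool:
    """True when the agent has no group of its own and a '*' group exists."""
    groups = named_groups(robots_txt)
    return agent.lower() not in groups and "*" in groups


for bot in ("GPTBot", "ClaudeBot", "CCBot"):
    print(bot, inherits_wildcard(bot, SNAPSHOT))
```

Run against the snapshot, every listed crawler is reported as inheriting the wildcard group; against the hypothetical file, `GPTBot` is not.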

Current robots.txt 574 bytes · sha256 c91d2f860bd8

# We are a non-profit research institute.  If you would like to collaborate with us,
# please contact us at: [email protected]
# Or check out our public API http://api.semanticscholar.org/
User-agent: *
Disallow: /search
Disallow: /error
Disallow: /me
Disallow: /api
Disallow: /author/*/claim
Disallow: /author/*?
Disallow: /paper/*?
Disallow: /reader/
Allow: /paper/*?p2df

Sitemap: https://www.semanticscholar.org/sitemap_author_index.xml
Sitemap: https://www.semanticscholar.org/sitemap_paper_index.xml
Sitemap: https://www.semanticscholar.org/sitemap_topic_index.xml
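
The rules above mix prefix paths with `*` wildcards, and one `Allow` line overlaps a `Disallow` line. Below is a minimal sketch of how a crawler following RFC 9309 might evaluate them: the longest matching pattern wins, and `Allow` wins a tie. The function names are hypothetical, the example paths are invented for illustration, and real crawlers may resolve conflicts differently. Python's standard `urllib.robotparser` is not used here because it does not expand `*` in paths.

```python
import re

# Rules copied from the wildcard group in the snapshot above.
RULES = [
    ("disallow", "/search"),
    ("disallow", "/error"),
    ("disallow", "/me"),
    ("disallow", "/api"),
    ("disallow", "/author/*/claim"),
    ("disallow", "/author/*?"),
    ("disallow", "/paper/*?"),
    ("disallow", "/reader/"),
    ("allow", "/paper/*?p2df"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern: '*' is any run, trailing '$' anchors."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path: str) -> bool:
    """Apply longest-match precedence; a path matching no rule is allowed."""
    best_len, best_kind = -1, "allow"
    for kind, pattern in RULES:
        if pattern_to_regex(pattern).match(path):  # match() anchors at the start
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, best_kind = length, kind
    return best_kind == "allow"


# Hypothetical paths, chosen only to exercise each kind of rule.
for path in (
    "/paper/abc123",
    "/paper/abc123?sort=relevance",
    "/paper/abc123?p2df",
    "/search?q=transformers",
    "/author/Jane-Doe/123/claim",
):
    print(path, is_allowed(path))
```

Under these assumptions, a paper URL with a query string is disallowed unless the query begins with `p2df`, because `/paper/*?p2df` is longer than `/paper/*?`. Note that `/me` is a plain prefix, so it also matches any path that merely starts with those characters.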

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived