NeuralCrawl

🇺🇸 Penguin Random House

penguinrandomhouse.com · Top 1000 websites · rank #593 · Publishing

AI crawler access (latest snapshot, 13h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot

Current robots.txt (321 bytes · sha256 8a2381e3080a)

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /trackback/
Disallow: /search/
Disallow: /prh-internal-news/
Disallow: /interactive/reading-preference
Allow: /wp-admin/admin-ajax.php
Allow: /wp-includes/css/dist/block-library/style.min.css
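
The file above defines only a `*` group, so every crawler in the list falls back to those rules. As a rough illustration of how they might be evaluated, here is a minimal sketch that assumes the longest-match precedence described in RFC 9309 (the most specific matching path wins, and Allow wins a tie). The helper names `parse_groups` and `is_allowed` are hypothetical, and the sketch deliberately ignores `*` and `$` path wildcards and matches user-agent tokens exactly, which is sufficient for this file but not for robots.txt files in general.

```python
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /trackback/
Disallow: /search/
Disallow: /prh-internal-news/
Disallow: /interactive/reading-preference
Allow: /wp-admin/admin-ajax.php
Allow: /wp-includes/css/dist/block-library/style.min.css
"""


def parse_groups(text):
    """Return {lowercased user-agent token: [(is_allow, path_prefix), ...]}."""
    groups = {}
    agents = []       # user-agents of the group currently being read
    in_rules = False  # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents = []
                in_rules = False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow") and agents:
            in_rules = True
            if value:  # an empty path matches nothing
                for agent in agents:
                    groups[agent].append((field == "allow", value))
    return groups


def is_allowed(groups, user_agent, path):
    """Longest matching prefix wins; Allow wins ties; no match means allowed."""
    # Assumption: exact token match, otherwise fall back to the * group.
    rules = groups.get(user_agent.lower(), groups.get("*", []))
    best = (-1, True)
    for allow, prefix in rules:
        if path.startswith(prefix):
            best = max(best, (len(prefix), allow))
    return best[1]


groups = parse_groups(ROBOTS_TXT)
for path in ("/wp-admin/", "/wp-admin/admin-ajax.php", "/search/", "/books/"):
    print(path, is_allowed(groups, "GPTBot", path))
```

Under longest-match precedence the order of the lines does not matter: `/wp-admin/admin-ajax.php` is permitted because the Allow path is longer than the `/wp-admin/` Disallow path, even though the Allow line appears later in the file.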

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived