NeuralCrawl

🇨🇳 JD.com

jd.com · E-commerce · rank #4 · E-commerce

AI crawler access (latest snapshot, 3h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot

Current robots.txt · 787 bytes · sha256 71185ce79642

User-agent: *
Disallow: /*?

User-agent: Googlebot
Allow: /*?

User-agent: Googlebot-Image
Allow: /*?

User-agent: Googlebot-Mobile
Allow: /*?

User-agent: Googlebot-Video
Allow: /*?

User-agent: Googlebot-News
Allow: /*?

User-agent: Bingbot
Allow: /*?

User-agent: msnbot
Allow: /*?

User-agent: bingbot-mobile
Allow: /*?

User-agent: Baiduspider
Allow: /*?

User-agent: Baiduspider-image
Allow: /*?

User-agent: Baiduspider-video
Allow: /*?

User-agent: YandexBot
Allow: /*?

User-agent: YandexBot-Image
Allow: /*?

User-agent: YandexBot-Mobile
Allow: /*?

User-agent: Sogou web spider
Allow: /*?

User-agent: Sogou Pic Spider
Allow: /*?

User-agent: 360Spider
Allow: /*?

User-agent: HaosouSpider
Allow: /*?

User-agent: Sosoimagespider
Allow: /*?

User-agent: Sosospider
Allow: /*?
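
None of the AI crawlers listed above is named in this file, so each one falls back to the `*` group, whose `Disallow: /*?` covers any URL containing a query string. The sketch below shows one way such a verdict could be computed. It is an illustration only, not the tracker's actual method: the helper names (`parse_groups`, `pattern_to_regex`, `verdict`) and the sample paths are hypothetical, `ROBOTS_TXT` is a two-group excerpt of the file, and user-agent matching is simplified to an exact, case-insensitive token comparison. A hand-written matcher is used because Python's `urllib.robotparser` does not interpret the `*` wildcard inside paths.

```python
import re

# Two-group excerpt of the robots.txt shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /*?

User-agent: Googlebot
Allow: /*?
"""


def parse_groups(text):
    """Return {lowercased user-agent token: [(directive, pattern), ...]}."""
    groups = {}
    agents = []        # user-agent lines of the group currently being read
    in_rules = False   # True once the current group has at least one rule
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a rule was seen, so a new group starts
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                groups[agent].append((field, value))
    return groups


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' is a wildcard, trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def verdict(groups, user_agent, path):
    """Return (allowed, inherited): inherited is True when the '*' group was used."""
    token = user_agent.lower()
    inherited = token not in groups
    rules = groups.get(token, groups.get("*", []))
    best = ("allow", "")                      # no matching rule means allowed
    for directive, pattern in rules:
        if not pattern:                       # an empty pattern restricts nothing
            continue
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > len(best[1])
            tie_for_allow = len(pattern) == len(best[1]) and directive == "allow"
            if longer or tie_for_allow:       # longest match wins, allow wins ties
                best = (directive, pattern)
    return best[0] == "allow", inherited


groups = parse_groups(ROBOTS_TXT)
# Hypothetical paths, chosen only to exercise the rules.
for agent, path in [("GPTBot", "/search?keyword=phone"),
                    ("GPTBot", "/item/100.html"),
                    ("Googlebot", "/search?keyword=phone")]:
    print(agent, path, verdict(groups, agent, path))
```

Under these assumptions a crawler such as GPTBot is neither fully blocked nor fully allowed: paths without a query string stay reachable, while anything containing `?` is disallowed through the inherited wildcard rule.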

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived
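
The snapshot header reports a size (787 bytes) and a 12-character SHA-256 prefix, but the page does not say how these are produced or how later changes are detected. A minimal sketch of one plausible approach, assuming the raw response body is hashed as-is; `fingerprint` and `has_changed` are hypothetical helper names, not part of any documented tool:

```python
import hashlib


def fingerprint(body: bytes, prefix_len: int = 12):
    """Return (size in bytes, first prefix_len hex characters of the SHA-256 digest)."""
    return len(body), hashlib.sha256(body).hexdigest()[:prefix_len]


def has_changed(previous: bytes, current: bytes) -> bool:
    """Compare full digests so that a truncated-prefix collision cannot hide a change."""
    return hashlib.sha256(previous).digest() != hashlib.sha256(current).digest()


old = b"User-agent: *\nDisallow: /*?\n"
new = b"User-agent: *\nDisallow: /\n"
print(fingerprint(old))
print(has_changed(old, new))
```

A new entry would be appended to the change history only when `has_changed` returns true for two consecutive fetches; the short prefix is for display, and the comparison itself uses the full digest.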