NeuralCrawl

🇺🇸 Marvell Technology

marvell.com · Top 1000 websites · rank #865 · Semiconductors

AI crawler access (latest snapshot, 13h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot

Current robots.txt · 616 bytes · sha256 a40e3a42bd0f

#Robot exclusion

User-agent: *

Disallow: /assets
Disallow: /cn
Disallow: /drivers
Disallow: /error
Disallow: /form
Disallow: /go
Disallow: /guide
Disallow: /images
Disallow: /includes
Disallow: /inc
Disallow: /survey
Disallow: /w3c
Disallow: /newsroom-archive/
Disallow: /company/newsroom-archive/
Disallow: /products/security-solutions/nitrox-hs-adapters/
Disallow: /content/dam/marvell/en/company/assets/marvell-billing-guidelines.pdf
Disallow: /content/dam/marvell/en/company/career/documents/marvell-talent-community-privacy-notice.pdf
Sitemap: https://www.marvell.com/content/marvell-com/en/home.sitemap.xml
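
Because the file defines only a * wildcard group, every crawler listed above falls back to the same rules. The sketch below illustrates that fallback using Python's standard-library urllib.robotparser. It uses an abbreviated excerpt of the rules, and the helper names (build_parser, access_report) and the sample paths are illustrative assumptions. It shows how one parser reads the file, not how NeuralCrawl or any individual crawler evaluates it.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated excerpt of the rules listed above. The blank line after
# "User-agent: *" is left out on purpose: Python's parser treats a blank
# line as the end of a record and would discard the group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /assets
Disallow: /cn
Disallow: /drivers
Disallow: /newsroom-archive/
Disallow: /company/newsroom-archive/
Sitemap: https://www.marvell.com/content/marvell-com/en/home.sitemap.xml
"""

# A few of the crawler tokens from the list above.
AI_AGENTS = ["GPTBot", "ClaudeBot", "CCBot", "PerplexityBot"]


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def access_report(parser: RobotFileParser, agents: list[str], url: str) -> dict[str, bool]:
    """Map each agent to True (allowed) or False (disallowed) for one URL."""
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    base = "https://www.marvell.com"
    # Sample paths chosen for illustration only.
    for path in ("/products/", "/drivers/", "/assets/logo.png"):
        print(path, access_report(parser, AI_AGENTS, base + path))
```

No agent has a group of its own, so all four tokens get the same answer for a given path: whatever the wildcard group says.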

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived
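
A change history like this one can be built by fingerprinting each archived snapshot and comparing it with the previous one. The sketch below assumes the short sha256 value shown for the current file is the first 12 hex characters of a SHA-256 digest over the raw file bytes; the page does not confirm this, and the function names are hypothetical.

```python
import hashlib


def fingerprint(body: bytes, prefix_len: int = 12) -> str:
    """Return a truncated hex SHA-256 digest of the raw robots.txt bytes.

    Assumption: a 12-character prefix, matching the length of the
    fingerprint displayed on the page.
    """
    return hashlib.sha256(body).hexdigest()[:prefix_len]


def has_changed(previous_fingerprint: str, body: bytes) -> bool:
    """True when a newly fetched body differs from the archived snapshot."""
    return fingerprint(body) != previous_fingerprint


if __name__ == "__main__":
    old_body = b"User-agent: *\nDisallow: /assets\n"
    new_body = b"User-agent: *\nDisallow: /assets\nDisallow: /cn\n"
    archived = fingerprint(old_body)
    print(has_changed(archived, old_body))  # same bytes, no new history entry
    print(has_changed(archived, new_body))  # differing bytes, new history entry
```

Hashing the raw bytes rather than parsed rules means whitespace-only or comment-only edits also register as changes, which may or may not be what a given archive wants.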