NeuralCrawl

🇺🇸 Meta AI

ai.meta.com · Top 1000 websites · rank #4 · AI Chatbots and Tools

AI crawler access (latest snapshot, 13h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot

Current robots.txt · 860 bytes · sha256 6f3474864249

# Notice: Collection of data on Facebook through automated means is
# prohibited unless you have express written permission from Facebook
# and may only be conducted for the limited purpose contained in said
# permission.
# See: http://www.facebook.com/apps/site_scraping_tos_terms.php

User-agent: facebookexternalhit
Allow: *

User-agent: meta-externalads
Allow: *

User-agent: *
Disallow: /*cursor=
Disallow: /*fb_comment_id=
Disallow: /ajax/
Disallow: /tealium/
Disallow: /intern/
Disallow: /internal/
Disallow: /login/
Disallow: /oidc/callback/
Disallow: /*.php

User-agent: PetalBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: uptimerobot
Disallow: /

User-agent: viberbot
Disallow: /

User-agent: YaK
Disallow: /

User-agent: Yandex
Disallow: /

User-agent: Yeti
Disallow: /

Sitemap: https://ai.meta.com/sitemap/ai_meta_com_sitemap.xml.gz
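The blocked / restricted / allowed labels in the legend can be approximated from the file itself. The sketch below is one possible reading, not the method this page actually uses: it assumes a crawler is "blocked" when its group contains `Disallow: /`, "restricted" when it has any other non-empty Disallow, and "allowed" otherwise, and that a crawler without its own group inherits the `*` group. The function names `parse_groups` and `classify` are hypothetical, and the embedded text is a shortened excerpt of the file above.

```python
# Shortened excerpt of the robots.txt shown above (illustration only).
ROBOTS_TXT = """\
User-agent: facebookexternalhit
Allow: *

User-agent: *
Disallow: /ajax/
Disallow: /login/
Disallow: /*.php

User-agent: PetalBot
Disallow: /
"""


def parse_groups(text):
    """Return {lowercased user-agent: [(directive, path), ...]}."""
    groups = {}
    agents = []        # user-agents of the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:                      # a rule was seen, so a new group starts
                agents, in_rules = [], False
            agents.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif key in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                groups[agent].append((key, value))
        # other keys (for example Sitemap) are ignored in this sketch
    return groups


def classify(groups, crawler):
    """Return (status, inherited) under the assumptions stated above."""
    name = crawler.lower()
    inherited = name not in groups            # falls back to the * group
    rules = groups.get(name, groups.get("*", []))
    disallows = [path for kind, path in rules if kind == "disallow" and path]
    if "/" in disallows:
        return "blocked", inherited
    if disallows:
        return "restricted", inherited
    return "allowed", inherited


groups = parse_groups(ROBOTS_TXT)
for bot in ("GPTBot", "PetalBot", "facebookexternalhit"):
    print(bot, classify(groups, bot))
```

None of the AI crawlers listed above has its own group in this file, so under this sketch each of them would fall through to the `*` group and be reported as restricted and inherited. The sketch deliberately ignores wildcard matching and Allow/Disallow precedence, which a full robots.txt matcher would need to handle.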

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived
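A snapshot history like this one can be driven by comparing a short fingerprint of each fetched file against the previous one. The sketch below assumes the displayed sha256 value is the first 12 hex characters of the digest of the raw file bytes; that truncation rule, and the names `fingerprint` and `has_changed`, are assumptions for illustration rather than details confirmed by this page.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return the first 12 hex characters of the SHA-256 digest (assumed display form)."""
    return hashlib.sha256(body).hexdigest()[:12]


def has_changed(previous_fingerprint, body: bytes) -> bool:
    """True when there is no earlier snapshot or the content differs from it."""
    return previous_fingerprint is None or previous_fingerprint != fingerprint(body)


# Hypothetical usage: the first fetch always counts as a change (initial snapshot).
first = b"User-agent: *\nDisallow: /ajax/\n"
print(len(first), "bytes, sha256", fingerprint(first))
print(has_changed(None, first))                 # initial snapshot
print(has_changed(fingerprint(first), first))   # unchanged on refetch
```

Hashing the raw bytes rather than decoded text keeps the fingerprint stable regardless of how the file's encoding is interpreted.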