NeuralCrawl

🇩🇪 RWE

rwe.com · European companies · rank #61 · Utilities

AI crawler access (latest snapshot, 13h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot

Current robots.txt · 1304 bytes · sha256 6a84340a17f8

Sitemap: https://www.rwe.com/sitemap.xml

User-agent: AhrefsBot
Disallow: /
 
User-agent: MJ12bot
Disallow: /
 
User-agent: SemrushBot
Disallow: /
 
User-agent: YisouSpider
Disallow: /
 
User-agent: XoviBot
Disallow: /
 
User-agent: AdmantxBot
Disallow: /
 
User-agent: Proximic
Disallow: /
 
User-agent: WordPress
Disallow: /wp-admin/
 
User-agent: SISTRIX Crawler
Disallow: /

User-agent: DataForSeoBot
Disallow: /

User-Agent: thinkers-bot
Disallow: /

User-Agent: SiteimproveBot
Disallow: /

User-Agent: SiteimproveBot-Crawler
Disallow: /

User-agent: ALittle Client
Disallow:/

User-agent: *
# directory
Disallow: /App_Browsers/
Disallow: /App_Config/
Disallow: /App_Start/
Disallow: /Areas/
Disallow: /Artifacts/
Disallow: /bin/
Disallow: /Content/
Disallow: /layouts/
Disallow: /Properties/
Disallow: /Scripts/
Disallow: /sitecore/
Disallow: /sitecore modules/
Disallow: /sitecore_files/
Disallow: /temp/
Disallow: /upload/
Disallow: /Views/
Disallow: /xsl/
Disallow: */sitecore/Inhalt/*
Disallow: */sitecore/Content/*

# files
Disallow: /Default.aspx
Disallow: /default.css
Disallow: /default.htm.sitedown
Disallow: /default.js
Disallow: /Web.config.install.xdt
Disallow: /Web.config.uninstall.xdt
Disallow: /webedit.css
Disallow: /Global.asax
Disallow: /lb.txt
Disallow: /web.config


Allow: /
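
The legend above separates crawlers that have their own User-agent group from those that fall back to the * wildcard group. The following is a minimal sketch of how such a status could be derived from a robots.txt file. The function names, the SAMPLE excerpt, and the exact meaning of "blocked", "restricted", and "allowed" are assumptions made for illustration; they are not NeuralCrawl's actual logic. Blank lines are ignored when grouping, so the trailing Allow: / stays in the * group.

```python
def parse_groups(text):
    """Return a list of (agents, rules); rules are (directive, path) pairs."""
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if rules:  # a User-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif key in ("allow", "disallow") and agents:
            rules.append((key, value))
    if agents:
        groups.append((agents, rules))
    return groups


def classify(text, agent):
    """Return (status, inherited) for one crawler name (hypothetical labels)."""
    groups = parse_groups(text)
    agent = agent.lower()
    matched = [rules for agents, rules in groups if agent in agents]
    inherited = not matched  # no named group: fall back to the * group
    if inherited:
        matched = [rules for agents, rules in groups if "*" in agents]
    rules = [rule for group in matched for rule in group]
    disallows = [path for kind, path in rules if kind == "disallow" and path]
    allows = [path for kind, path in rules if kind == "allow" and path]
    if "/" in disallows and not allows:
        status = "blocked"
    elif disallows:
        status = "restricted"
    else:
        status = "allowed"
    return status, inherited


# Shortened excerpt of the file shown above (assumption: not the full file).
SAMPLE = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /sitecore/
Disallow: /web.config

Allow: /
"""

for name in ("GPTBot", "ClaudeBot", "AhrefsBot"):
    print(name, classify(SAMPLE, name))
```

On this excerpt, GPTBot and ClaudeBot have no group of their own, so the sketch reports them as restricted and inherited from the * group, while AhrefsBot is reported as blocked by its own group.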

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived