NeuralCrawl

🇫🇷 TotalEnergies

totalenergies.com · European companies · rank #3 · Energy

AI crawler access (latest snapshot, 13h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot
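
The archived file on this page declares only a `User-agent: *` group, so each crawler listed above would be expected to fall back to that wildcard group (the "faded" state in the legend). The check can be sketched in Python as follows. The function names are hypothetical, and exact case-insensitive token matching is a simplifying assumption rather than NeuralCrawl's actual logic.

```python
def declared_agents(robots_txt: str) -> set[str]:
    """Collect the lowercased User-agent tokens that open a group in the file."""
    agents = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def inherits_wildcard(agent: str, robots_txt: str) -> bool:
    """True when the agent has no group of its own and a '*' group exists."""
    agents = declared_agents(robots_txt)
    return agent.lower() not in agents and "*" in agents


# Hypothetical excerpt: a single wildcard group, as in the snapshot.
SAMPLE = "User-agent: *\nDisallow: /core/\nDisallow: /profiles/\n"
```

With this excerpt, `inherits_wildcard("GPTBot", SAMPLE)` is true. It becomes false as soon as the file gains a dedicated `User-agent: GPTBot` group.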

Current robots.txt · 4248 bytes · sha256 3e614c1c032f

#
# robots.txt
#
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
#
# This file will be ignored unless it is at the root of your host:
# Used:    http://example.com/robots.txt
# Ignored: http://example.com/site/robots.txt
#
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html

User-agent: *
# CSS, JS, Images
Allow: /core/*.css$
Allow: /core/*.css?
Allow: /core/*.js$
Allow: /core/*.js?
Allow: /core/*.gif
Allow: /core/*.jpg
Allow: /core/*.jpeg
Allow: /core/*.png
Allow: /core/*.svg
Allow: /profiles/*.css$
Allow: /profiles/*.css?
Allow: /profiles/*.js$
Allow: /profiles/*.js?
Allow: /profiles/*.gif
Allow: /profiles/*.jpg
Allow: /profiles/*.jpeg
Allow: /profiles/*.png
Allow: /profiles/*.svg
# Directories
Disallow: /core/
Disallow: /profiles/
# Files
Disallow: /README.txt
Disallow: /web.config
# Paths (clean URLs)
Disallow: /comment/reply/
Disallow: /filter/tips
Disallow: /node/add/
Allow: /search/
Allow: /recherche/
# Paths (no clean URLs)
Disallow: /index.php/comment/reply/
Disallow: /index.php/filter/tips
Disallow: /index.php/node/add/
Disallow: /index.php/search/
Disallow: /index.php/user/logout/
# Search (with parameters)
Allow: /search/content?*
Allow: /recherche/contenu?*
Allow: */medias/actualites*page=*
Allow: */media/news*page=*
Allow: */medias/medias*page=*
Allow: */media/media*page=*
Disallow: */formulaire-de-contact/*
Disallow: */contact-form/*
Disallow: /search-content*
Disallow: */recherche-contenu*
# Integration (static files)
Disallow: /themes/custom/*/integration/
Disallow: */contenu/publications/*

# -------------------------  
# Newsroom  
# -------------------------  
# 1) Newsroom home (FR/EN) – allow only lang=fra or lang=eng  
Allow: */newsroom/fr/?lang=fra  
Allow: */newsroom/en/?lang=eng  
Disallow: */newsroom/fr/?  
Disallow: */newsroom/en/?  
  
# 2) Communiqués de presse – allow only all-themes (+ pagination)  
Allow: */newsroom/section/communiques-de-presse/?lang=fra&topic=all-themes$  
Allow: */newsroom/section/communiques-de-presse/?lang=eng&topic=all-themes$  
Allow: */newsroom/section/communiques-de-presse/?lang=fra&topic=all-themes&pages=  
Allow: */newsroom/section/communiques-de-presse/?lang=eng&topic=all-themes&pages=  
Disallow: */newsroom/section/communiques-de-presse/? 
  
# 3) Dossiers de presse – allow only lang param (FR/EN)  
Allow: */newsroom/section/dossiers-de-presse/?lang=fra  
Allow: */newsroom/section/dossiers-de-presse/?lang=eng  
Disallow: */newsroom/section/dossiers-de-presse/?  
  
# 4) Revues de presse – allow only vu-dans-la-presse (+ pagination) FR/EN  
Allow: */newsroom/section/revues-de-presse/?lang=fra&cat=vu-dans-la-presse$  
Allow: */newsroom/section/revues-de-presse/?lang=eng&cat=vu-dans-la-presse$  
Allow: */newsroom/section/revues-de-presse/?lang=fra&cat=vu-dans-la-presse&pages=  
Allow: */newsroom/section/revues-de-presse/?lang=eng&cat=vu-dans-la-presse&pages=  
Disallow: */newsroom/section/revues-de-presse/?  
  
# 5) Contacts – allow only lang param (FR/EN)  
Allow: */newsroom/section/contacts/?lang=fra  
Allow: */newsroom/section/contacts/?lang=eng  
Disallow: */newsroom/section/contacts/?  
  
# 6) Search – disallow all  
Disallow: */newsroom/?s  
Disallow: */newsroom/?lang=eng&s  
Disallow: */newsroom/?lang=fra&s  
  
# 7) Statements – allow lang only & all-themes (+ pagination)   
Allow: */newsroom/section/statements/?lang=fra$   
Allow: */newsroom/section/statements/?lang=eng$   
Allow: */newsroom/section/statements/?lang=fra&topic=all-themes$   
Allow: */newsroom/section/statements/?lang=eng&topic=all-themes$   
Allow: */newsroom/section/statements/?lang=fra&pages=   
Allow: */newsroom/section/statements/?lang=eng&pages=   
Allow: */newsroom/section/statements/?lang=fra&topic=all-themes&pages=  
Allow: */newsroom/section/statements/?lang=eng&topic=all-themes&pages=  
Disallow: */newsroom/section/statements/? 
 
# 8) Selection – disallow all 
Disallow: */newsroom/fr/selection/ 
Disallow: */newsroom/en/selection/ 
Sitemap: https://totalenergies.com/sitemap.xml
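
The file above pairs broad `Disallow` rules with narrower `Allow` exceptions and relies on the `*` wildcard and the `$` end anchor. Below is a minimal Python sketch of how a crawler might evaluate such rules, assuming the longest-match precedence described in RFC 9309, with ties resolved in favour of `Allow`. The helper names are hypothetical, and percent-encoding normalization is deliberately left out.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, '$' end anchor) to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Apply the longest matching rule; on equal length, 'allow' wins."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is crawlable.
    return True if best is None else best[1]


# A few rules copied from the wildcard group above.
RULES = [
    ("allow", "/core/*.css$"),
    ("disallow", "/core/"),
    ("allow", "*/newsroom/fr/?lang=fra"),
    ("disallow", "*/newsroom/fr/?"),
]
```

Under these assumptions, `/core/misc/x.css` and `/newsroom/fr/?lang=fra` are allowed because the matching `Allow` patterns are longer than the competing `Disallow` patterns, while `/core/install.php` and `/newsroom/fr/?utm=1` are blocked.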

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived
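
The snapshot header identifies the file by size and digest ("4248 bytes · sha256 3e614c1c032f"). A change history like this one could be driven by comparing such fingerprints between fetches. The Python sketch below assumes the displayed value is the first 12 hex characters of a SHA-256 digest over the raw response bytes; both function names are hypothetical.

```python
import hashlib


def fingerprint(body: bytes, hex_chars: int = 12) -> tuple[int, str]:
    """Return (size in bytes, truncated SHA-256 hex digest) for a robots.txt body."""
    return len(body), hashlib.sha256(body).hexdigest()[:hex_chars]


def has_changed(previous: bytes, current: bytes) -> bool:
    """Flag a new history entry when size or truncated digest differs.

    A truncated digest can collide in principle, so this is a cheap
    first check rather than a proof of equality.
    """
    return fingerprint(previous) != fingerprint(current)
```

For an initial snapshot there is no previous body to compare against, so the first fetch is simply recorded.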