NeuralCrawl

🇺🇸 Zscaler

zscaler.com · Top 1000 websites · rank #873 · Computer Security

AI crawler access (latest snapshot, 13h ago)

Legend: blocked · restricted · allowed · faded = inherited from the * wildcard group

GPTBot
ChatGPT-User
OAI-SearchBot
ClaudeBot
Claude-User
Claude-SearchBot
anthropic-ai
Claude-Web
CCBot
Google-Extended
Applebot-Extended
PerplexityBot
Perplexity-User
Bytespider
Amazonbot
FacebookBot
meta-externalagent
meta-externalfetcher
cohere-ai
AI2Bot
Diffbot
omgili
YouBot
DuckAssistBot
MistralAI-User
PanguBot
Timpibot
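
The "inherited from the * wildcard group" case in the legend can be sketched as a check for whether a crawler has a `User-agent` group of its own. This is a minimal illustration, not NeuralCrawl's actual logic: the function names `agents_with_own_group` and `inherits_wildcard` are hypothetical, and the comparison is a plain case-insensitive token match.

```python
def agents_with_own_group(robots_txt: str) -> set[str]:
    """Return the lowercased user-agent tokens that are named in the file."""
    agents = set()
    for raw in robots_txt.splitlines():
        # Drop comments and surrounding whitespace before parsing the field.
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        if field.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def inherits_wildcard(bot: str, robots_txt: str) -> bool:
    """True when `bot` has no group of its own and a `*` group exists."""
    named = agents_with_own_group(robots_txt)
    return bot.lower() not in named and "*" in named


if __name__ == "__main__":
    # Hypothetical excerpt; only the first lines of a wildcard group.
    sample = "User-agent: *\nCrawl-delay: 10\nDisallow: /admin/\n"
    for bot in ("GPTBot", "ClaudeBot", "CCBot"):
        print(bot, inherits_wildcard(bot, sample))
```

The archived file shown on this page contains a single `User-agent: *` group, so under this check every crawler in the list above would fall into the inherited case.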

Current robots.txt · 3932 bytes · sha256 14498d61ca7d

    User-agent: *
    Crawl-delay: 10

    Allow: /*/resources?topic=*
    Allow: /*/blogs?type=*
    Allow: /*/blogs?topic=*
    Allow: /*/resources?type=*
    Allow: /*/zpedia?type=*
    Allow: /*/resources/webinars?type=*
    Allow: /*/resources/webinars?topic=*
    Allow: /*/careers/search?office=*

    Allow: /resources?topic=* 
    Allow: /blogs?type=* 
    Allow: /blogs?topic=* 
    Allow: /resources?type=* 
    Allow: /zpedia?type=* 
    Allow: /resources/webinars?type=* 
    Allow: /resources/webinars?topic=* 
    Allow: /careers/search?office=*


    # Zscaler: leftovers
    Disallow: *_archive.html
    Disallow: /community-api-webhook.php

    # Next settings
    Disallow: /preview_*

    # Zscaler: Disallow email-signature page
    Disallow: /email-signature
    Disallow: /*/email-signature

    # Query strings
    Disallow: /*?*

    # Allow _next images
    Allow: /_next/image?url=*

    # To be remove after proxy
    # CSS, JS, Images
    Allow: /core/*.css$
    Allow: /core/*.css?
    Allow: /core/*.js$
    Allow: /core/*.js?
    Allow: /core/*.gif
    Allow: /core/*.jpg
    Allow: /core/*.jpeg
    Allow: /core/*.png
    Allow: /core/*.svg
    Allow: /profiles/*.css$
    Allow: /profiles/*.css?
    Allow: /profiles/*.js$
    Allow: /profiles/*.js?
    Allow: /profiles/*.gif
    Allow: /profiles/*.jpg
    Allow: /profiles/*.jpeg
    Allow: /profiles/*.png
    Allow: /profiles/*.svg
    # Directories
    Disallow: /core/
    Disallow: /profiles/
    # Files
    Disallow: /README.txt
    Disallow: /web.config
    # Paths (clean URLs)
    Disallow: /admin/
    Disallow: /comment/reply/
    Disallow: /filter/tips
    Disallow: /node/add/
    Disallow: /search/
    Disallow: /user/register/
    Disallow: /user/password/
    Disallow: /user/login/
    Disallow: /user/logout/
    # Paths (no clean URLs)
    Disallow: /index.php/admin/
    Disallow: /index.php/comment/reply/
    Disallow: /index.php/filter/tips
    Disallow: /index.php/node/add/
    Disallow: /index.php/search/
    Disallow: /index.php/user/password/
    Disallow: /index.php/user/register/
    Disallow: /index.php/user/login/
    Disallow: /index.php/user/logout/

    # Zscaler: CSS, JS, Images
    Allow: /themes/*.css$
    Allow: /themes/*.css?
    Allow: /themes/*.js$
    Allow: /themes/*.js?
    Allow: /themes/*.gif
    Allow: /themes/*.jpg
    Allow: /themes/*.jpeg
    Allow: /themes/*.png
    Allow: /themes/*.svg
    Allow: /themes/*.woff2

    # Zscaler: Directories
    Disallow: /includes/
    Disallow: /misc/
    Disallow: /modules/
    Disallow: /profiles/
    Disallow: /scripts/
    Disallow: /themes/

    # Zscaler: Files
    Disallow: /CHANGELOG.txt
    Disallow: /cron.php
    Disallow: /INSTALL.mysql.txt
    Disallow: /INSTALL.pgsql.txt
    Disallow: /INSTALL.sqlite.txt
    Disallow: /install.php
    Disallow: /INSTALL.txt
    Disallow: /LICENSE.txt
    Disallow: /MAINTAINERS.txt
    Disallow: /update.php
    Disallow: /UPGRADE.txt
    Disallow: /xmlrpc.php

    # Zscaler: Paths (clean URLs)
    Disallow: /admin/
    Disallow: /comment/reply/
    Disallow: /filter/tips/
    Disallow: /node/add/
    Disallow: /search/
    Disallow: /user/register/
    Disallow: /user/password/
    Disallow: /user/login/
    Disallow: /user/logout/

    # Query strings
    Disallow: /*?*

    Disallow:  /en$
    Disallow: /en/*

    # Zscaler: Disallow cdn URLs
    Disallow: /cdn/*

    # Zscaler: GWT Error
    Disallow: /drupal
    Disallow: /drupal/*

    # Zscaler: leftovers
    Disallow: *_archive.html
    Disallow: /community-api-webhook.php

    # Disallow for proxy pages of m7
    Disallow: /media/oembed
    Disallow: /*/media/oembed
    Disallow: /index.php/media/oembed
    Disallow: /index.php/*/media/oembed

    # Disallow IndexNowKey
    Disallow: /indexNowKey.txt

    # To be remove after proxy
    # CSS, JS, Images ends

    Sitemap: https://www.zscaler.com/sitemap_index.xml
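
The file mixes a broad `Disallow: /*?*` with narrower `Allow` lines such as `/blogs?type=*`, so the outcome for a given URL depends on rule precedence. The sketch below assumes the longest-match convention described in RFC 9309 (the longest matching pattern wins, and `Allow` wins a tie), with `*` as a wildcard and a trailing `$` as an end anchor. The helper names and the short `rules` list are illustrative, and individual crawlers may resolve conflicts differently.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern: `*` matches any run of
    characters and a trailing `$` anchors the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = kind.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    # A small subset of the rules from the file above.
    rules = [
        ("Allow", "/blogs?type=*"),
        ("Disallow", "/*?*"),
        ("Disallow", "/admin/"),
        ("Disallow", "/en$"),
        ("Disallow", "/en/*"),
    ]
    for path in ("/blogs?type=news", "/blogs?page=2", "/admin/users", "/en", "/products"):
        print(path, is_allowed(path, rules))
```

Under these assumptions `/blogs?type=news` is allowed, because the 13-character `Allow` pattern outranks the 4-character `/*?*`, while `/blogs?page=2` matches only the query-string `Disallow` and is blocked.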
  

Change history

  1. initial snapshot
    • First snapshot of robots.txt archived