Insight Enterprises / robots.txt snapshot
fetched 2026-06-20T01:10:31Z · HTTP 200 · 1096 bytes · sha256 86d490e4f4480992
final URL: https://www.insight.com/robots.txt
# Robots.txt for Insight.com
# Use specialized blocks only if rules differ from the global policy.

User-agent: *
# Allow specific parameters first
Allow: /*?qtype=
Allow: /*?pq=
Allow: /*?identifier=shopping
Allow: /*?partnermessage
Allow: /insightweb/*.css$
Allow: /*.html
Allow: /*/shop/product/
Allow: /*%23*

# Block all other parameters and system folders
Disallow: /*?*
Disallow: /*/search*.html
Disallow: /insightweb/
Disallow: /flytrap/
Disallow: /content/dam/insight-web/*/solutions/service-provider/microsite/assets/
Disallow: /content/dam/insight-web/*/pdfs/
Disallow: /content/dam/insight/
Disallow: /content/dam/global/*/pdfs/
Disallow: /content/insight-web/*/help/*
Disallow: /content/insight-web/*/client/*
Disallow: /content/insight-web/*/Sandbox/*
Disallow: /content/insight-web/*/sandbox/*

############################
# BLOCKED CRAWLERS
############################
User-agent: CCBot
User-agent: FacebookBot
User-agent: NeevaAI
User-agent: Bytespider
User-agent: Firecrawl
User-agent: Kadoa
User-agent: ImagesiftBot
Disallow: /


Sitemap: https://www.insight.com/sitemap.xml
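The Allow and Disallow rules in the `User-agent: *` group overlap, so it may help to see how a crawler could resolve them. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie) with `*` as a wildcard and a trailing `$` as an end anchor. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, only a subset of the rules is copied in, and the blocked-crawler group is not modeled. Real crawlers may differ in details such as percent-encoding normalization.

```python
import re

# Subset of the "User-agent: *" group from the snapshot (assumption: the
# remaining rules would be added the same way).
RULES = [
    ("allow", "/*?qtype="),
    ("allow", "/insightweb/*.css$"),
    ("allow", "/*.html"),
    ("allow", "/*/shop/product/"),
    ("disallow", "/*?*"),
    ("disallow", "/*/search*.html"),
    ("disallow", "/insightweb/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Each "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the longest matching rule is an Allow, or nothing matches."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            # Simplification: pattern length in characters stands in for
            # the octet count used by the specification.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


for path in [
    "/en_US/shop/product/ABC123/",
    "/en_US/search.html?q=laptop",
    "/insightweb/assets/site.css",
    "/insightweb/assets/site.js",
    "/en_US/page.html?foo=1",
]:
    print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Under these assumptions, `/en_US/page.html?foo=1` comes out as allowed: the 7-character `Allow: /*.html` outranks the 4-character `Disallow: /*?*`, so the order in which rules appear in the file does not decide the outcome.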