SoundCloud / robots.txt
Change detected 2026-06-25T22:38:12Z · modified +60 −29
What changed
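- 🤖 AI · Blocked Amazonbot (Amazon), except the allow-listed editorial paths
- 🤖 AI · Blocked Applebot-Extended (Apple), except the allow-listed editorial paths
- 🤖 AI · Blocked Bytespider (ByteDance), except the allow-listed editorial paths
- 🤖 AI · Blocked ChatGPT-User (OpenAI), except the allow-listed editorial paths
- 🤖 AI · Blocked Claude-Web (Anthropic), except the allow-listed editorial paths
- 🤖 AI · Blocked cohere-ai (Cohere), except the allow-listed editorial paths
- 🤖 AI · Blocked OAI-SearchBot (OpenAI), except the allow-listed editorial paths
- 🤖 AI · Blocked PerplexityBot (Perplexity), except the allow-listed editorial paths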
- Removed all rules for user-agent ‘Brightbot’
- 🤖 AI · Unblocked omgili (Webz.io) - rules removed
- Removed all rules for user-agent ‘omgilibot’
- Added Disallow: /*? for *
- Added Disallow: /messages for *
- Added Disallow: /search for *
- Added Disallow: /settings for *
- Added Disallow: /stream for *
- Added Disallow: /upload for *
- Added Disallow: /you/ for *
- Removed Disallow: (empty) for *
- 🤖 AI · Added Allow: /$ for anthropic-ai
- 🤖 AI · Added Allow: /accessibility-statement for anthropic-ai
- 🤖 AI · Added Allow: /community-guidelines for anthropic-ai
- 🤖 AI · Added Allow: /company for anthropic-ai
- 🤖 AI · Added Allow: /discover for anthropic-ai
- 🤖 AI · Added Allow: /download for anthropic-ai
- 🤖 AI · Added Allow: /getstarted for anthropic-ai
- 🤖 AI · Added Allow: /go for anthropic-ai
- 🤖 AI · Added Allow: /imprint for anthropic-ai
- 🤖 AI · Added Allow: /jobs for anthropic-ai
- 🤖 AI · Added Allow: /pro for anthropic-ai
- 🤖 AI · Added Allow: /sitemap for anthropic-ai
- 🤖 AI · Added Allow: /sitemapIndex for anthropic-ai
- 🤖 AI · Added Allow: /stories for anthropic-ai
- 🤖 AI · Added Allow: /terms-of-use for anthropic-ai
- 🤖 AI · Added Allow: /topic for anthropic-ai
- 🤖 AI · Added Allow: /transparency-reports for anthropic-ai
- 🤖 AI · Added Allow: /$ for CCBot
- 🤖 AI · Added Allow: /accessibility-statement for CCBot
- 🤖 AI · Added Allow: /community-guidelines for CCBot
- 🤖 AI · Added Allow: /company for CCBot
- 🤖 AI · Added Allow: /discover for CCBot
- 🤖 AI · Added Allow: /download for CCBot
- 🤖 AI · Added Allow: /getstarted for CCBot
- 🤖 AI · Added Allow: /go for CCBot
- 🤖 AI · Added Allow: /imprint for CCBot
- 🤖 AI · Added Allow: /jobs for CCBot
- 🤖 AI · Added Allow: /pro for CCBot
- 🤖 AI · Added Allow: /sitemap for CCBot
- 🤖 AI · Added Allow: /sitemapIndex for CCBot
- 🤖 AI · Added Allow: /stories for CCBot
- ... and 71 more change(s)
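A per-agent summary like the one above can be derived by parsing both versions of the file into user-agent groups and comparing their rule sets. The following is a minimal sketch of that idea, not the tracker's actual implementation: `parse_groups`, `diff_groups`, and the trimmed `OLD` and `NEW` excerpts are hypothetical names and inputs chosen for illustration.

```python
from collections import defaultdict

def parse_groups(text):
    """Map each user-agent (lowercased) to its set of (directive, path) rules."""
    groups = defaultdict(set)
    agents, in_rules = [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()      # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                         # a rule line closed the previous group
                agents, in_rules = [], False
            agents.append(value.lower())
            groups[value.lower()]                # register the agent even without rules
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:                 # consecutive User-Agent lines share rules
                groups[agent].add((field, value))
    return groups

def diff_groups(old, new):
    """List added and removed rules per agent, in sorted agent order."""
    changes = []
    for agent in sorted(set(old) | set(new)):
        before, after = old.get(agent, set()), new.get(agent, set())
        for directive, path in sorted(after - before):
            changes.append(f"Added {directive.capitalize()}: {path or '(empty)'} for {agent}")
        for directive, path in sorted(before - after):
            changes.append(f"Removed {directive.capitalize()}: {path or '(empty)'} for {agent}")
    return changes

# Hypothetical, trimmed excerpts of the two versions (not the full files).
OLD = """User-Agent: *
Disallow:

User-Agent: CCBot
Disallow: /
"""
NEW = """User-Agent: CCBot
User-Agent: cohere-ai
Allow: /$
Disallow: /

User-Agent: *
Disallow: /search
"""

CHANGES = diff_groups(parse_groups(OLD), parse_groups(NEW))
for change in CHANGES:
    print(change)
```

Agent names are lowercased here because user-agent matching in robots.txt is case-insensitive, so the output casing differs from the summary above.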
Diff: old (2026-06-20T01:10:31Z) → new (2026-06-25T22:38:12Z)
| Old | New | Line |
| --- | --- | --- |
| | | @@ -1,34 +1,65 @@ |
| 1 | | - # This is for the m.soundcloud.com hostname |
| | 1 | + # ============================================================================= |
| | 2 | + # robots.txt for soundcloud.com |
| | 3 | + # Updated: 2026-05-05 |
| | 4 | + # ============================================================================= |
| 2 | 5 | |
| 3 | | - User-Agent: * |
| 4 | | - Disallow: |
| | 6 | + # AI Crawlers: editorial only, no UGC training |
| | 7 | + User-Agent: anthropic-ai |
| | 8 | + User-Agent: ClaudeBot |
| | 9 | + User-Agent: Claude-Web |
| | 10 | + User-Agent: GPTBot |
| | 11 | + User-Agent: ChatGPT-User |
| | 12 | + User-Agent: OAI-SearchBot |
| | 13 | + User-Agent: CCBot |
| | 14 | + User-Agent: PerplexityBot |
| | 15 | + User-Agent: Google-Extended |
| | 16 | + User-Agent: Applebot-Extended |
| | 17 | + User-Agent: Bytespider |
| | 18 | + User-Agent: Amazonbot |
| | 19 | + User-Agent: Meta-ExternalAgent |
| | 20 | + User-Agent: cohere-ai |
| 5 | 21 | |
| 6 | | - User-Agent: anthropic-ai |
| | 22 | + # Homepage |
| | 23 | + Allow: /$ |
| | 24 | + |
| | 25 | + # Platform: Legal & Policy |
| | 26 | + Allow: /terms-of-use |
| | 27 | + Allow: /community-guidelines |
| | 28 | + Allow: /transparency-reports |
| | 29 | + Allow: /accessibility-statement |
| | 30 | + Allow: /imprint |
| | 31 | + |
| | 32 | + # Platform: Discovery & Editorial |
| | 33 | + Allow: /discover |
| | 34 | + Allow: /stories |
| | 35 | + Allow: /topic |
| | 36 | + |
| | 37 | + # Platform: Product & Marketing |
| | 38 | + Allow: /pro |
| | 39 | + Allow: /download |
| | 40 | + Allow: /jobs |
| | 41 | + Allow: /go |
| | 42 | + Allow: /getstarted |
| | 43 | + |
| | 44 | + # Platform: Corporate |
| | 45 | + Allow: /company |
| | 46 | + |
| | 47 | + # Platform: Technical |
| | 48 | + Allow: /sitemap |
| | 49 | + Allow: /sitemapIndex |
| | 50 | + |
| | 51 | + # Block everything else (catches all UGC at root paths) |
| 7 | 52 | Disallow: / |
| 8 | 53 | |
| 9 | | - User-Agent: ClaudeBot |
| 10 | | - Disallow: / |
| | 54 | + # Search engines and all other crawlers: index UGC, block low-value paths |
| | 55 | + User-Agent: * |
| | 56 | + Disallow: /search |
| | 57 | + Disallow: /you/ |
| | 58 | + Disallow: /stream |
| | 59 | + Disallow: /upload |
| | 60 | + Disallow: /settings |
| | 61 | + Disallow: /messages |
| | 62 | + Disallow: /*? |
| 11 | 63 | |
| 12 | | - User-Agent: CCBot |
| 13 | | - Disallow: / |
| 14 | | - |
| 15 | | - User-Agent: GPTBot |
| 16 | | - Disallow: / |
| 17 | | - |
| 18 | | - User-Agent: Google-Extended |
| 19 | | - Disallow: / |
| 20 | | - |
| 21 | | - User-Agent: Meta-ExternalAgent |
| 22 | | - Disallow: / |
| 23 | | - |
| 24 | | - User-Agent: omgili |
| 25 | | - Disallow: / |
| 26 | | - |
| 27 | | - User-Agent: omgilibot |
| 28 | | - Disallow: / |
| 29 | | - |
| 30 | | - User-Agent: Brightbot |
| 31 | | - Disallow: / |
| 32 | | - |
| 33 | | - sitemap: https://soundcloud.com/sitemap.xml |
| 34 | | - sitemap: https://soundcloud.com/sitemapIndex.xml |
| | 64 | + Sitemap: https://soundcloud.com/sitemap.xml |
| | 65 | + Sitemap: https://soundcloud.com/sitemapIndex.xml |
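
The new AI-crawler group combines several `Allow` lines with a final `Disallow: /`, so the outcome for a given URL depends on how a crawler resolves overlapping rules. The sketch below assumes the longest-match behaviour described in RFC 9309 (the most specific matching pattern wins, and `Allow` wins a tie), with `*` as a wildcard and a trailing `$` as an end anchor. Individual crawlers may behave differently, and Python's `urllib.robotparser` is not used because it does not interpret `*` or `$`. `pattern_to_regex`, `is_allowed`, the rule lists, and the example paths are hypothetical; the rule lists are excerpts of the new file.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' matches any run of characters,
    a trailing '$' anchors the end; otherwise the pattern is a prefix match."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(rules, path):
    """rules: list of (directive, pattern). The longest matching pattern wins;
    Allow wins a tie; a path that matches no rule is allowed."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

# Excerpt of the AI-crawler group from the new file.
AI_RULES = [
    ("allow", "/$"),
    ("allow", "/discover"),
    ("allow", "/pro"),
    ("disallow", "/"),
]

# Excerpt of the wildcard (*) group from the new file.
WILDCARD_RULES = [
    ("disallow", "/search"),
    ("disallow", "/you/"),
    ("disallow", "/*?"),
]

# Hypothetical example paths.
for path in ["/", "/discover", "/some-artist/some-track", "/producer-x"]:
    print("AI group:", path, is_allowed(AI_RULES, path))
for path in ["/some-artist/some-track", "/some-artist/some-track?si=abc", "/search"]:
    print("* group: ", path, is_allowed(WILDCARD_RULES, path))
```

Because patterns without `$` are prefix matches, `Allow: /pro` under these assumptions would also match a root-level path such as the hypothetical `/producer-x`, which may or may not be the intent of the "editorial only" group.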