Info
CCBot/2.0

- User-Agent:
CCBot/2.0 (https://commoncrawl.org/faq/)
- Host:
- Updated:
2026-02-08 14:23:40 (UTC)
- Rating:
Medium (Confirmed 8923 times since 2025-02-02)
- Description:
CCBot/2.0 is a web crawler developed by the Common Crawl Foundation, a non-profit organization dedicated to providing free access to web data for research and analysis. This crawler is based on the Apache Nutch project and utilizes the Apache Hadoop framework for processing and extracting crawl candidates.
The user-agent string for CCBot/2.0 is:
CCBot/2.0 (+http://www.commoncrawl.org/bot.html)
This string includes a link to the bot's information page, allowing website administrators to verify its identity.
CCBot/2.0 is designed to minimize its impact on web servers. It employs an adaptive back-off algorithm that slows down requests if a server responds with HTTP 429 (Too Many Requests) or 5xx (Server Error) status codes. By default, the crawler waits a few seconds before sending the next request to the same site.
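The back-off behavior described above can be sketched as a simple retry loop. This is only an illustration of the general idea, assuming exponential doubling of the delay; CCBot's actual delays, retry limits, and algorithm are not published on this page:

```python
import time

def fetch_with_backoff(fetch, url, base_delay=2.0, max_retries=5):
    """Call fetch(url) -> HTTP status code, backing off on server pushback.

    Illustrative sketch: doubles the wait after each HTTP 429 or 5xx
    response, and gives up (returns None) after max_retries attempts.
    """
    delay = base_delay
    for _ in range(max_retries):
        status = fetch(url)
        if status == 429 or 500 <= status < 600:
            time.sleep(delay)   # server is overloaded or rate-limiting us
            delay *= 2          # slow down further on the next attempt
        else:
            return status       # success (or a non-retryable status)
    return None
```

Here `fetch` is a placeholder for whatever HTTP client the crawler uses; only the status-code handling matters for the back-off logic.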
To control the crawl rate, website administrators can specify a crawl delay in their robots.txt file. For example, to limit CCBot to one request every two seconds, the following lines can be added:
User-agent: CCBot
Crawl-delay: 2
This configuration instructs CCBot to wait at least two seconds between consecutive requests to the site.
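A robots.txt entry like the one above can be checked with Python's standard `urllib.robotparser`, which exposes the Crawl-delay value directly. A minimal sketch (a well-behaved crawler would fetch and consult the site's real robots.txt rather than these inline rules):

```python
from urllib.robotparser import RobotFileParser

# Parse the robots.txt rules shown above and read back the crawl delay.
rp = RobotFileParser()
rp.parse([
    "User-agent: CCBot",
    "Crawl-delay: 2",
])

delay = rp.crawl_delay("CCBot")
print(delay)  # → 2
```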
For more detailed information about CCBot/2.0, including its IP address ranges and compliance with the Robots Exclusion Protocol, please refer to the Common Crawl FAQ page.
- IP addresses:
18.97.9.168, 18.97.9.169, 18.97.9.170, 18.97.9.171, 18.97.9.172, 18.97.9.173, 18.97.9.174, 18.97.9.175, 18.97.14.80, 18.97.14.81, 18.97.14.82, 18.97.14.83, 18.97.14.84, 18.97.14.85, 18.97.14.86, 18.97.14.88, 18.97.14.89, 18.97.14.90, 18.97.14.91
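Because a user-agent string is trivially spoofable, administrators sometimes cross-check it against the published addresses. A minimal sketch using the list from this page (the list may change over time; Common Crawl's own documentation is authoritative):

```python
# IP addresses published for CCBot on this page; subject to change.
CCBOT_IPS = frozenset({
    "18.97.9.168", "18.97.9.169", "18.97.9.170", "18.97.9.171",
    "18.97.9.172", "18.97.9.173", "18.97.9.174", "18.97.9.175",
    "18.97.14.80", "18.97.14.81", "18.97.14.82", "18.97.14.83",
    "18.97.14.84", "18.97.14.85", "18.97.14.86", "18.97.14.88",
    "18.97.14.89", "18.97.14.90", "18.97.14.91",
})

def is_verified_ccbot(user_agent: str, remote_addr: str) -> bool:
    """True only if the request both claims to be CCBot and
    originates from a published Common Crawl address."""
    return user_agent.startswith("CCBot/") and remote_addr in CCBOT_IPS
```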
- Countries:
-
United States (US)