Crawl budget refers to the maximum number of pages and URLs on a site that a search engine bot (most often Googlebot) is both able and willing to crawl in order to keep its ranking data fresh. How a crawl budget is calculated can be complex, but it is governed primarily by two factors: crawl demand and the crawl rate limit.
Understanding these two factors is the easiest way to get a feel for how crawl budget works in practice.
Crawl demand is quite straightforward. The more popular or high-traffic a website is, the greater the necessity (and demand) for it to be crawled regularly. Googlebot, in particular, wants to ensure it maintains up-to-date records for domains that are frequently visited.
For web pages that are less popular, crawl demand centers on ensuring their freshness. Google aims to prevent outdated or stale content from dominating the top spots in the SERPs.
The crawl rate limit dictates the maximum number of simultaneous connections Googlebot can use to crawl a website, along with the pause required between fetches. Site owners can configure a cap on this rate, but the server's own capacity also influences the figure.
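To make those two levers concrete, here is a rough back-of-the-envelope sketch in Python. Every figure in it (connection count, pause time, response time) is an assumption chosen purely for illustration, not Googlebot's actual settings:

```python
# Illustrative only: a rough upper bound on URLs crawled per day,
# given a hypothetical connection cap and pause between fetches.
# All values below are assumptions for the sketch, not real Googlebot settings.

parallel_connections = 5      # max simultaneous connections (assumed)
pause_seconds = 2.0           # wait time between fetches per connection (assumed)
avg_fetch_seconds = 0.5       # average server response time (assumed)

seconds_per_day = 24 * 60 * 60
fetches_per_connection = seconds_per_day / (avg_fetch_seconds + pause_seconds)
max_urls_per_day = parallel_connections * fetches_per_connection

print(f"Upper bound: ~{max_urls_per_day:,.0f} URLs/day")
# With these assumed numbers: 5 * 86,400 / 2.5 ≈ 172,800 URLs/day
```

The point of the arithmetic is simply that a longer pause or fewer parallel connections shrinks the ceiling on how many URLs can be fetched in a day, whatever the real numbers happen to be.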
A web server with fast response times is given a higher crawl limit. Conversely, Googlebot automatically slows its crawling to avoid hurting the performance of a weaker server.
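If you want a sense of how this plays out on your own site, one option is to look at how often Googlebot hits your server and how quickly the server answers. The sketch below is a hypothetical example that assumes an Nginx-style access log with the request time appended as the last field; the file path, log format, and user-agent check are all assumptions you would need to adapt to your own setup:

```python
# Sketch: estimate how often Googlebot requests pages and how quickly the
# server responds, based on an assumed Nginx-style access log.
import re
from datetime import datetime

LOG_PATH = "/var/log/nginx/access.log"  # hypothetical path, adjust as needed
# Assumed line shape: ... [time] ... "user-agent" request_time
LINE_RE = re.compile(r'\[(?P<time>[^\]]+)\].*"(?P<ua>[^"]*)"\s+(?P<rt>[\d.]+)$')

timestamps, response_times = [], []
with open(LOG_PATH, encoding="utf-8", errors="replace") as f:
    for line in f:
        m = LINE_RE.search(line)
        if not m or "Googlebot" not in m.group("ua"):
            continue
        timestamps.append(datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"))
        response_times.append(float(m.group("rt")))

if timestamps:
    span_hours = (max(timestamps) - min(timestamps)).total_seconds() / 3600 or 1
    print(f"Googlebot requests: {len(timestamps)}")
    print(f"Average rate: {len(timestamps) / span_hours:.1f} requests/hour")
    print(f"Average response time: {sum(response_times) / len(response_times):.3f}s")
```

A rising average response time alongside a falling request rate is the kind of pattern that would suggest the crawler is backing off to protect the server.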