Google Crawl Rate

Monday, March 9th, 2009

I recently read a forum post discussing Google crawl rate and whether it is affected by PageRank. This seems to be a common question, so I thought I’d write a short answer.

Crawl rate is affected by several factors, one of which is the number and quality of the links pointing to a site. Search engine robots discover pages by following links, so the more links a site has from external sources, the more frequently a spider will visit it. Links also affect PageRank, which means that there is a positive correlation between PageRank and crawl frequency.

Although sites with a higher PageRank (PR) tend to get crawled more often, the biggest factor affecting crawl rate is how frequently content is updated. If a site’s content has changed since the spider’s last visit, the spider will generally leave a shorter gap before its next visit. Spiders soon “learn” how often a page changes and adjust the interval between their visits accordingly.
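To make the idea of a spider “learning” a page’s update frequency more concrete, here is a minimal sketch of an adaptive revisit interval. It is purely illustrative and is not Google’s actual algorithm: the function name `next_interval`, the halving and 1.5x factors, and the minimum and maximum bounds are all assumptions chosen for the example.

```python
def next_interval(current_hours, changed, min_hours=1.0, max_hours=720.0):
    """Return the next wait (in hours) before revisiting a page.

    Hypothetical policy: shrink the gap when the page changed since the
    last visit, grow it when nothing changed. All numbers are assumed.
    """
    if changed:
        proposed = current_hours / 2    # content changed: come back sooner
    else:
        proposed = current_hours * 1.5  # nothing new: wait longer
    # Keep the interval within the assumed lower and upper bounds.
    return max(min_hours, min(max_hours, proposed))


# Simulate three visits: the page changed twice, then stayed the same.
interval = 24.0
for changed in [True, True, False]:
    interval = next_interval(interval, changed)
print(interval)
```

Under this toy policy, a page that changes on every visit is soon revisited at the minimum interval, while a page that never changes drifts toward the maximum.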

The factors that affect crawl rate make perfect sense when you consider a search engine’s purpose. Search engines aim to provide the most relevant and up-to-date content. If a site changes every day, the search engines must visit every day to avoid discrepancies between the indexed page and the actual page. If a site has a high PageRank, it is seen as more “popular” and probably has more visitors than a similar low-PR site, which makes providing up-to-date information about it more important for the search engines.