Google has released the latest installment of its educational video series How Search Works, explaining how its search engine discovers and accesses web pages through crawling.
Google Analyst Details Crawl Process
In a seven-minute episode hosted by Google analyst Gary Illyes, the company details the technical aspects of how Googlebot (the software Google uses to crawl the web) works.
Illyes outlines the steps Googlebot takes to find new and updated content from the internet’s trillions of web pages and make it searchable on Google.
Illyes explains:
“Most new URLs that Google discovers come from other known pages that Google has previously crawled.
Consider a news site with various category pages that link to individual news articles.
Google can find most published articles by visiting the category page from time to time and extracting the URLs that lead to the articles.”
How Googlebot crawls the web
Googlebot discovers new URLs by following links from web pages it already knows about, a process called URL discovery.
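To make the idea concrete, here is a minimal, illustrative sketch of URL discovery in Python. It is not Googlebot's actual implementation; the seed URL and the LinkExtractor class are hypothetical placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags on a fetched page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical seed: a category page the crawler already knows about.
seed_url = "https://example.com/news/"

with urlopen(seed_url) as response:
    html = response.read().decode("utf-8", errors="replace")

parser = LinkExtractor()
parser.feed(html)

# Resolve relative links against the seed so each one is a full URL
# that could be queued for crawling.
discovered = {urljoin(seed_url, link) for link in parser.links}
print(f"Discovered {len(discovered)} candidate URLs from {seed_url}")
```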
To avoid overloading sites, Googlebot crawls each one at a customized speed based on server response time and content quality.
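The video doesn't spell out how Google tunes crawl rate, but the general pacing idea can be sketched as follows: slow responses widen the delay between requests, fast responses let it shrink again. The politely_fetch function and its thresholds are purely illustrative assumptions, not Google's algorithm.

```python
import time
from urllib.request import urlopen

def politely_fetch(urls, min_delay=1.0, backoff_factor=2.0):
    """Fetch URLs one at a time, widening the delay when the server is slow.

    Illustrative pacing rule: if a response takes longer than the current
    delay, the next wait grows by `backoff_factor`; fast responses let the
    delay drift back toward `min_delay`.
    """
    delay = min_delay
    for url in urls:
        start = time.monotonic()
        with urlopen(url) as response:
            response.read()
        elapsed = time.monotonic() - start

        if elapsed > delay:
            delay = min(delay * backoff_factor, 60.0)       # server seems strained
        else:
            delay = max(delay / backoff_factor, min_delay)   # server keeps up

        time.sleep(delay)  # wait before the next request to avoid overload
```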
Googlebot uses a current version of the Chrome browser to render pages, execute JavaScript, and correctly display dynamic content loaded by scripts. It crawls only publicly available pages, not pages behind a login.
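Outside of Google, the same render-then-read pattern is commonly approximated with a headless browser. The sketch below assumes Selenium and Chrome are installed and uses a hypothetical JavaScript-heavy URL; Googlebot's own rendering pipeline is internal to Google and not shown here.

```python
from selenium import webdriver

# Run Chrome headlessly so the page's JavaScript executes before we read it.
options = webdriver.ChromeOptions()
options.add_argument("--headless=new")

driver = webdriver.Chrome(options=options)
try:
    # Hypothetical URL whose content is injected by a script at load time.
    driver.get("https://example.com/js-rendered-article")
    rendered_html = driver.page_source  # HTML after scripts have run
    print(len(rendered_html))
finally:
    driver.quit()
```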
Improving discovery and crawlability
Illyes emphasized the usefulness of sitemaps (XML files that list a site’s URLs) to help Google discover and crawl new content.
He advised developers to let their content management systems automatically generate sitemaps.
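For sites whose CMS can't generate a sitemap automatically, a bare-bones one can be produced with Python's standard library. This is a minimal sketch with placeholder URLs, not a full implementation of the sitemap protocol (it omits optional fields such as lastmod).

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls, path="sitemap.xml"):
    """Write a minimal XML sitemap listing the given URLs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    ET.ElementTree(urlset).write(path, encoding="utf-8", xml_declaration=True)

# Placeholder URLs; a real CMS plugin would pull these from its database.
build_sitemap([
    "https://example.com/news/article-1",
    "https://example.com/news/article-2",
])
```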
Optimizing technical SEO factors such as your site’s architecture, speed, and crawl directives will also improve its crawlability.
Here are some additional tactics to make your site more crawlable:
- Avoid crawl budget exhaustion – If your website is updated frequently, Googlebot’s crawl budget may be used up before new content is discovered. Careful CMS configuration and use of rel="next" / rel="prev" pagination tags can help.
- Implement proper internal linking – Linking to new content from category pages or hub pages allows Googlebot to discover new URLs. Effective internal link structure improves crawlability.
- Ensure pages load quickly – Sites that respond slowly to Googlebot requests may have their crawl rate throttled. Optimizing page performance allows Googlebot to crawl them faster.
- Eliminate soft 404 errors – Fixing soft 404s caused by CMS misconfigurations ensures that URLs are directed to valid pages and improves crawl success rates.
- Consider adjusting robots.txt – An overly strict robots.txt file can block useful pages. An SEO audit may uncover restrictions that can be safely removed; a quick way to check which URLs Googlebot is allowed to fetch is sketched after this list.
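As a quick check for the robots.txt point above, Python's standard library can replay Googlebot's view of a site's rules. The site and URLs below are hypothetical placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical site and URLs to audit against its robots.txt rules.
robots = RobotFileParser()
robots.set_url("https://example.com/robots.txt")
robots.read()

for url in [
    "https://example.com/news/article-1",
    "https://example.com/private/report.pdf",
]:
    allowed = robots.can_fetch("Googlebot", url)
    print(f"{url}: {'crawlable' if allowed else 'blocked by robots.txt'}")
```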
Latest educational video series
The latest video comes after Google last week launched an educational “How Search Works” series that sheds light on the search and indexing process.
The newly released episode on crawling provides insight into one of the most fundamental operations of any search engine.
In the coming months, Google will produce additional episodes exploring topics such as indexing, quality assessment, and search refinement.
The series is available on Google Search Central’s YouTube channel.
FAQ
How does Google describe the crawl process?
Google’s crawl process, outlined in a recent episode of its "How Search Works" series, includes the following key steps:
- Googlebot discovers new URLs by following links from known pages it has previously crawled.
- It strategically crawls your site at a customized speed to avoid overloading your servers, taking into account response time and content quality.
- The crawler also uses the latest version of Chrome to render pages, correctly display content loaded by JavaScript, and access only publicly available pages.
- Optimizing technical SEO elements and using a sitemap will help Google crawl your new content.
How can marketers ensure their content is effectively discovered and crawled by Googlebot?
Marketers can employ the following strategies to make their content easier for Googlebot to discover and crawl:
- Implement automatic sitemap generation within your content management system (a sketch for spot-checking the resulting sitemap follows this list).
- Focus on optimizing technical SEO factors like site architecture and loading speed, and use crawl directives appropriately.
- Configure your CMS efficiently and use pagination tags so that frequent content updates don’t deplete your crawl budget.
- Create an effective internal link structure to help Googlebot discover new URLs.
- Check and optimize your website’s robots.txt file to make sure it’s not overly restrictive for Googlebot.
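As a follow-up to the sitemap point above, here is a small audit sketch that fetches a sitemap and confirms each listed URL responds with HTTP 200, so crawl budget isn't wasted on broken entries. The sitemap location is an assumption; adjust it for a real site.

```python
import xml.etree.ElementTree as ET
from urllib.error import HTTPError
from urllib.request import Request, urlopen

SITEMAP = "https://example.com/sitemap.xml"  # hypothetical sitemap location
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urlopen(SITEMAP) as response:
    tree = ET.parse(response)

for loc in tree.findall(".//sm:loc", NS):
    url = loc.text.strip()
    try:
        status = urlopen(Request(url, method="HEAD")).status
    except HTTPError as err:
        status = err.code
    print(url, status)  # anything other than 200 deserves a closer look
```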