Log analysis is now an essential tool for any SEO agency to optimize a website, especially sites offering Google several thousand pages to crawl.

This is an effective exercise and should be carried out at least once a year to confirm a website’s SEO health. It will improve how your site is crawled and the rankings of your strategic keywords, and thus increase your traffic and conversions. It is also often the only way to unblock delicate SEO situations and facilitate decision-making.

Here we will focus on the art of segmentation in log analysis. This step is essential and will have an impact on your entire study.


That’s it, you’re starting to analyze logs: bravo! The log files have been successfully retrieved and are usable, and you have chosen your preferred analysis software. The first step, and the most important one because it will ultimately shape your entire analysis, is segmenting your logs by major category.

This segmentation will allow you to compare different types of pages against SEO criteria and to analyze abnormal crawl behavior.

An analysis of well-segmented logs will allow you to easily answer these types of questions:

  • Are my Products Pages crawled more often than my Categories Pages?
  • What is my crawl loss percentage?
  • What is my crawl window? (how long does it take Google to visit all the strategic pages of my site at least once)
  • Which category of my catalog is most often visited by Google?
  • Are my http pages still crawled?
  • How do I optimize the performance of my .htaccess rules for old redirects?
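Most of these questions boil down to counting Googlebot hits per segment. Here is a minimal sketch in Python, assuming logs in the standard Apache/Nginx combined format; the segment prefixes (`/product/`, `/category/`, `/blog/`) are hypothetical and should be adapted to your own URL structure:

```python
import re
from collections import Counter

# Combined log format: host ident user [date] "METHOD path HTTP/x" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<date>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Hypothetical segment rules -- adapt the prefixes to your own tree structure.
SEGMENTS = [
    ("product", re.compile(r"^/product/")),
    ("category", re.compile(r"^/category/")),
    ("editorial", re.compile(r"^/blog/")),
]

def segment_of(path):
    for name, pattern in SEGMENTS:
        if pattern.match(path):
            return name
    return "other"

def googlebot_hits_by_segment(lines):
    """Count Googlebot hits per segment from raw log lines."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m and "Googlebot" in m.group("ua"):
            counts[segment_of(m.group("path"))] += 1
    return counts

sample = [
    '66.249.66.1 - - [10/May/2024:12:00:00 +0000] "GET /product/red-shoes HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/May/2024:12:00:05 +0000] "GET /category/shoes HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/May/2024:12:00:07 +0000] "GET /product/red-shoes HTTP/1.1" 200 512 "-" "Mozilla/5.0"',  # regular visitor, ignored
]
print(googlebot_hits_by_segment(sample))  # → Counter({'product': 1, 'category': 1})
```

Real log files are large, so in practice you would stream the file line by line rather than build a list in memory.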

Segmentation allows you to categorize certain pages of your tree structure relative to others and to compare them. Pages are not optimized the same way depending on their nature, their role, and the type of keyword they target (long tail vs. generic).

Thus, your segmentation must reflect your SEO strategy and allow you to obtain reliable data to guide your future optimizations.

Tools like OnCrawl make it easy to configure these segmentations thanks to sorting parameters on URLs, but also on-page criteria, such as a minimum amount of content, internal links, or the presence or absence of semantic markup.

All this information can then be used to compare the impact of SEO optimizations on a site’s crawl.

Top 5 of our favorite segmentations

Here are our top 5 segmentations that should be in any log analysis:


One of the first segmentations to implement. It will allow you to quickly improve how your crawl budget is spent, or even increase it: detect pages that are non-strategic for your SEO but are still being crawled.

By carrying out this segmentation, you can see how many hits are “wasted” on the site and could instead benefit other, more strategic pages.
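The “wasted” share can then be expressed as a crawl-loss percentage. A small sketch, with invented segment names and hit counts:

```python
def crawl_waste_pct(counts, strategic=("product", "category", "editorial")):
    """Share of Googlebot hits spent outside the strategic segments."""
    total = sum(counts.values())
    wasted = sum(n for seg, n in counts.items() if seg not in strategic)
    return 100 * wasted / total if total else 0.0

# Hypothetical per-segment hit counts taken from a log analysis.
hits = {"product": 600, "category": 250, "editorial": 100, "filters": 40, "tracking": 10}
print(round(crawl_waste_pct(hits), 1))  # → 5.0
```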

Examples of detected pages:

Most often, this technique detects resource pages or pages with duplicate content (filters, tracking parameters, etc.). A keyword analysis and a very good knowledge of the site are necessary to do this sorting.


Our recommendations:

  • Confirm that the crawled page really is non-strategic for your SEO.
  • Implement a canonical tag on filtered pages. Be careful: Googlebot is very fond of canonical tags and can still devote a significant part of your crawl budget to the canonicalized URLs, so use them only if you have no other choice.
  • Set up a meta robots noindex, nofollow tag.
  • Remove the page from the sitemap.
  • Reduce the number of internal links pointing to the page.
  • Once the page is no longer indexed, block it with robots.txt.
  • Monitor hits on these newly blocked pages.
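Monitoring hits on newly blocked pages can be sketched as a daily count, assuming you have already extracted (date, path) pairs for Googlebot requests; the blocked prefixes below are hypothetical:

```python
from collections import defaultdict

# Hypothetical path prefixes that were just blocked in robots.txt.
BLOCKED_PREFIXES = ("/filter/", "/search")

def blocked_hits_per_day(hits):
    """hits: iterable of (iso_date, path) pairs for Googlebot requests."""
    per_day = defaultdict(int)
    for day, path in hits:
        if path.startswith(BLOCKED_PREFIXES):
            per_day[day] += 1
    return dict(per_day)

hits = [
    ("2024-05-01", "/filter/size-42"),
    ("2024-05-01", "/product/red-shoes"),
    ("2024-05-02", "/filter/size-42"),
    ("2024-05-03", "/product/red-shoes"),  # no blocked hit: the directive is respected
]
print(blocked_hits_per_day(hits))  # → {'2024-05-01': 1, '2024-05-02': 1}
```

If the daily counts do not trend toward zero, Googlebot may still be discovering the blocked URLs through internal links or the sitemap.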


This second segmentation is, of course, linked to the site’s migration to https. It is sometimes amazing how Google can continue to crawl old http versions despite a migration that took place years ago. Even if 301 redirects are in place, it is always better to have a site that is mostly crawled on its final version.

It is also, of course, a way to check whether Google has taken the redirects and the new version of the site into account after a migration.


If a large share of your crawl budget is spent on your old http version, we advise you to:

  • Check for 301 redirects (in place? accessible for Googlebots?)
  • Check your sitemap
  • Check your internal linking
  • Request updates to your external netlinking (at least from the most powerful referring sites)
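To measure how much of the crawl still lands on http, you need logs that record the scheme: the default combined format does not, so it is usually derived from the vhost, from separate log files, or from a custom log field such as nginx’s $scheme. Assuming you can reconstruct full URLs, the https share is then a simple ratio; the URLs below are invented:

```python
def https_crawl_share(urls):
    """Percentage of crawled URLs served over https (requires logs recording the scheme)."""
    if not urls:
        return 0.0
    https = sum(1 for u in urls if u.startswith("https://"))
    return 100 * https / len(urls)

# Invented example: two https hits, one lingering http hit.
crawled = [
    "https://example.com/product/red-shoes",
    "https://example.com/category/shoes",
    "http://example.com/old-page",  # still crawled on http: check the 301
]
print(round(https_crawl_share(crawled), 1))  # → 66.7
```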


The purpose of this segmentation is to detect whether your editorial pages or your e-commerce pages are regularly hit by search engines. An editorial page’s vocation is to offer more content than average, with a wide variety of expressions, and we know that Google is very fond of this. An e-commerce page is very often the page you want to see indexed, because it is the one that allows the user to convert without multiplying clicks. These are, most of the time, the pages that generate the most business.

This analysis will therefore allow you to check the impact of your content on the crawl, in terms of volume, and the impact of your internal linking.


If your e-commerce pages, the most strategic ones, are crawled much less than your editorial pages, we advise you to:

  • Identify the word count above which Google increases its crawl
  • Create content on your product pages
  • Check that there are no orphan pages (pages with no internal links pointing to them)
  • Intelligently link to certain product pages from your editorial pages
  • Create a dedicated sitemap
  • Build netlinking to your e-commerce pages

These recommendations also apply, of course, to editorial pages, if you wish to rank on non-transactional expressions. In that case, we recommend internal linking between your editorial pages and particular vigilance on their content.
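Raw hit counts can mislead in this comparison, because a catalogue of thousands of product pages will almost always receive more total hits than a small blog. A fairer measure is hits per URL in each segment; the inventory figures below are invented:

```python
def hits_per_url(hit_counts, page_counts):
    """Normalize segment hits by segment size, so a large catalogue
    doesn't look better crawled on volume alone."""
    return {seg: hit_counts.get(seg, 0) / n for seg, n in page_counts.items() if n}

# Invented figures: total Googlebot hits and page inventory per segment.
hit_counts = {"product": 600, "editorial": 300}
page_counts = {"product": 3000, "editorial": 100}
print(hits_per_url(hit_counts, page_counts))  # → {'product': 0.2, 'editorial': 3.0}
```

Here the editorial segment, though smaller, is crawled far more intensively per page than the catalogue.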


To achieve this segmentation, it will be essential to know the flagship products of the site, those which convert the best.

The objective here is to compare these flagship products to the new products or new categories you want to see convert. We will therefore look at:

  • The crawl level of top sellers vs. new products
  • The level of internal links to top-selling products
  • The content offered on top-seller pages
  • The crawl frequency: how often Google visits your new products vs. your top sellers. In principle, this interval should be as short as possible so that Google quickly takes into account the optimizations you make on your new pages.
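Crawl frequency can be estimated from the gaps between successive Googlebot visits to the same URL. A sketch, with invented visit dates:

```python
from datetime import date
from statistics import median

def revisit_gaps(visits):
    """Days between successive Googlebot visits to one URL."""
    visits = sorted(visits)
    return [(b - a).days for a, b in zip(visits, visits[1:])]

def median_revisit(by_url):
    """Median revisit interval across all URLs of a segment."""
    gaps = [g for v in by_url.values() for g in revisit_gaps(v)]
    return median(gaps) if gaps else None

# Invented visit dates for one top seller and one new product.
top_sellers = {"/product/bestseller": [date(2024, 5, 1), date(2024, 5, 3), date(2024, 5, 5)]}
new_products = {"/product/new-item": [date(2024, 5, 1), date(2024, 5, 11)]}
print(median_revisit(top_sellers), median_revisit(new_products))  # → 2.0 10
```

A much longer interval on new products than on top sellers is the signal this segmentation is designed to surface.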


Our recommendations:

  • Take inspiration from the nature of your top sellers’ pages: title, meta description, other metadata, photos, CTAs, content
  • Develop internal links to your new products (promotion on the homepage, universe and category pages, cross-selling)
  • Check that new products are in the sitemap; if necessary, give them a dedicated sitemap


This segmentation is of course aimed at international sites. If you want to improve your rankings in several languages and check that a language is not lagging behind for Googlebot, this is the segmentation you need!

Here, we will therefore separate the site’s different languages, but also the different types of pages offered. This way, we will be able to compare the crawl rate of the Italian category pages vs. the French category pages.
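Assuming the language is encoded as a URL prefix (for example /fr/ or /it/, a hypothetical scheme to adapt to your own site), crossing language and page type is a simple count:

```python
import re
from collections import Counter

# Hypothetical URL scheme: /<lang>/<page type>/...
LANG_RE = re.compile(r"^/(?P<lang>fr|it|en)/(?P<type>category|product)/")

def crawl_by_lang_and_type(paths):
    """Count crawled paths crossed by language and page type."""
    counts = Counter()
    for p in paths:
        m = LANG_RE.match(p)
        if m:
            counts[(m["lang"], m["type"])] += 1
    return counts

# Invented crawled paths.
paths = ["/fr/category/chaussures", "/fr/category/sacs", "/it/category/scarpe", "/fr/product/sac-cuir"]
c = crawl_by_lang_and_type(paths)
print(c[("fr", "category")], c[("it", "category")])  # → 2 1
```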


Our recommendations:

  • Validate the international strategy: which languages are the most strategic? This should be reflected in the Google crawl
  • Check that international SEO is properly implemented on all pages of the site (lang attribute, hreflang, adapted content, etc.)
  • Check the sitemaps (at least one sitemap per language!)
  • Develop netlinking accordingly

We hope we have made you want to carry out a log analysis on your site, and clarified your vision of the possible segmentations. As always with log analysis, be careful to take the life of the website into account before selecting your analysis periods!

