How to deindex low SEO value pages?
Written by Celina

It’s essential to audit a website periodically in order to separate the wheat from the chaff and show Google only good quality content, which means deindexing low quality content (following Panda criteria in particular). But what’s the best way to deindex a web page?

Why deindex?

  • The content is outdated

  • The content has no SEO value: shopping cart pages, internal search result pages, etc.

  • The content is copyrighted: images, etc.

  • Test or staging sites

  • Non-HTML files: PDF, Word, Excel, etc.

Theory: there are several tools to deindex / block pages

  1. Robots.txt file: prohibits crawling, not indexing (examples of each directive follow this list).
    Warning: if a page has already been indexed in the past, or if an external website links to it, robots.txt will not be enough to deindex it, because its only role is to prevent Googlebot from crawling the page once it is on the website.

  2. Robots meta tag: with a "noindex" value, it prohibits indexing, but not crawling, since the crawler still has to read the tag.

  3. Access by password: restrict access by requiring a password.

  4. X-Robots-Tag directive: prohibits indexing; mostly used to deindex PDFs and other files that have no HTML code, and therefore no usable meta tag.
    http://robots-txt.com/x-robots-tag/
    https://developers.google.com/search/reference/robots_meta_tag?hl=fr

  5. Search Console: prohibits indexing as an emergency procedure, for 90 days only (Google Index > URL to remove). We recommend that you don’t use this feature, as it is not a definitive solution.
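
To make these options concrete, here is roughly what each directive looks like. These are minimal sketches: the "/shoppingcart" path is only a placeholder, and the X-Robots-Tag example assumes an Apache server with mod_headers enabled.

    # robots.txt: blocks crawling only, it does not deindex on its own
    User-agent: *
    Disallow: /shoppingcart

    <!-- Robots meta tag, placed in the <head> of each page to deindex -->
    <meta name="robots" content="noindex">

    # .htaccess (Apache): X-Robots-Tag for non-HTML files such as PDFs
    <Files ~ "\.pdf$">
      Header set X-Robots-Tag "noindex"
    </Files>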

Practice: if the content is indexed and you want to deindex it

Let’s take the example of a website that wants to deindex its "/shoppingcart" pages, which Google has already indexed. Here is the step-by-step process:

  1. Check that the robots.txt file is not blocking access to the pages concerned. If it is, remove those rules from the robots.txt.

  2. Put a Robots meta tag set to "noindex" on all the "/shoppingcart" pages in question, or use any other deindexing method.

  3. Force Google to crawl all those pages via a dedicated XML sitemap submitted in the Search Console (a little tip to go faster than just waiting for the next crawl); a sketch of such a sitemap follows these steps. If these pages were already in a sitemap, you can leave them there, but you will have to find them and remove them one by one at the end of step 4 below.

  4. Once all these pages are deindexed (check directly in the Search Console), block them in the robots.txt and remove the dedicated sitemap. Keep the "noindex" Robots meta tag to avoid unintentional reindexing if an external website links to these pages.
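
As a rough illustration of steps 3 and 4, here is what the dedicated sitemap and the final robots.txt could look like; the domain and the exact URLs are placeholders for this example.

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>https://www.example.com/shoppingcart?id=123</loc>
      </url>
      <url>
        <loc>https://www.example.com/shoppingcart?id=456</loc>
      </url>
    </urlset>

    # robots.txt once deindexing is confirmed (step 4)
    User-agent: *
    Disallow: /shoppingcart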

Here we are, now you’re a deindexing pro! 😊
