Pages of a website that have been included in a search engine's database, and thus in its index, are considered indexed. Indexed pages are therefore a prerequisite for a website being found via search engine result lists.
Before web pages can be included in a search engine's index, they must first be crawled. The bots and crawlers that perform this task move from link to link and request a large number of pages, which are then evaluated for relevance and added to the index. The rule is: only indexed pages can appear and rank on the search result pages, and only pages that rank bring the desired SEO traffic to a domain.
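The link-following behaviour of a crawler can be illustrated with a minimal sketch: a breadth-first traversal over a hypothetical in-memory link graph (the page names and links below are invented for illustration, not a real crawler implementation):

```python
from collections import deque

def discover_pages(link_graph, start):
    """Breadth-first traversal: follow links from page to page,
    collecting every page reachable from the start page."""
    discovered = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for linked_page in link_graph.get(page, []):
            if linked_page not in discovered:
                discovered.add(linked_page)
                queue.append(linked_page)
    return discovered

# Hypothetical site structure: "/orphan" has no incoming links.
site = {
    "/": ["/products", "/about"],
    "/products": ["/products/a", "/products/b"],
    "/about": [],
    "/orphan": [],
}

print(discover_pages(site, "/"))
```

The traversal never reaches "/orphan": a page without incoming links cannot be discovered by a link-following crawler, and therefore cannot be indexed or rank.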
Tips for indexing websites
If you want your website to be included in the Google index, you need to structure your domain so clearly that the Google crawler can quickly find its way around. Ideally, the crawler reliably records and indexes all relevant sub-pages, making them discoverable via the search engine. If, on the other hand, the Google bot cannot index a page, no link to that page is shown in the Google SERPs.
The following tips will help the Google bot find its way around your website better and faster:
- Use a well thought-out website architecture and ensure that there are good internal links.
- Create a continuously updated XML sitemap and submit it to Google Webmaster Tools.
- “Tidy up” regularly by checking the error pages reported in the Webmaster Tools and, if necessary, correcting the cause.
- Delete orphan pages or relink them.
- Carry out continuous monitoring in order to identify and correct problems with indexing at an early stage.
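The monitoring step can be sketched as a simple check, assuming you have already collected the HTTP status code of each URL (the URLs and codes here are invented):

```python
def find_error_pages(status_by_url):
    """Return URLs whose status code signals a problem for the crawler:
    client errors (4xx) and server errors (5xx)."""
    return sorted(url for url, status in status_by_url.items() if status >= 400)

# Hypothetical crawl results:
statuses = {
    "https://www.example.com/": 200,
    "https://www.example.com/old-page": 404,
    "https://www.example.com/broken": 500,
}
print(find_error_pages(statuses))
```

Running such a check regularly surfaces new error pages early, before they accumulate in the Webmaster Tools error report.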
In addition, the naming of your URLs plays an important role. Ideally, the names never change but remain permanent. If, for example, changes to the menu structure make an adjustment unavoidable, technical precautions must be taken to avoid duplicates or error pages.
Pitfalls that risk a complete loss of rankings if overlooked occur particularly frequently in the context of a relaunch. In such a case, carefully weigh the benefit against the effort before converting the URLs of an entire website at a later date.
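When a URL does have to change, a permanent (301) redirect from the old address to the new one avoids error pages and tells Google that the page has moved for good. On an Apache server this can be configured in the .htaccess file, for example (the paths are placeholders):

```apacheconf
# Permanently redirect the old URL to its new location (301)
Redirect 301 /old-menu-item/ https://www.example.com/new-menu-item/
```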
View the indexing status of a website
Google offers website operators the opportunity to see how many sub-pages of a domain the search engine has indexed. Two options are available: a site: query in the search engine itself, or checking the indexing status in Google Webmaster Tools, the more reliable variant.
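A site: query is entered directly into the Google search box and restricts the results to a single domain or path (example.com stands in for your own domain):

```
site:example.com
site:example.com/category/
```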
View the indexing status in the Google Webmaster Tools
A site: query primarily serves as a rough guide. For larger domains in particular, it is advisable to query the indexing status via Google Webmaster Tools. You do not see the individual web pages there, but you do see the total number of indexed pages. And not only that: with the help of Webmaster Tools you can see how the number of indexed pages of your domain has developed over the past twelve months.
Make sure that the number of URLs indexed by Google matches your expectations. If the expected value differs significantly from the actual value, a structural problem may be the cause. A sudden, significant increase or decrease in the number of indexed pages can also indicate problems that need to be resolved.
Expected and actual number of indexed web pages
The indexing status is of little value until website operators know how many URLs should actually be indexed. Only then can the expected total be compared against the number of pages actually indexed.
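The comparison itself can be sketched in a few lines; note that the 10% tolerance below is an arbitrary assumption for illustration, not a Google guideline:

```python
def check_indexing_status(expected, actual, tolerance=0.10):
    """Compare the expected number of indexable URLs with the number
    Google actually reports, allowing a relative tolerance."""
    if expected == 0:
        return "no expectation defined"
    deviation = (actual - expected) / expected
    if deviation > tolerance:
        return "more URLs indexed than expected -- check for duplicates"
    if deviation < -tolerance:
        return "fewer URLs indexed than expected -- check for blocked pages"
    return "indexing status as expected"

print(check_indexing_status(expected=500, actual=820))
```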
If the expected value deviates significantly from the actual value, there can be various causes; typical examples are listed below.
More URLs indexed than expected:
- Duplicates due to different spellings of the URL (e.g. inconsistent use of upper and lower case)
- URLs that no longer exist are returned as valid (200 status code) instead of being marked with a 404 or 410 status code
- Use of session IDs in URLs
- URLs with superfluous parameters that were not excluded from indexing
- Outdated or faulty sitemap
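The first point, duplicates that differ only in spelling, can be found by normalizing URLs before comparing them. A minimal sketch (the URLs are invented):

```python
from collections import defaultdict

def find_case_duplicates(urls):
    """Group URLs that differ only in upper/lower case."""
    groups = defaultdict(list)
    for url in urls:
        groups[url.lower()].append(url)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}

urls = [
    "https://www.example.com/Products/Shoes",
    "https://www.example.com/products/shoes",
    "https://www.example.com/contact",
]
print(find_case_duplicates(urls))
```

Each group of variants should be reduced to one canonical spelling, with the others redirected or marked with a canonical tag.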
Fewer URLs indexed than expected:
- Domain was recently put online
- Missing (internal) links to some sub-pages, so-called orphaned pages
- Individual pages are unintentionally set to noindex and are excluded from indexing
- Folders and files unintentionally blocked by robots.txt
- Incorrect use of canonical tags
- Duplicate content or content without added value for the visitor
- The number of sub-pages is too large and has not been fully crawled by the Google bot
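The robots.txt point can be checked programmatically: Python's standard library includes a robots.txt parser. A sketch with an invented rule set:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration
robots_txt = """\
User-agent: *
Disallow: /checkout/
Disallow: /internal-search/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

print(parser.can_fetch("Googlebot", "https://www.example.com/checkout/cart"))   # blocked
print(parser.can_fetch("Googlebot", "https://www.example.com/products/shoes"))  # crawlable
```

Running every URL from the sitemap through such a check quickly reveals pages that are unintentionally blocked from crawling.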
Why is indexing control important?
The following principle applies to the structural design of a website: use as many URLs as necessary, and take care not to “bloat” the website unnecessarily. This rule becomes even more important when it comes to indexing. From an SEO point of view, by no means every URL is relevant enough to need to be findable via a search engine like Google. Instead, the goal should be that a user coming from web search lands directly on the page that is most relevant to them.
These non-relevant pages primarily include those that are not optimized for a specific keyword:
- Pages not relevant to the topic (applies to all websites)
- Category pages with pagination (applies to online shops)
- Checkout (applies to online shops)
- Dynamic pages (e.g. internal search)
All pages without topical relevance can in principle be set to noindex. Examples include the imprint, terms and conditions, right of withdrawal, privacy policy, etc.
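Setting a page to noindex is done with a robots meta tag in the page's `<head>`; the `follow` value keeps the crawler following the links on the page even though the page itself is excluded from the index:

```html
<meta name="robots" content="noindex, follow">
```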
Shops also contain pagination pages. Pagination (the paging function) in an online shop produces several almost identical category pages (page 1, page 2 …). To prevent these pages from competing with each other in the rankings, they should be integrated in an SEO-compliant way that avoids duplicate content.
Thin content pages that provide little helpful information serve no purpose and should be deleted or upgraded.
Most importantly, website operators should regularly monitor how the indexing status develops. That way, action can be taken quickly if Google crawls large amounts of new website content that offers users no new information; in addition, attacks from the black-hat SEO area can be detected and neutralized more quickly.
In the encyclopedia article Index / Indexing you will find out how to have a website indexed, how to prevent indexing, and how pages that have been removed from the Google index can be returned to it.