Google Search Console, formerly known as Google Webmaster Tools, has improved significantly in recent years. Today Google offers a number of useful features that make the service indispensable for SEOs. Despite certain imperfections, Google keeps publishing new documentation to help Console users find and fix errors.
Fixing crawl errors is a crucial part of Internet marketing; in Portent's vivid scheme of the discipline, it falls under the "Infrastructure" category.
Crawl errors are not an issue to be neglected, but routine maintenance and regular checkups should keep them from getting out of control.
Crawl Errors Configuration
One of the more recently implemented features of the Search Console is the division of crawl errors into Site Errors and URL Errors, with the details of each error shown in its category. This way you can tell whether an error occurred at the site level or concerns only a certain page.
The Crawl Errors section can be easily accessed via the main dashboard. Tools such as Crawl Errors, Search Analytics, and Sitemaps are presented here to help you manage your website as efficiently as possible.
This section allows you to monitor your website's crawl errors; checking it once a day is a sound preventive measure for your website.
Because site errors affect the whole website, they should be handled with particular care.
The errors that have occurred within the last 90 days are displayed in the Crawl Errors dashboard as follows:
If you haven’t had any errors within this period of time, you will see the following notification:
How often should site errors be checked?
Checking your website for crawl errors daily may seem a tedious task, but it's far better than letting an important error escape your attention.
You should check for errors at least once every three months, but daily checkups are the safest option.
DNS Errors
The importance of Domain Name System (DNS) errors can't be overstated, because some of them can seriously affect your site.
When a DNS error occurs, Googlebot cannot connect to your domain because of a DNS timeout or lookup failure.
How important are they?
According to Google, many DNS errors still allow it to connect to your website; severe issues, however, should be handled as soon as possible.
DNS errors are issues of great importance. Those that make it impossible for Google to connect to your site require prompt and resolute action.
How to fix DNS errors?
1. To start with, use the Fetch as Google tool (located in the Search Console) to see how Googlebot reaches your site.
The "Fetch and Render" function lets you compare how Google and a user see your website, while the "Fetch" option is a quicker way to check the status of your DNS connection and locate any errors.
2. If fetching doesn't work, take the issue up with your DNS provider.
3. Make sure your server returns proper HTTP response codes such as 404 or 500 rather than failing at the DNS level. A plain 404 or 500 error is far easier to diagnose and fix than a DNS issue.
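To illustrate the difference, here is a minimal Python sketch (the hostname is a placeholder) that separates a DNS lookup failure from an ordinary HTTP error response:

```python
import socket
import urllib.error
import urllib.request

def check_site(url, host):
    """Distinguish a DNS failure from a plain HTTP error response."""
    try:
        socket.gethostbyname(host)              # the DNS lookup step
    except socket.gaierror:
        return "DNS error"                      # lookup failed: the crawler can't even connect
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return f"HTTP {resp.status}"        # e.g. "HTTP 200"
    except urllib.error.HTTPError as e:
        return f"HTTP {e.code}"                 # a clean 404/500 is easier to fix than DNS

# ".invalid" is a reserved TLD that never resolves, so this reports a DNS error:
print(check_site("http://site.invalid/", "site.invalid"))
```

A "DNS error" result here corresponds to the case Googlebot reports, while an "HTTP 404" or "HTTP 500" result points to an ordinary server-side problem.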
Server Errors
Unlike DNS errors, server errors mean that Googlebot is able to connect to your website but the page fails to load. The most common server issue is a timeout: if Googlebot doesn't get a response within a certain period of time, it gives up trying.
Such errors occur when the server fails to cope with increased traffic on your website, so make sure your hosting provider is ready to handle sudden traffic spikes.
How important are they?
Server errors, along with DNS errors, are issues of top urgency, as they also affect the whole website. Even if the DNS connection works, server errors may still prevent your site from functioning.
How to fix server errors?
If you notice some errors even though your site works fine, they may be past errors that have already been resolved. Still, it's a good idea to make sure they won't recur.
According to Google's instructions on dealing with server errors, you should check your site with Fetch as Google: if it returns your page content properly, the problem doesn't concern Google's access to your website.
The first step in fixing server issues is to define what type of error you have encountered: “Timeout”, “Connect failed”, “No response” etc.
For further information on fixing each type, refer to Google Search Console Help.
Robots.txt file Errors
Robots.txt file errors occur when the Googlebot fails to receive your robots.txt file. This file is optional – it’s required only to prevent Google from crawling the pages you don’t want it to crawl.
According to Search Console Help, if your robots.txt file is missing, a 404 error will be returned and Googlebot will simply continue crawling the website.
How important are they?
For small websites that are not updated very often, such errors are not particularly urgent.
However, if you upload new content every day, the issue should be resolved as soon as possible: if Googlebot fails to load the file, it won't index the changes on your website.
How to fix robots.txt errors?
First of all, make sure the file is configured properly and check which pages are restricted from crawling. A "Disallow: /" line may well be the answer to the question "Why isn't my website showing up in Google search?"
If you have double-checked the file but the error remains, use a header checker tool to see whether the file returns a 200 or a 404 response.
The robots.txt file itself is optional, but errors with it can be serious. If you decide to keep the file, check it regularly.
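As a quick way to audit the rules, Python's standard-library robots.txt parser shows how a given Disallow line affects crawling (the rules and URLs below are hypothetical):

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
# Hypothetical rules; for a live site, use rp.set_url("https://example.com/robots.txt")
# followed by rp.read() instead of parse().
rp.parse([
    "User-agent: *",
    "Disallow: /private/",   # only this directory is hidden from crawlers
])
print(rp.can_fetch("Googlebot", "https://example.com/"))           # True
print(rp.can_fetch("Googlebot", "https://example.com/private/x"))  # False
```

Had the rule been "Disallow: /", both checks would return False — the whole-site block mentioned above.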
URL Errors
URL errors, unlike site errors, concern not the whole site but only certain pages.
In the Search Console you can find the main URL errors sorted by category.
Does a huge number of errors seem overwhelming? URL errors are easier to cope with once you know that the most important ones are displayed first. Keep in mind, too, that some already-resolved errors may still be visible.
Once you have worked out how to fix your URL errors and resolved them, or if you haven't observed them for a long time, you can simply mark them all as fixed and check them again later.
Wait a couple of days for Googlebot to recrawl your website, and you will see which errors have been fixed and which still require your attention.
Soft 404 Errors
Soft 404 errors occur when a missing page is reported as "found" (200).
However, a 404 page message doesn't always indicate a 404 error. What users see is the 404 page content: an error message notifying them that the requested page is unavailable, often accompanied by useful links or a humorous note.
The other aspect of a 404 response is how Google views it: the HTTP response returned should be 404 or 410 ("not found" or "gone", respectively). Here is how the request-response scheme works:
A page is identified as a soft 404 when the HTTP response header doesn't return a "not found" code. According to Google, a 404 or 410 code always means that the requested page doesn't exist.
A soft 404 error also occurs when a page 301-redirects to an unrelated page, such as the homepage.
In fact, a multitude of pages redirecting to the homepage can be treated by Google as soft 404s rather than 301 redirects.
If a redirect leads to a closely related URL, a soft 404 most probably won't show up.
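The distinction above can be sketched in a few lines of Python. This is a simplification for illustration, not Google's actual algorithm:

```python
def classify_response(status, looks_like_error_page, redirect_related=True):
    """Rough sketch of the soft-404 distinction described above."""
    if status in (404, 410):
        return "hard 404/410"       # the page really is gone, as Google expects
    if status == 200 and looks_like_error_page:
        return "soft 404"           # 'not found' content behind a 'found' header
    if status == 301 and not redirect_related:
        return "possible soft 404"  # redirect to an unrelated page, e.g. the homepage
    return "ok"

print(classify_response(200, looks_like_error_page=True))   # soft 404
print(classify_response(404, looks_like_error_page=True))   # hard 404/410
```

The key takeaway: the header code, not the visible page content, decides how Google classifies the response.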
How important are they?
Basically, soft 404 errors don't have to be fixed urgently unless the message shows up on important pages.
Important pages include all sorts of live pages, especially those connected with your website's income; such pages with a soft 404 should be fixed promptly.
Needless to say, large numbers of soft 404 pages waste Google's crawl budget, so if their number rises too high, you should pay attention to the problem.
How to fix soft 404 errors?
If the page no longer exists:
- If the page doesn't get much traffic, return a 404 or 410 code.
- Otherwise, 301-redirect the page to a more relevant, up-to-date one.
- Try to avoid redirecting too many non-existent pages to the homepage; it's better to redirect them to related pages or let them return 404.
If the page exists:
- Make sure there’s enough content on your page.
- Make sure your 200 page doesn’t represent a “not found” page.
Soft 404 pages may be confusing, since their cause is usually not obvious. Resolving this error on your most important pages would be a good decision.
404 Errors
When Googlebot fails to crawl a page on your website because it doesn't exist, a 404 error shows up.
Google suggests that ignoring 404 errors on your website is rather safe; however, you definitely don't want this error on your important pages.
Once you gain enough experience in dealing with 404 errors, you will be able to tell when the error requires prompt fixing and when it may be ignored. According to Rand Fishkin, a 404 error is fine unless:
- there’s a lot of traffic on your page
- access to it from external websites is crucial
- the page URL is exactly what users are looking for
The question is which external connections are crucial and how much traffic makes a 404 urgent. Annie Cushing suggests paying attention primarily to backlinks and the number of landing page visits.
Furthermore, do not ignore any kind of media that employs tracking URLs even if they don’t seem to be important.
How important are they?
The urgency of fixing such a common error as a 404 page is directly related to the importance of the page itself. If the page is gone and was unimportant, you don't have to bother with the issue.
Accept the fact that even vast numbers of 404 errors can simply be ignored; they will keep occurring until you locate and eliminate their cause.
How to fix 404 errors?
In order to fix a 404 error on your page, follow these tips:
- Make sure you use a content management system to publish the page with draft mode disabled, and check whether the page actually exists.
- Check your page URL.
- Compare your website's www and non-www, as well as http and https, variations to determine in which case the error shows up.
- If you don't mind the page being dead but want to redirect it, make the new target closely related and relevant.
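To compare the variations systematically, you can generate all four scheme/host combinations and fetch each one. A small sketch (example.com is a placeholder):

```python
from itertools import product

def url_variants(host, path="/"):
    """Build the four www/non-www and http/https combinations of a URL."""
    return [f"{scheme}://{prefix}{host}{path}"
            for scheme, prefix in product(("http", "https"), ("", "www."))]

for url in url_variants("example.com", "/missing-page"):
    print(url)  # fetch each variant and compare the status codes it returns
```

If only one variant returns a 404 while the others return 200, the problem is likely a missing redirect between the variants rather than a truly dead page.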
How to prevent old 404 errors from flooding your crawl report?
So you’ve decided not to revive your page – but how to make this 404 error disappear from the errors list?
Your report will keep displaying the error only while your own site or an external site links to the dead page. If a user merely types the page URL, the 404 won't be listed.
You can perform the link fix via Google Search Console. In the URL Errors subsection you can view the links to the error page of your website.
Select the required URL.
As a rule, a quick way to find the required link is to look for it in the source code.
This procedure may take quite a lot of time, but the result is worth it: if you delete all the unwanted links, you will prevent the dead 404 pages from reappearing in the report.
Another issue is links from outdated sitemaps. Instead of redirecting them to the current sitemap, let the old sitemaps trigger the error so that you can delete them.
Access Denied Errors
Access denied errors occur when the Googlebot is unable to crawl the page due to a restriction.
Usually access denied 403 errors prevent Google from crawling in the following ways:
- The URL is blocked unless visitors log in.
- There are restrictions for the Googlebot in robots.txt.
- Googlebot is being blocked by your hosting provider, or the user has to be authenticated with the proxy.
How important are they?
Whenever a page is blocked for Googlebot, the error should be promptly fixed. Only if you don't want Googlebot to crawl the page can you let the error be.
How to fix Access denied errors?
To fix these errors, remove the blocking element:
- Cancel the login request on the page.
- Go through the pages list in your robots.txt.
- Check the file with the robots.txt tester and test each URL separately.
- View your website via Fetch as Google or a specialized plugin to check how it is displayed for the Googlebot.
- Use Screaming Frog to see whether the page requires logging in.
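The login check can be approximated with a simple heuristic: a page that answers 401/403, or that redirects to a login URL, is effectively blocked for Googlebot. This is a rough sketch, not Screaming Frog's actual logic:

```python
def looks_login_blocked(status, location=None):
    """Heuristic: does this response suggest the page is login-protected?"""
    if status in (401, 403):
        return True                 # explicit access-denied codes
    if status in (301, 302) and location and "login" in location.lower():
        return True                 # redirect to a login page
    return False

print(looks_login_blocked(403))                          # True
print(looks_login_blocked(302, "/login?next=/private"))  # True
print(looks_login_blocked(200))                          # False
```

In practice you would feed this function the status code and Location header returned for each flagged URL.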
Though not very common, access denied errors may be dangerous for your website, so do not ignore them.
“Not followed” Errors
How important are they?
If the error occurs on crucial pages, it’s important to fix it. Whenever the issue concerns old pages or non-indexed unimportant parameters, it’s not urgent.
How to fix “Not followed” errors?
According to Google, these elements may be crawled by search engines incorrectly:
- Session IDs
Fetch and render your site with Fetch as Google or any other tool to see it the way Googlebot does.
If you notice that the page doesn't work correctly in this mode, you can work out which of the elements above is causing the issue.
You may also use the URL Parameters tool to configure specific parameters if needed.
If the error is connected with redirection, do the following:
- Make sure the redirect chains are not too long.
- Optimize your website structure by giving pages static links instead of redirects where possible.
- Use destination URLs in your sitemap instead of redirected ones.
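Given a map of redirects collected from a crawl (the data below is hypothetical), chain length can be checked with a few lines of Python:

```python
def redirect_chain(start, redirects, limit=5):
    """Follow a URL through a redirect map; flag chains that are too long."""
    chain = [start]
    while chain[-1] in redirects:
        chain.append(redirects[chain[-1]])
        if len(chain) > limit:       # also catches redirect loops
            raise ValueError(f"redirect chain from {start} is too long")
    return chain

hops = {"/old": "/newer", "/newer": "/newest"}   # hypothetical crawl data
print(redirect_chain("/old", hops))              # ['/old', '/newer', '/newest']
```

The final element of each chain is the destination URL that belongs in your sitemap.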
Server Errors & DNS Errors
The URL errors section is also divided into DNS errors and server errors, which should be fixed in the same way as their site-level counterparts; the only difference is that they concern certain URLs only.
The table below sums up URL errors and may be useful as a quick reference.
Analyzing and fixing errors can be a monotonous task, and large numbers of errors in your crawl report may seem frustrating. Very soon, however, you will become experienced enough to deal with all sorts of crawl errors.
With an indispensable tool like Google Search Console at hand, you will achieve success with your website in no time!
- What are crawl errors?
A crawl error is an issue that occurs when a crawling engine fails to request and index a page, or when the page returns an error code. Depending on their level, crawl errors fall into Site Errors and URL Errors. You can view your website's crawl errors in the corresponding report in Google Search Console.
- How do I fix crawl errors that appear in Google Search Console?
First of all, you need to define what kind of crawl error you have encountered. You can view the list of your site’s errors in the Search Console. Some errors should be solved immediately and some can be safely ignored. In order to learn how to fix each particular error, refer to the information in the article above.
- How to Find Crawl Errors with Google Search Console?
The Crawl Errors section of Google Search Console can be accessed via the main dashboard. You will find there a list of all crawl errors from your website sorted by categories. The most important errors on the list are displayed in the first place.
- How to Bulk Fix 404 Crawl Errors from Google Search Console?
Some crawl errors including 404s may remain on the list even if they actually do not show up anymore. So if they have been resolved or you do not observe them for a long time, you can just mark all of them as fixed. If the root of the problem still exists, those errors will just show up later.
- How to download 404 errors from Google Search Console with Linking pages?
You should open the URL Errors subsection of the Search Console. Then select the required URL so that you can view the links to the error page of your website. Use Google API Explorer to download the file and fix it.
- How to monitor and fix URL crawl errors?
You can monitor URL errors via the Search Console. Some old, resolved errors may remain on the list, so you can mark them all as fixed and check which of them show up again in a few days. Instructions for dealing with each particular error can be found in the article above.
- How to Use Google Search Console to Improve Your Website?
Google Search Console allows you to manage your website and collect all the required data about it. With this service you can analyze different aspects of your website, work with site maps, monitor crawl errors and find out how to fix them.