How to monitor 404 pages on your site

By admin | Last updated on Mar 20, 2016

One of the main activities a webmaster has to perform is monitoring and fixing 404 (not found) pages on a web site. When your web server cannot find a requested page on your site, it returns the HTTP 404 status code. You can find more about all HTTP status codes on the Wikipedia List_of_HTTP_status_codes page.

404 page example

To see what headers a server returns, you can use netcat (nc). For example, run this command in a Linux/Mac terminal (or an equivalent command on Windows):

printf "GET /foo/non-existing-page/ HTTP/1.1\r\nHost: www.google.com\r\n\r\n" | nc www.google.com 80

Here is the output of the command:

HTTP/1.1 404 Not Found
Content-Type: text/html; charset=UTF-8
X-Content-Type-Options: nosniff
Date: Thu, 11 Apr 2013 07:38:31 GMT
Server: sffe
Content-Length: 953
X-XSS-Protection: 1; mode=block

<!DOCTYPE html>
<html lang=en>
  <meta charset=utf-8>
  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
  <title>Error 404 (Not Found)!!1</title>
  <style>
    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}
  </style>
  <a href=//www.google.com/><img src=//www.google.com/images/errors/logo_sm.gif alt=Google></a>
  <p><b>404.</b> <ins>That’s an error.</ins>
  <p>The requested URL <code>/foo/non-existing-page/</code> was not found on this server.  <ins>That’s all we know.</ins>
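When you only care about the status code, you can pull it out of the first line of a response like the one above. The following is a minimal sketch, not a definitive tool: the function name http_status_code is made up for this example, and it assumes the raw response arrives on standard input with the status line first.

```shell
# Print the numeric status code of a raw HTTP response read from stdin.
# The status line is the first line, e.g. "HTTP/1.1 404 Not Found",
# so the code is its second whitespace-separated field.
# http_status_code is a hypothetical helper name used only in this sketch.
http_status_code() {
  awk 'NR == 1 { print $2; exit }'
}
```

To use it, pipe the nc command shown earlier into http_status_code. Adding a "Connection: close" request header should help nc finish promptly instead of waiting on a keep-alive connection. If curl is installed, curl -s -o /dev/null -w '%{http_code}\n' followed by the URL is another way to print just the status code.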

Reasons for 404 pages

There can be multiple reasons for 404 pages. Some of these are:

  1. One of your own pages contains a wrong link.
  2. A page has been moved or deleted.
  3. An external site links to a wrong URL on your site.

Using Apache/web server logs to find 404 pages

One option for monitoring 404 pages is to check the Apache access log regularly. Here is what one such log entry looks like:

122.167.14.16 - - [11/Apr/2013:07:06:43 +0000] "GET /non-existing-page/ HTTP/1.1" 404 16003 "-" "Mozilla/5.0 (iPhone; U; CPU iPhone OS 431 like Mac OS X; en-US) AppleWebKit/533.17.9 (KHTML like Gecko) Version/5.0.2 Mobile/8G4 Safari/6533.18."
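Checking the log by hand gets tedious, so here is a rough sketch of how entries like the one above could be summarized. It assumes the log uses the Apache common or combined format shown above, where the request line is the first double-quoted field and the status code follows it. The function name list_404_urls is an invented name for this example, and request paths containing a double quote are not handled.

```shell
# Print "count url" for every request path that returned 404,
# most frequently requested first.
# Reads log files given as arguments, or stdin when none are given.
# list_404_urls is a hypothetical helper name used only in this sketch.
list_404_urls() {
  awk -F'"' '
    {
      split($2, req, " ")    # req[1]=method, req[2]=path, req[3]=protocol
      split($3, rest, " ")   # rest[1]=status code, rest[2]=response size
      if (rest[1] == "404") count[req[2]]++
    }
    END { for (url in count) print count[url], url }
  ' "$@" | sort -k1,1nr -k2,2
}
```

For example, list_404_urls /var/log/apache2/access.log would summarize one log file. That path is an assumption (it is the usual location on Ubuntu) and may differ on your server. The seventh field of the combined format (the quoted referrer, "-" in the entry above) can additionally tell you which page linked to the missing URL.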

Using Google webmaster tools to find 404 pages

Google Webmaster Tools provides an excellent report on 404 (error) pages on your web site. You can use it to find such pages and take appropriate corrective action. To see crawl errors (404 pages), log in to Google Webmaster Tools and go to “Health” -> “Crawl Errors”.
The tool lists all the error page URLs and the date it tried to fetch each of them. Once you fix a page, you can mark it as fixed, which helps you keep track of such pages.
