How to scan a complete website for broken links for free

If you manage a large blog or website with many pages, chances are there are some broken links within it. These can accumulate over time and may negatively impact your website's SEO score, ultimately hurting your rankings in major search engines.

Testing every web page manually for broken links may seem easy at first, but it quickly becomes time-consuming and tedious when there are many pages to check.

To automate this process, there are some free online tools that can scan your website for broken links such as 404 errors. The catch is that most of these free tools scan only a limited number of links, say 100 or 200, and beyond that they usually become a paid service.

To get around this, I came across a neat solution: a Python-based utility called LinkChecker, which lets you scan your whole website without worrying about any limit on the number of links. It can recursively scan the entire site without your intervention and will display a list of the links that have errors.


To scan your website for broken links, follow these steps:


Prerequisite: Ensure you have Python 3 installed on your machine. If not, you can install it using the Anaconda Distribution, which installs Python along with some Python tools like Jupyter and Spyder. (We won't need those for our purpose here.)
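To quickly confirm that Python and pip are available, you can run the following commands in your terminal (the exact version numbers will differ on your machine):

python --version
pip --version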


Open your terminal or the Anaconda Prompt (PowerShell) and install LinkChecker using the command:

pip install linkchecker 
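Once the installation finishes, you can verify that the tool is available on your path by printing its version (the exact output varies between releases):

linkchecker --version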

 

Once it is installed, run the command below in your terminal to scan the website.

linkchecker https://www.websitename.com

Replace the website name with your website or blog's URL.




This command starts the LinkChecker utility, which scans your website for broken links using multiple threads. The console displays a list of links with errors along with statistics such as the number of links currently queued and the number already checked.
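If the default level of parallelism is too aggressive for your server, LinkChecker also accepts a thread count via its -t/--threads option. For instance, the following sketch (with a placeholder URL) limits the scan to 5 parallel threads:

linkchecker -t 5 https://www.example.com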

Moreover, if you want more detail about each link scanned, you can enable verbose mode as shown below:

linkchecker http://www.website.com -v

Similarly, if you want to log all the links in a specific format such as CSV or HTML, you can specify the output format as in the example below:

linkchecker http://www.website.com -v --output=csv
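By default the results are printed to the console. If you would rather save them to a file, LinkChecker's -F/--file-output option writes the report in the chosen format; as far as I can tell, it creates a file such as linkchecker-out.csv in the current directory:

linkchecker -F csv http://www.website.com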

 

The complete list of options can be found on the LinkChecker documentation page.
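As a final illustrative sketch, these options can be combined. The example below (with a placeholder URL and an assumed ignore pattern) also validates external links via --check-extern and skips URLs matching a regular expression via --ignore-url; check the documentation for the exact behavior on your installed version:

linkchecker --check-extern --ignore-url=/private/ https://www.example.com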

 

I hope you find this tutorial helpful! Let me know your thoughts.
