Hi everyone, welcome back to the Selenium WebDriver tutorial. Today we will cover a very basic check that every application needs: we will find broken links using Selenium and see how to check the status of each one.
What is finding broken links using Selenium
As the name suggests, finding broken links means checking for links that point to a wrong or invalid URL.
I am sure you have run into a 404 "page not found" error in many applications. A link that leads to such a page is called a broken link.
Links are not the only thing to check. You may also have to verify images, which we will see in the next tutorial.
While validating, you only have to verify the response status:
1- 200- Success- OK
Scenario for finding broken links using Selenium
Before jumping to the code let’s take one simple example to get the actual concept.
Example 1- Suppose we have an application that contains 400 links and we need to verify whether each link is broken.
Approach 1-
Manual process- Go to each link and verify whether it is working.
Do you think this is a valid approach? No, it will take a full day to verify, and you will not keep up the same efficiency or interest throughout.
Approach 2-
Smart work- Write code that visits every link and verifies its status.
Since we are all smart, we will go with the smart approach and see how to find broken links using Selenium.
Before I start, let me introduce the HttpURLConnection class, which will help us verify the status of the response.
Precondition-
1- Selenium setup should be completed
Refer to the complete video on finding broken links using Selenium
Program for finding broken links using Selenium
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class VerifyLinks {

    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        driver.manage().window().maximize();
        driver.get("http://www.google.co.in/");
        // Collect every anchor element on the current page
        List<WebElement> links = driver.findElements(By.tagName("a"));
        System.out.println("Total links are " + links.size());
        for (int i = 0; i < links.size(); i++) {
            WebElement ele = links.get(i);
            String url = ele.getAttribute("href");
            verifyLinkActive(url);
        }
    }

    public static void verifyLinkActive(String linkUrl) {
        try {
            URL url = new URL(linkUrl);
            HttpURLConnection httpURLConnect = (HttpURLConnection) url.openConnection();
            httpURLConnect.setConnectTimeout(3000);
            httpURLConnect.connect();
            // Read the status once and reuse it for both checks
            int status = httpURLConnect.getResponseCode();
            if (status == 200) {
                System.out.println(linkUrl + " - " + httpURLConnect.getResponseMessage());
            }
            if (status == HttpURLConnection.HTTP_NOT_FOUND) {
                System.out.println(linkUrl + " - " + httpURLConnect.getResponseMessage() + " - " + HttpURLConnection.HTTP_NOT_FOUND);
            }
        } catch (Exception e) {
            // Report the failure instead of silently ignoring it
            System.out.println(linkUrl + " - " + e);
        }
    }
}
Output
You can see we are getting 49 links and all seem OK.
Congrats 🙂
I hope you can see how easy and how important it is to verify links and images. Try the above program and let me know if you find any issue with it.
I tried the above code. When I ran it, it only found the total number of links; it did not iterate over them to print each link and its status.
What might be the issue? Could you let me know?
Hi Ajit, can you share the console output?
How do I automate this when the links are in an Excel file and I want to test them for broken links?
Hi Manjusha,
Iterate over the Excel file to read all the links and add them to a data structure (like a List). Then check each URL using an HTTP library.
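The suggestion above could be sketched roughly as follows. To stay within the standard library, this sketch assumes the sheet has been exported to CSV with the URL in the first column; reading an .xlsx file directly would need an extra library. The class name UrlSheetReader and the sample rows are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UrlSheetReader {

    // Returns the first column of every CSV line that looks like an http(s) URL.
    // Header rows and blank lines are skipped because they do not start with http.
    static List<String> firstColumn(List<String> csvLines) {
        List<String> urls = new ArrayList<>();
        for (String line : csvLines) {
            String cell = line.split(",", 2)[0].trim();
            if (cell.startsWith("http://") || cell.startsWith("https://")) {
                urls.add(cell);
            }
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        // Pass the path of the exported CSV as the first argument;
        // without an argument, a small in-memory sample is used.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("URL,Owner", "https://example.com/a,team1");
        for (String url : firstColumn(lines)) {
            System.out.println(url); // each URL can then go to verifyLinkActive(url)
        }
    }
}
```

Each URL returned by firstColumn can be passed to the verifyLinkActive method from the post.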
Do you have this in C#?
Hi Ahmed,
Not as of now but I’ll post it soon…:)
Is there any chance you have a demonstration of this done in Python?
Hi Colin,
Not yet, but I’m planning to post it soon…:)
Brother, I need to create an app that finds the number of broken links in a page using Ember. How do I do that?
Hi Mathi,
Apologies, I have never worked with Ember…:(
Hi,
I am new to coding. Why did we use try and catch to validate the response code?
Hi Neha,
Please refer to this link: https://www.javatpoint.com/try-catch-block
I am getting an UnknownHostException when running this program.
Hi Karthik,
Are you running this code behind a proxy or firewall? Also check whether the URL uses http or https (secured).
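To see which of these problems you are hitting, it can help to look at the exception type instead of catching a plain Exception. The following is only a sketch: the class name LinkErrorHint and the hint texts are made up, and the hints are likely causes, not a diagnosis.

```java
import java.net.ConnectException;
import java.net.MalformedURLException;
import java.net.SocketTimeoutException;
import java.net.UnknownHostException;

final class LinkErrorHint {

    // Maps common java.net exceptions to a short hint about the likely cause.
    static String hint(Exception e) {
        if (e instanceof UnknownHostException) {
            return "host not resolved: check the URL spelling, DNS, or proxy settings";
        }
        if (e instanceof ConnectException) {
            return "connection refused: check the port, firewall, or proxy settings";
        }
        if (e instanceof SocketTimeoutException) {
            return "timed out: the server is slow or unreachable";
        }
        if (e instanceof MalformedURLException) {
            return "malformed URL: the href may be missing or not an http(s) address";
        }
        return "unexpected: " + e;
    }

    public static void main(String[] args) {
        // In verifyLinkActive, the catch block could print hint(e) next to the link.
        System.out.println(hint(new UnknownHostException("example.invalid")));
    }
}
```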
Hello Mukesh
Nice blog…..
I have a query on the above: can I use the same code while automating a mobile web page, or do I need to change something in it?
I used the same code, but it does not execute when getting "href" in the line below:
String url=ele.getAttribute("href");
Kindly help me.
Hi Prakash,
You can use the same code on a mobile web browser as well.
Hello,
The total link count is 54,
but only 4 links are printed in the console. Why is that?
INFO: Detected dialect: OSS
Total links are 54
https://www.kaffekapslen.dk/til-nespresso.html – OK
https://www.kaffekapslen.dk/til-dolce-gusto.html – OK
https://www.kaffekapslen.dk/til-tassimo.html – OK
https://www.kaffekapslen.dk/kundeservice-og-kontaktoplysninger – OK
[main] INFO net.serenitybdd.core.Serenity –
Hi Lilia,
Can you please mention which XPath you used?
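One thing worth checking: the program in the post prints a line only when the response code is 200 or 404, so links that answer with a redirect or another error code produce no output even though they are counted. A minimal sketch of a helper that gives every status code a label is shown here; the class name StatusLabel and the label texts are made up for illustration.

```java
import java.net.HttpURLConnection;

final class StatusLabel {

    // Gives every HTTP status code a coarse label so that no link is dropped silently.
    static String label(int code) {
        if (code >= 200 && code < 300) {
            return "OK";
        }
        if (code >= 300 && code < 400) {
            return "REDIRECT";
        }
        if (code == HttpURLConnection.HTTP_NOT_FOUND) {
            return "BROKEN (404)";
        }
        if (code >= 400) {
            return "ERROR (" + code + ")";
        }
        // getResponseCode() returns -1 when the response is not valid HTTP
        return "UNKNOWN (" + code + ")";
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 500};
        for (int code : samples) {
            System.out.println(code + " - " + label(code));
        }
    }
}
```

Printing linkUrl together with label(httpURLConnect.getResponseCode()) for every link should make the number of printed lines match the link count, apart from links that throw an exception.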
Hi Mukesh,
How can I verify whether links on the other pages of the website return 404 errors? This method verifies only a few links; the others are not found.
Hi Rajnish,
Using Selenium and the href attribute, you can fetch only the links on the current web page. If you want to check the links on another page, you have to navigate to that page.
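Navigating page by page can be automated with a simple breadth-first crawl. The sketch below is only an outline under assumptions: the class name SiteCrawler is made up, and linksOf is a hypothetical function that returns the hrefs found on a page (in a real run it would call driver.get(page) and collect the href of every anchor, as in the post). Here it is backed by an in-memory map so the example runs without a browser.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

final class SiteCrawler {

    // Visits pages breadth-first starting from 'start' and returns every link seen.
    // Only links that begin with sitePrefix are visited in turn; external links
    // are recorded (so they can be checked) but never crawled.
    static Set<String> crawl(String start, String sitePrefix, int maxPages,
                             Function<String, List<String>> linksOf) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(start);
        queue.add(start);
        int visited = 0;
        while (!queue.isEmpty() && visited < maxPages) {
            String page = queue.poll();
            visited++;
            for (String link : linksOf.apply(page)) {
                if (link == null) {
                    continue; // anchor without an href
                }
                boolean firstTime = seen.add(link);
                if (firstTime && link.startsWith(sitePrefix)) {
                    queue.add(link);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a real site
        Map<String, List<String>> site = new HashMap<>();
        site.put("http://site/", Arrays.asList("http://site/a", "http://other/x"));
        site.put("http://site/a", Arrays.asList("http://site/", "http://site/b"));
        Set<String> all = crawl("http://site/", "http://site/", 50,
                page -> site.getOrDefault(page, Collections.<String>emptyList()));
        System.out.println(all);
    }
}
```

The maxPages limit keeps the crawl from running forever on large sites; every link in the returned set can then be passed to verifyLinkActive.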
Hello Mukesh,
This method works when a broken image returns a 404 error. But what should we do when a website maps its 404 responses to a standardised error page?
Hi Girija,
In your case, you may need to send a GET/POST request using the same HttpURLConnection class and then verify the response to that request.
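One way to do that verification is to read the response body and look for text that only appears on the error page. This is a sketch under assumptions: the class name SoftNotFoundCheck is made up, and the marker phrases are hypothetical and should be replaced with wording taken from your own site's error page.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class SoftNotFoundCheck {

    // Hypothetical marker phrases; use text from your own error page instead.
    static final List<String> MARKERS = Arrays.asList("page not found", "error 404");

    // Returns true when the body contains one of the marker phrases (case-insensitive).
    static boolean looksLikeErrorPage(String body) {
        String lower = body.toLowerCase(Locale.ROOT);
        for (String marker : MARKERS) {
            if (lower.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    // Downloads the body of linkUrl with a GET request.
    static String fetchBody(String linkUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(linkUrl).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Offline demonstration of the body check
        System.out.println(looksLikeErrorPage("<h1>Page Not Found</h1>"));
    }
}
```

A link would then count as broken when the status is 200 but looksLikeErrorPage(fetchBody(linkUrl)) returns true.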
Hi Mukesh,
Can you please make a detailed video on Web Services testing (SoapUI and other related concepts)? It will be very helpful.
Thanks in advance.
Hi Rohit,
I’ll post corresponding videos soon…:)
Hi Mukesh,
How can we generate a report that has these columns: S.No, Page name, Link, Status?
Regards,
Tauseef
Hi Tauseef,
To achieve this, use a loop so that each iteration produces the next serial number. The page name depends on your naming convention, the page link can be obtained using driver.getCurrentUrl(), and the status can be returned at the end of the check.
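A minimal sketch of such a report, written as CSV text, is shown below. The class name LinkReport and the sample rows are made up for illustration; in a real run the rows would come from the loop over the links.

```java
import java.util.ArrayList;
import java.util.List;

final class LinkReport {

    private final List<String> rows = new ArrayList<>();

    LinkReport() {
        rows.add("S.No,Page Name,Link,Status");
    }

    // Appends one result; the serial number is derived from the row count.
    void add(String pageName, String link, int status) {
        int serial = rows.size(); // the header occupies index 0, so the first row is 1
        rows.add(serial + "," + quote(pageName) + "," + quote(link) + "," + status);
    }

    // Wraps a value in double quotes and escapes embedded quotes, as CSV expects.
    private static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    String toCsv() {
        return String.join("\n", rows);
    }

    public static void main(String[] args) {
        LinkReport report = new LinkReport();
        report.add("Home", "http://example.com/", 200);
        report.add("Home", "http://example.com/missing", 404);
        System.out.println(report.toCsv());
    }
}
```

The resulting text can be written to a .csv file and opened in any spreadsheet tool.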
Hi Mukesh, I am facing an issue with the Appium installation. I have downloaded and installed all the tools and jar files, but padnet, genymotion and appiumserver are not working on my system, while Selenium WebDriver works fine.
Can you please help with these issues? If you have made videos on this, please share them.
I have been following your videos for the past 3 months. They are good and have helped so much…thanks
Hi Pradeep,
I’ll post few more videos very soon…:)
Hi Mukesh, please make a video on JMeter.
Hi Preethi,
It's already in the pipeline. Please stay tuned…:)
Hi, what is a good approach to handling stale element reference exceptions?
Hi Richa,
Please refer to this link: http://learn-automation.com/how-to-solve-stale-element-reference-exception-in-selenium-webdriver/
Why is the method verifyLinkActive public? I give methods the smallest scope needed, which would in this case be private. Is it necessary that other classes can use this method too?
Hi Daniel,
For the sake of simplicity, I made it public. When you use it in a framework, you can apply the proper access specifier to each method.
Hi Mukesh,
Few questions.
1) Opening Google in headless mode works fine, but I am not able to open secure sites. Is there any solution, please?
2) We are using Selenium 3.X and the ROBOT framework. I got to know that the htmlunit driver is not part of the package.
Do we have to add the path of the htmlunit driver (downloaded separately) to the PATH environment variable?
Hi Anshul,
Yes, you need to add htmlUnit driver separately to your project. Please check this link for more info
Hi Mukesh, how can I verify a text is highlighted or not using Selenium Webdriver.
Hi Hash,
Please use JavaScriptExecutor to verify the background/foreground color.
Hi Mukesh
In my application I first have to log in and then find the broken links for the entire application. Could you please suggest how to do this?
Hi Partho,
I can't see any difficulty in the scenario you mention. Logging in is a usual and very generic activity, and once you are logged in, you can find all the broken links. If this is not what you meant, kindly elaborate on your requirement.
Hi Mukesh,
I'm getting a java.net.UnknownHostException while running the code. Is it because of a proxy issue? Please help.
Thanks
Sri Datta
Yes, please check the proxy settings.
Hi Mukesh,
The videos and the blog you have are great. I’m new in automation and I’m learning a lot from you. Very informative and very easy to understand. This code is working great, too. I just have a question maybe you can help me with that.
Is there a way to test links only in the content of website? Can we exclude the header and footer links? Do you think it is possible?
Thank you again!
Hi Vahe,
Yes, it is possible to test only the links in the content of the website and leave out the header and footer.
My URL connection is not working, can you please help me?
Hi Gursimran,
Could you please explain this?
Hello Mukesh,
Good Job Mukesh,
Mukesh, I am getting an 'IndexOutOfBoundsException' with this code if my web page has more than 300 links. Can you please suggest what I can do in this case?
Hi Saurabh,
This is a Java exception that is thrown when you refer to an index that does not exist in an array. Alternatively, you can use a List implementation such as ArrayList or LinkedList, whose length is dynamic. I hope this solves your problem.
Mukesh,
I'm very new to Selenium. I just came across your blogs and articles; they are quite amazing and very useful. I'm interested in doing the POC of my project in Selenium (currently we are using QTP). I have many doubts when comparing it with QTP. Can you help me with this? If I can get your email id, it will be helpful for sharing my doubts.
Hi Vimal,
Are you done with your POC?
Hi Mukesh,
When I try to find the broken links in my application, I get the following error:
java.net.ConnectException: Connection refused: connect. Can you please help me with this?
Hi Vimal,
This usually happens behind a proxy, so try setting up the proxy and then run the same program.
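With HttpURLConnection, one way to set up the proxy is to pass a java.net.Proxy to openConnection. This is a sketch only: the class name ProxiedLinkCheck is made up, and the host 127.0.0.1 and port 8080 are placeholders for your own proxy details.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

final class ProxiedLinkCheck {

    // Builds an HTTP proxy description from a host name (or IP address) and a port.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Requests linkUrl through the given proxy and returns the HTTP status code.
    static int statusVia(String linkUrl, Proxy proxy) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(linkUrl).openConnection(proxy);
        conn.setConnectTimeout(3000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // Placeholder proxy address; replace with the values used on your network.
        Proxy proxy = httpProxy("127.0.0.1", 8080);
        System.out.println("Using proxy " + proxy);
        // statusVia("http://www.google.co.in/", proxy) would then perform the check.
    }
}
```

Another option is to set the http.proxyHost and http.proxyPort system properties before opening any connection.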
Hi Mukesh,
I have a doubt about the collection of link objects. I find the list of links on the page by the tag "A", but in the collection I get a null value for a few anchor tags. How do we skip those?
Hi Vimal,
You can add one more condition: if the href is null, skip that link.
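That condition could look like the sketch below. Besides null, it also skips values that HttpURLConnection cannot check, such as javascript: or mailto: links; the class name HrefFilter is made up for illustration.

```java
import java.util.Locale;

final class HrefFilter {

    // Returns true only for href values that can be checked over HTTP.
    static boolean isCheckable(String href) {
        if (href == null) {
            return false; // anchor tag without an href attribute
        }
        String h = href.trim().toLowerCase(Locale.ROOT);
        return h.startsWith("http://") || h.startsWith("https://");
    }

    public static void main(String[] args) {
        String[] samples = {null, "", "javascript:void(0)", "https://example.com/"};
        for (String href : samples) {
            System.out.println(href + " -> " + isCheckable(href));
        }
    }
}
```

In the loop from the post, verifyLinkActive(url) would then be called only when isCheckable(url) returns true.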
Hi Mukesh,
Very nice explanation.
Q- Can the same code be used to find response code 404?
Also, there is a separate library for HttpResponse in Apache. Can it be used? Which is easier, the Apache library or your method?
Both are the same. I downloaded it separately, but it also comes with the Selenium bundle.
Yes, we can do that 🙂
Well done, well done!
Sir,
How do I use assertions after finding broken links and images?
Hey Bhaskar,
Check below video for assertion https://www.youtube.com/watch?v=dvUPSnGQZCM&feature=autoshare
Hi Mukesh,
public static void verifyLinkActive(String linkUrl)
{
try
{
URL url = new URL(linkUrl);
What does linkUrl do?
Hi Shreyas,
linkUrl is the address of the link to check. The URL class is used whenever you access a URL over the network.
Hi Mukesh, thanks for this post. It is working fine for me. Keep posting this type of video.
Hey Naresh,
Cheers 🙂
How do I find broken images and links on all pages of a site?
Hi Suhana,
Kindly use the img tag and the src attribute instead of href. The rest of the code remains the same.
Hi Mukesh,
Well explained. Can you please share the tutorial on how to find the broken Images on the web page.
Much thanks in advance
Hi Shantosh,
Thanks. The same code will work; only change the getAttribute call to use the src attribute.
It is checking the very first link; after that it is not checking the other links!
Set the proxy and run again.
Hi Mukesh,
When I run your code I get a link count of 48.
The actual number of links shown in the console is 44.
If I run it using TestNG by removing the main method and setting a priority on both methods, I get an error:
org.testng.TestNGException:
Method verifyLinkActive requires 1 parameters but 0 were supplied in the @Test annotation.
Hi Harshal,
Please supply a parameter for the method as well.
Well explained ..always been a fan of your tutorials.. Gr8 job
Thanks 🙂 Neha
I get this exception when I run your code in Eclipse:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Function
at verifyBrokenLinks.main(verifyBrokenLinks.java:15)
Caused by: java.lang.ClassNotFoundException: com.google.common.base.Function
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
… 1 more
Hi, the Selenium jars are not added completely.
Hi Mukesh,
Can you explain why there are discrepancies in the total number of links? The count always comes out higher.
refer Image : http://s15.postimg.org/hxsuih10b/screenshot_domain_date_time.png
Also, I believe the depth level here is limited to the visited page.
Hi Abhishek,
It might be due to the a tag.
Hi Mukesh,
Thank you for listing this amazing stuff on your blog. I really love the way you explore things that are so genuine and are in for everyday use. Thank you once again!