Hi everyone, and welcome back to the Selenium WebDriver tutorial. Today we are going to cover a very basic check that every application needs. We will find broken links using Selenium and see how to check the status of each one.
What does finding broken links using Selenium mean?
As the name suggests, finding broken links using Selenium means checking for links that point to a wrong or invalid URL.
I am sure you have come across a 404 Page Not Found error in many applications; a link that leads to such a page is called a broken link.
This is not limited to links. You may also have to verify images, which we will see in the next tutorial.
While validating, you only have to verify the status code:
1- 200- Success- OK
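As a rough sketch of how a status code could be turned into a verdict, the hypothetical helper below treats 200 as OK and 404 as Not Found, as in this tutorial. The rule that every other 4xx or 5xx code also counts as broken is an assumption added for illustration, and the class and method names are made up.

```java
import java.net.HttpURLConnection;

// Hypothetical helper, not part of the tutorial's program.
class StatusClassifier {

    // Assumption: 4xx and 5xx responses (and invalid codes such as -1) count as broken.
    static boolean isBroken(int statusCode) {
        return statusCode < 100 || statusCode >= HttpURLConnection.HTTP_BAD_REQUEST; // 400
    }

    // Maps a status code to a short label for console output.
    static String describe(int statusCode) {
        if (statusCode == HttpURLConnection.HTTP_OK) {
            return "OK";
        }
        if (statusCode == HttpURLConnection.HTTP_NOT_FOUND) {
            return "Not Found";
        }
        return isBroken(statusCode) ? "Broken" : "Reachable";
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 404, 500};
        for (int code : samples) {
            System.out.println(code + " - " + describe(code));
        }
    }
}
```

Whether redirects such as 301 should pass or fail depends on your application, so adjust the threshold to your own rules.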
Scenario for finding broken links using Selenium
Before jumping to the code, let’s take one simple example to understand the concept.
Example 1- Suppose we have an application that contains 400 links, and we need to verify whether each link is broken or not.
Approach 1-
Manual process- Go to each link and verify whether it is working.
Do you think this is a valid approach? No, it will take a full day to verify, and you will not keep the same efficiency and interest throughout.
Approach 2-
Smart work- Write code that checks all the links and verifies the status of each one.
Since we are all smart, we will go with smart work and see how to find broken links using Selenium.
Before I start, let me introduce the HttpURLConnection class, which will help us verify the status of the response.
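To see HttpURLConnection on its own before Selenium comes in, here is a minimal sketch. It starts a throwaway local server with the JDK's built-in com.sun.net.httpserver.HttpServer so that it runs without internet access; that server, the HEAD request method, and the class and method names are all choices made for this illustration, not part of the tutorial's program.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

// Sketch only: class and method names are made up for this illustration.
class StatusProbe {

    // Returns the HTTP status code for the given address, or -1 if the request fails.
    static int statusOf(String address) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
            conn.setRequestMethod("HEAD"); // ask for headers only, no body
            conn.setConnectTimeout(3000);
            conn.setReadTimeout(3000);
            int code = conn.getResponseCode();
            conn.disconnect();
            return code;
        } catch (IOException | ClassCastException e) {
            // Malformed URL, network failure, or a non-HTTP URL such as mailto:
            return -1;
        }
    }

    // Starts a local demo server: /ok answers 200, every other path answers 404.
    static HttpServer startDemoServer() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                boolean ok = exchange.getRequestURI().getPath().equals("/ok");
                exchange.sendResponseHeaders(ok ? 200 : 404, -1); // -1 means no response body
                exchange.close();
            });
            server.start();
            return server;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpServer server = startDemoServer();
        String base = "http://127.0.0.1:" + server.getAddress().getPort();
        System.out.println(base + "/ok - " + statusOf(base + "/ok"));
        System.out.println(base + "/missing - " + statusOf(base + "/missing"));
        server.stop(0);
    }
}
```

Some servers reject HEAD requests; if that happens, switching the request method to GET is the usual fallback.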
Precondition-
1- Selenium setup should be completed
Refer to the complete video on finding broken links using Selenium.
Program for finding broken links using Selenium
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class VerifyLinks {

    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        driver.manage().window().maximize();
        driver.get("http://www.google.co.in/");

        // Collect every anchor tag on the current page
        List<WebElement> links = driver.findElements(By.tagName("a"));
        System.out.println("Total links are " + links.size());

        for (int i = 0; i < links.size(); i++) {
            WebElement ele = links.get(i);
            String url = ele.getAttribute("href");
            verifyLinkActive(url);
        }

        driver.quit();
    }

    public static void verifyLinkActive(String linkUrl) {
        // Anchors without an href return null, so skip them
        if (linkUrl == null || linkUrl.isEmpty()) {
            return;
        }
        try {
            URL url = new URL(linkUrl);
            HttpURLConnection httpURLConnect = (HttpURLConnection) url.openConnection();
            httpURLConnect.setConnectTimeout(3000);
            httpURLConnect.connect();

            int responseCode = httpURLConnect.getResponseCode();
            if (responseCode == 200) {
                System.out.println(linkUrl + " - " + httpURLConnect.getResponseMessage());
            }
            if (responseCode == HttpURLConnection.HTTP_NOT_FOUND) {
                System.out.println(linkUrl + " - " + httpURLConnect.getResponseMessage()
                        + " - " + HttpURLConnection.HTTP_NOT_FOUND);
            }
        } catch (Exception e) {
            // Report the failure instead of silently ignoring it
            System.out.println(linkUrl + " - " + e.getMessage());
        }
    }
}
Output
You can see that we are getting 49 links and all of them seem OK.
Congrats 🙂
I hope you can see how easy and how important it is to verify links and images. Try the above program and let me know if you run into any issue with it.
Ajit Nayak says
I tried the above code. When I ran it, it only printed the total number of links; it did not iterate over the links or print the status of each one.
What might be the issue? Could you let me know?
Mukesh Otwani says
Hi Ajit, can you share the console output?
Manjusha says
How do I automate this when the links are in an Excel file and I want to test them for broken links?
Mukesh Otwani says
Hi Manjusha,
Iterate over the Excel sheet to read all the links and add them to a data structure (such as a List). Then check each URL using an HTTP library.
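As a minimal sketch of that idea: reading an .xlsx file directly needs a third-party library such as Apache POI, which is not shown here. The example below instead assumes the sheet has been exported as CSV with the URL in the first column; the file name links.csv and the class name are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assumes a CSV export with the URL in the first column.
class LinkListReader {

    // Takes the first cell of each line and keeps only http/https URLs,
    // which also skips header rows and blank lines.
    static List<String> extractUrls(List<String> csvLines) {
        List<String> urls = new ArrayList<>();
        for (String line : csvLines) {
            String firstCell = line.split(",", 2)[0].trim();
            if (firstCell.startsWith("http://") || firstCell.startsWith("https://")) {
                urls.add(firstCell);
            }
        }
        return urls;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "links.csv");
        for (String url : extractUrls(Files.readAllLines(file))) {
            // Each URL can then be passed to a check such as verifyLinkActive(url)
            System.out.println(url);
        }
    }
}
```

The simple split does not handle quoted cells that contain commas, so a proper CSV or Excel library is the safer choice for real data.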
ahmed nur says
Do you have this in C#?
Mukesh Otwani says
Hi Ahmed,
Not as of now but I’ll post it soon…:)
Collin Code says
Any chance you have a demonstration of this done in Python?
Mukesh Otwani says
Hi Collin,
Not yet, but I’m planning to post it soon…:)
Mathi says
Brother, I need to create an app that finds the number of broken links on a page using Ember. How do I do that?
Mukesh Otwani says
Hi Mathi,
Apologies, I have never worked with Ember…:(
Neha says
Hi,
I am new to coding. Why did we use try and catch to validate the response code?
Mukesh Otwani says
Hi Neha,
Please refer to this link: https://www.javatpoint.com/try-catch-block
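In short, new URL(...) is declared to throw MalformedURLException, and openConnection() and getResponseCode() are declared to throw IOException. Both are checked exceptions, so Java will not compile the calls without a try/catch or a throws clause. The sketch below covers only the URL-parsing step; the class and method names are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo of why the try/catch is required around URL handling.
class TryCatchDemo {

    // Returns a short description instead of letting the checked exception escape.
    static String parse(String address) {
        try {
            URL url = new URL(address); // declared to throw MalformedURLException
            return "valid: " + url.getProtocol() + "://" + url.getHost();
        } catch (MalformedURLException e) {
            return "invalid: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("https://www.google.co.in/"));
        System.out.println(parse("www.google.co.in")); // no protocol, so the catch block runs
    }
}
```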
Karthik says
I am getting an UnknownHostException when running this program.
Mukesh Otwani says
Hi Karthik,
Are you running this code behind a proxy or firewall? Also check whether it is an http or an https (secured) URL.
Prakash chandra Khulbey says
Hello Mukesh
Nice blog…..
I have a query on the above: can I use the same code while automating a mobile web page, or do I need to change something in it?
I used the same code, but it does not execute when getting "href" in the line below:
String url=ele.getAttribute("href");
Kindly help me.
Mukesh Otwani says
Hi Prakash,
You can use the same code on a mobile web browser as well.
Lilia says
Hello,
I am getting a total of 54 links,
but only 4 links are printed in the console. Why is that?
INFO: Detected dialect: OSS
Total links are 54
https://www.kaffekapslen.dk/til-nespresso.html - OK
https://www.kaffekapslen.dk/til-dolce-gusto.html - OK
https://www.kaffekapslen.dk/til-tassimo.html - OK
https://www.kaffekapslen.dk/kundeservice-og-kontaktoplysninger - OK
[main] INFO net.serenitybdd.core.Serenity –
Mukesh Otwani says
Hi Lilia,
Can you please mention which XPath you have used?
Rajnish says
Hi Mukesh,
How do I verify links on other pages of the website that return a 404 error? This method only verifies a few links, and the others are not found.
Mukesh Otwani says
Hi Rajnish,
Using Selenium and the href attribute, you can only fetch the links on the current web page. If you want to check the links on another page, you have to navigate to that page first.
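One way to organise that navigation is a breadth-first walk over the pages, sketched below without Selenium so that it runs on its own. The linksOf function is a stand-in: in a real run it would call driver.get(page) and read the href of every anchor on that page. The class name, method name, and maxPages limit are assumptions made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of visiting page after page to gather links.
class PageWalker {

    // Visits up to maxPages pages, collecting every link seen along the way.
    static Set<String> collectAll(String startPage,
                                  Function<String, List<String>> linksOf,
                                  int maxPages) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(startPage);
        queue.add(startPage);
        int visitedPages = 0;
        while (!queue.isEmpty() && visitedPages < maxPages) {
            String page = queue.poll();
            visitedPages++;
            for (String link : linksOf.apply(page)) {
                if (seen.add(link)) { // true only the first time a link is seen
                    queue.add(link);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Fake site map standing in for real pages
        Map<String, List<String>> site = new HashMap<>();
        site.put("/home", Arrays.asList("/about", "/contact"));
        site.put("/about", Arrays.asList("/home", "/team"));

        Set<String> all = collectAll("/home",
                page -> site.getOrDefault(page, Collections.emptyList()), 10);
        System.out.println(all);
    }
}
```

In practice you would also restrict the walk to your own domain, otherwise it follows external links as well.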
Girija says
Hello Mukesh,
This method works when a broken image returns a 404 error page. But what should we do when a website redirects its 404s to a standardised error page?
Mukesh Otwani says
Hi Girija,
In your case, you may need to send a GET/POST request using the same HttpURLConnection class and then verify the response to that request.
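A rough sketch of that idea follows: fetch the page body with a GET request and look for text that only appears on the site's error page. The marker phrases below are assumptions and must be replaced with wording from your own error page; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical check for pages that answer 200 but show an error page.
class SoftNotFoundCheck {

    // Assumed marker phrases; replace them with text from your site's own error page.
    static final List<String> MARKERS = Arrays.asList("page not found", "404");

    // True if the body contains any of the marker phrases (case-insensitive).
    static boolean looksLikeErrorPage(String body) {
        String lower = body.toLowerCase(Locale.ROOT);
        for (String marker : MARKERS) {
            if (lower.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    // Downloads the response body; a real 4xx/5xx status makes getInputStream() throw IOException.
    static String fetchBody(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(3000);
        conn.setReadTimeout(3000);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Usage: java SoftNotFoundCheck.java <url>");
            return;
        }
        try {
            String body = fetchBody(args[0]);
            System.out.println(args[0] + " - "
                    + (looksLikeErrorPage(body) ? "looks like an error page" : "looks fine"));
        } catch (IOException e) {
            System.out.println(args[0] + " - request failed: " + e.getMessage());
        }
    }
}
```

A bare "404" marker can match unrelated text such as prices or IDs, so a longer phrase from the error page is usually more reliable.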
Rohit says
Hi Mukesh,
Can you please make a detailed video on web services testing (SoapUI and other related concepts)? It would be very helpful.
Thanks in advance.
Mukesh Otwani says
Hi Rohit,
I’ll post corresponding videos soon…:)
Tauseef says
Hi Mukesh,
How can we generate a report that has these columns: S.No, Page name, Link, Status?
Regards,
Tauseef
Mukesh Otwani says
Hi Tauseef,
To achieve this, use a loop so that every iteration produces the serial number. The page name depends on your naming convention, the page link can be obtained using driver.getCurrentUrl(), and the status can be returned at the end of the execution flow.
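A minimal sketch of such a report is shown below. It builds CSV text with the four requested columns; the class name, the CSV format, and the sample rows are assumptions made for illustration, and in a real run the page name and link would come from your own loop (for example from driver.getCurrentUrl() and each href).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical report builder with the columns S.No, Page name, Link, Status.
class LinkReport {

    private final List<String> rows = new ArrayList<>();

    LinkReport() {
        rows.add("S.No,Page name,Link,Status"); // header row
    }

    // The serial number equals the number of rows already present (the header is row 0).
    void add(String pageName, String link, String status) {
        rows.add(rows.size() + "," + quote(pageName) + "," + quote(link) + "," + quote(status));
    }

    // Wraps a value in double quotes and doubles any quotes inside it.
    private static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    String toCsv() {
        return String.join(System.lineSeparator(), rows);
    }

    public static void main(String[] args) {
        LinkReport report = new LinkReport();
        report.add("Home", "http://example.com/about", "200 OK");
        report.add("Home", "http://example.com/missing", "404 Not Found");
        System.out.println(report.toCsv());
    }
}
```

The resulting text can be written to a .csv file and opened in a spreadsheet.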
Pradeep says
Hi Mukesh, I am facing an issue with the Appium installation. I have downloaded and installed all the tools and jar files, but PdaNet, Genymotion and the Appium server are not working on my system, while Selenium WebDriver is working fine.
Can you please help with these issues? If you have made videos on this, please share them.
I have been following your videos for the past 3 months. They are good and have helped so much… thanks
Mukesh Otwani says
Hi Pradeep,
I’ll post few more videos very soon…:)
preethi says
Hi Mukesh, please make a video on JMeter.
Mukesh Otwani says
Hi Preethi,
It's already in the pipeline. Please stay tuned…:)
richa shastri says
Hi, what is a good approach to handle stale element reference exceptions?
Mukesh Otwani says
Hi Richa,
Please refer to this link: http://learn-automation.com/how-to-solve-stale-element-reference-exception-in-selenium-webdriver/
Daniel says
Why is the method verifyLinkActive public? I give methods the smallest scope needed, which would in this case be private. Is it necessary that other classes can use this method too?
Mukesh Otwani says
Hi Daniel,
For the sake of simplicity I made it public. When you use it in a framework, you can apply the proper access specifier to each method.
Anshul Rajvanshi says
Hi Mukesh,
Few questions.
1) Opening Google in headless mode is working fine, but I am not able to open secure sites. Any solution to it, please?
2) We are using Selenium 3.X and Robot Framework. I got to know that the HtmlUnit driver is not part of the package.
Do we have to add the path of the HtmlUnit driver (downloaded separately) to the PATH environment variable?
Mukesh Otwani says
Hi Anshul,
Yes, you need to add htmlUnit driver separately to your project. Please check this link for more info
Hash says
Hi Mukesh, how can I verify whether a text is highlighted or not using Selenium WebDriver?
Mukesh Otwani says
Hi Hash,
Please use JavaScriptExecutor to verify the background/foreground color.
Partho Dutta says
Hi Mukesh
In my application I first have to log in, and then I have to find the broken links for the entire application. Could you please suggest how to do this?
Mukesh Otwani says
Hi Partho,
I can’t see any difficulty in the scenario you mention. Logging in is a usual and very generic activity, and once you have logged in, you can find all the broken links. If this is not what you meant, kindly elaborate on your requirement.
Sri Datta says
Hi Mukesh,
I'm getting a java.net.UnknownHostException while running the code. Is it because of a proxy issue? Please help.
Thanks
Sri Datta
Mukesh Otwani says
Yes, please check your proxy settings.
Vahe says
Hi Mukesh,
The videos and the blog you have are great. I’m new to automation and I’m learning a lot from you. Very informative and very easy to understand. This code is working great, too. I just have a question; maybe you can help me with it.
Is there a way to test only the links in the content of a website? Can we exclude the header and footer links? Do you think it is possible?
Thank you again!
Mukesh Otwani says
Hi Vahe,
Yes, it is possible to test only the links in the content of a website and leave out the header and footer.
Gursimran singh says
My URLConnection is not working. Can you please help me?
Mukesh Otwani says
Hi Gursimran,
Could you please explain this?
Saurabh says
Hello Mukesh,
Good Job Mukesh,
Mukesh, I am getting an ‘IndexOutOfBoundsException’ using this code if my web page has more than 300 links. Can you please suggest what I can do in this case?
Mukesh Otwani says
Hi Saurabh,
This is a Java exception that occurs when you refer to an index that does not exist in an array. You can instead use a List implementation such as ArrayList or LinkedList, whose length is dynamic. I hope this solves your problem.
Vimal says
Mukesh,
I’m very new to Selenium. I just came across your blogs and articles; they are quite amazing and very useful. I’m interested in doing the POC of my project in Selenium (currently we are using QTP). I have many doubts when comparing it with QTP. Can you help me with this? If I could get your email ID, it would be helpful for sharing my doubts.
Mukesh Otwani says
Hi Vimal,
Are you done with your POC?
Vimal says
Hi Mukesh,
When I try to find the broken links in my application, I’m getting the following error:
java.net.ConnectException: Connection refused: connect. Can you please help me with this?
Mukesh Otwani says
Hi Vimal,
This usually comes from a proxy, so try setting up the proxy and then run the same program.
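As a sketch of what setting up the proxy can look like with HttpURLConnection, the example below passes a java.net.Proxy to openConnection. The host proxy.example.com and port 8080 are placeholders, not real settings, and the class and method names are hypothetical. Setting the standard http.proxyHost and http.proxyPort system properties is another common route.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;

// Hypothetical example of routing the link check through an HTTP proxy.
class ProxyExample {

    // Host and port are placeholders; use the values from your own network settings.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    // Prepares a connection that will go through the given proxy.
    // Nothing is sent until a response is requested.
    static HttpURLConnection open(String address, Proxy proxy) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection(proxy);
        conn.setConnectTimeout(3000);
        return conn;
    }

    public static void main(String[] args) {
        Proxy proxy = httpProxy("proxy.example.com", 8080); // placeholder proxy
        try {
            HttpURLConnection conn = open("http://www.google.co.in/", proxy);
            System.out.println("Response code via proxy: " + conn.getResponseCode());
        } catch (IOException e) {
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```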
Vimal says
Hi Mukesh,
I have a doubt about the collection of link objects. When I fetch the list of links from the page using the "a" tag, I get null values for a few anchor tags in the collection. How do we skip them?
Mukesh Otwani says
Hi Vimal,
You can add one more condition: if the href is null, skip that link.
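A small sketch of such a condition is shown below. Besides null and empty values it also skips hrefs that do not start with http or https (for example mailto: or javascript: links); that extra rule is an assumption added here, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical filter applied to href values before checking them.
class HrefFilter {

    // True only for non-empty http/https addresses.
    static boolean isCheckable(String href) {
        if (href == null || href.trim().isEmpty()) {
            return false;
        }
        String lower = href.trim().toLowerCase();
        return lower.startsWith("http://") || lower.startsWith("https://");
    }

    // Returns only the hrefs that are worth sending a request to.
    static List<String> checkable(List<String> hrefs) {
        List<String> result = new ArrayList<>();
        for (String href : hrefs) {
            if (isCheckable(href)) {
                result.add(href);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
                "http://www.google.co.in/", null, "", "javascript:void(0)", "mailto:someone@example.com");
        System.out.println(checkable(hrefs));
    }
}
```

In the tutorial's loop this would amount to calling verifyLinkActive(url) only when the check passes.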
Aman says
Hi Mukesh,
very nice explanation
Q- Can the same code be used to find response code 404?
Aman says
Also, there is a separate library for HttpResponse from Apache. Can it be used? Which is easier, the Apache library or your method?
Mukesh Otwani says
Both are the same. I downloaded it separately, but it also comes with the Selenium bundle.
Mukesh Otwani says
Yes, we can do that 🙂
bhaskar says
Great work, great work!
Sir,
How do I use assertions after finding broken links and images?
Mukesh Otwani says
Hey Bhaskar,
Check the video below for assertions: https://www.youtube.com/watch?v=dvUPSnGQZCM&feature=autoshare
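Independently of the video, one possible pattern is to collect the status of every link first and fail once at the end if any link is broken. The sketch below uses a plain Java AssertionError so that it runs without a test framework; with TestNG, a call such as Assert.assertTrue(broken.isEmpty(), message) would play the same role. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: assert once after all links have been checked.
class BrokenLinkAssertion {

    // Collects every link whose status code is not 200.
    static List<String> brokenLinks(Map<String, Integer> statusByLink) {
        List<String> broken = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : statusByLink.entrySet()) {
            if (entry.getValue() != 200) {
                broken.add(entry.getKey() + " - " + entry.getValue());
            }
        }
        return broken;
    }

    // Fails with a message that lists all broken links at once.
    static void assertNoBrokenLinks(Map<String, Integer> statusByLink) {
        List<String> broken = brokenLinks(statusByLink);
        if (!broken.isEmpty()) {
            throw new AssertionError("Broken links found: " + broken);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> statusByLink = new LinkedHashMap<>();
        statusByLink.put("http://example.com/about", 200);
        statusByLink.put("http://example.com/missing", 404);
        try {
            assertNoBrokenLinks(statusByLink);
            System.out.println("All links OK");
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Asserting once at the end means a single broken link does not stop the remaining links from being checked.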
shreyas says
Hi Mukesh,
public static void verifyLinkActive(String linkUrl)
{
try
{
URL url = new URL(linkUrl);
What does linkUrl do?
Mukesh Otwani says
Hi Shreyas,
linkUrl is the link address passed to the method. Whenever you access a URL over the network, the URL class is used.
Naresh says
Hi Mukesh, thanks for this post. It is working fine for me. Keep posting this type of video.
Mukesh Otwani says
Hey Naresh,
Cheers 🙂
suhana says
How do I find broken images and links on all pages of a site?
Mukesh Otwani says
Hi Suhana,
Kindly use the src attribute instead of href, and the img tag. The rest of the code remains the same.
Shantosh says
Hi Mukesh,
Well explained. Can you please share a tutorial on how to find broken images on a web page?
Much thanks in advance
Mukesh Otwani says
Hi Shantosh,
Thanks. The same code will work; only change the getAttribute call to use the src attribute.
Jayshreekant says
It is checking the very first link; after that it does not check the other links!
Mukesh Otwani says
Set the proxy and run again.
Harshal says
Hi Mukesh,
When I run your code, I get a link count of 48.
The actual number of links printed in the console is 44.
If I run it using TestNG, removing the main method and setting a priority on both methods, I get this error:
org.testng.TestNGException:
Method verifyLinkActive requires 1 parameters but 0 were supplied in the @Test annotation.
Mukesh Otwani says
Hi Harshal,
Please supply the parameter to the method as well, for example through @Parameters or a @DataProvider.
Neha says
Well explained. Always been a fan of your tutorials. Great job!
Mukesh Otwani says
Thanks 🙂 Neha
c. says
I get this exception when I run your code in Eclipse:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/base/Function
at verifyBrokenLinks.main(verifyBrokenLinks.java:15)
Caused by: java.lang.ClassNotFoundException: com.google.common.base.Function
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
... 1 more
Mukesh Otwani says
Hi, the Selenium jars have not been added completely.
Abhishek Gupta says
Hi Mukesh,
Can you explain why there are discrepancies in the total number of links? The count always comes out higher.
Refer to the image: http://s15.postimg.org/hxsuih10b/screenshot_domain_date_time.png
Also, I believe the depth level here is limited to the visited page.
Mukesh Otwani says
Hi Abhishek,
It might be due to the a tag.
Nitin says
Hi Mukesh,
Thank you for listing this amazing stuff on your blog. I really love the way you explore things that are so genuine and are in for everyday use. Thank you once again!