Java - Quickest way to check if URL exists

Matt9Atkins · Aug 8, 2013 · Viewed 27.8k times

Hi, I am writing a program that goes through many different URLs and checks whether each one exists. Basically, I check whether the response code returned is 404. However, since I am checking over 1000 URLs, I want to do this as quickly as possible. My code is below; how can I modify it to run faster (if that is possible)?

final URL url = new URL("http://www.example.com");
HttpURLConnection huc = (HttpURLConnection) url.openConnection();
int responseCode = huc.getResponseCode();

if (responseCode != 404) {
    System.out.println("GOOD");
} else {
    System.out.println("BAD");
}

Would it be quicker to use JSoup?

I am aware that some sites return code 200 and serve their own error page. However, I know the links I am checking don't do this, so handling that case is not needed.

Answer

Vishnuprasad R · Aug 8, 2013

Try sending a "HEAD" request instead of a GET request. That should be faster, since the response body is not downloaded.

huc.setRequestMethod("HEAD");

Also, instead of checking that the response status is not 404, check that it is 200. That is, test for the positive case rather than the negative one. 404, 403, 402 and the other 4xx statuses all mean, for practical purposes, that the URL is invalid or does not exist.
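Putting the two suggestions together, a minimal sketch could look like the following. The class name UrlChecker, the helpers isOk and exists, and the 5-second timeout are my own assumptions for illustration and are not part of the question's code. Any I/O failure (bad URL, unreachable host, timeout) is simply treated as "does not exist".

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class, named here only for illustration.
class UrlChecker {

    // Assumed timeout in milliseconds; tune it for your network.
    private static final int TIMEOUT_MS = 5000;

    // Positive check: only a 200 response counts as "exists".
    static boolean isOk(int responseCode) {
        return responseCode == HttpURLConnection.HTTP_OK;
    }

    // Sends a HEAD request and reports whether the server answered 200.
    static boolean exists(String address) {
        HttpURLConnection huc = null;
        try {
            URLConnection conn = new URL(address).openConnection();
            if (!(conn instanceof HttpURLConnection)) {
                return false; // not an http/https URL
            }
            huc = (HttpURLConnection) conn;
            huc.setRequestMethod("HEAD"); // headers only, no response body
            huc.setConnectTimeout(TIMEOUT_MS);
            huc.setReadTimeout(TIMEOUT_MS);
            return isOk(huc.getResponseCode());
        } catch (IOException e) {
            // Malformed URL, unreachable host, timeout, and so on.
            return false;
        } finally {
            if (huc != null) {
                huc.disconnect();
            }
        }
    }

    // Usage: java UrlChecker http://www.example.com ...
    public static void main(String[] args) {
        for (String address : args) {
            System.out.println(address + " " + (exists(address) ? "GOOD" : "BAD"));
        }
    }
}
```

Setting explicit timeouts matters when checking many URLs, because a single unresponsive host would otherwise stall the whole run.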

You can also use multi-threading to make it even faster, since the checks are independent of one another.
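One way to sketch the multi-threading idea is a fixed-size thread pool that runs one check per URL. Everything named here (ParallelUrlCheck, checkAll, headIs200, the pool size of 20 and the 5-second timeout) is an assumption for illustration. The checker is passed in as a Predicate so that the threading logic stays separate from the HTTP call.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical class, named here only for illustration.
class ParallelUrlCheck {

    // Runs checker on every address using a pool of `threads` workers.
    // Results keep the input order; duplicate addresses collapse to one entry.
    static Map<String, Boolean> checkAll(List<String> addresses,
                                         Predicate<String> checker,
                                         int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String address : addresses) {
                Callable<Boolean> task = () -> checker.test(address);
                futures.add(pool.submit(task));
            }
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (int i = 0; i < addresses.size(); i++) {
                boolean ok;
                try {
                    ok = futures.get(i).get(); // wait for this check to finish
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    ok = false;
                } catch (ExecutionException e) {
                    ok = false; // the checker threw; count it as a failure
                }
                results.put(addresses.get(i), ok);
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    // HEAD request that returns true only for a 200 response.
    static boolean headIs200(String address) {
        try {
            URLConnection conn = new URL(address).openConnection();
            if (!(conn instanceof HttpURLConnection)) {
                return false;
            }
            HttpURLConnection huc = (HttpURLConnection) conn;
            huc.setRequestMethod("HEAD");
            huc.setConnectTimeout(5000); // assumed timeouts
            huc.setReadTimeout(5000);
            try {
                return huc.getResponseCode() == HttpURLConnection.HTTP_OK;
            } finally {
                huc.disconnect();
            }
        } catch (IOException e) {
            return false;
        }
    }

    // Usage: java ParallelUrlCheck http://www.example.com ...
    public static void main(String[] args) {
        Map<String, Boolean> results =
                checkAll(Arrays.asList(args), ParallelUrlCheck::headIs200, 20);
        results.forEach((address, ok) ->
                System.out.println(address + " " + (ok ? "GOOD" : "BAD")));
    }
}
```

The work is dominated by waiting on the network, so a pool larger than the number of CPU cores is usually reasonable. The right size depends on your connection and on how many of the URLs point at the same host.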