URLConnection FileNotFoundException for non-standard HTTP port sources

jeffl8n · Jun 2, 2009 · Viewed 51.4k times

I was trying to use the Apache Ant Get task to fetch a list of WSDLs generated by another team in our company. They are hosted on a WebLogic 9.x server at http://....com:7925/services/. I can reach the page in a browser, but the Get task throws a FileNotFoundException when it tries to copy the page to a local file for parsing. Using the same Ant task, I can still fetch a URL that is served on the standard HTTP port 80 rather than on a non-standard port.

I looked through the Ant source code and narrowed the error down to URLConnection. It seems as though URLConnection doesn't recognize the data as HTTP traffic because it isn't on the standard port, even though the protocol is specified as HTTP. I sniffed the traffic with Wireshark and the page loads correctly across the wire, but the call still throws the FileNotFoundException.

Here's an example where you will see the error (with the URL changed to protect the innocent). The error is thrown on connection.getInputStream();

import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public class TestGet {
    private static URL source;

    public static void main(String[] args) {
        doGet();
    }

    public static void doGet() {
        try {
            source = new URL("http", "test.com", 7925,
                    "/services/index.html");
            URLConnection connection = source.openConnection();
            connection.connect();
            // The FileNotFoundException is thrown on the next line.
            InputStream is = connection.getInputStream();
        } catch (Exception e) {
            System.err.println(e.toString());
        }
    }
}
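One way to test the theory that the port changes how the URL is handled is to inspect the URL and the connection object before any traffic is sent. The sketch below is only an illustration: the class name PortCheck and the method usesHttpHandler are made up for this example, and test.com is the same placeholder host as above. It relies on URL#openConnection() not performing any network I/O, so it can run offline.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper class, not part of Ant or of the original example.
class PortCheck {

    // Returns true if the URL's protocol handler produced an HttpURLConnection.
    // openConnection() only creates the object; nothing is sent until connect().
    static boolean usesHttpHandler(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        return connection instanceof HttpURLConnection;
    }

    public static void main(String[] args) throws IOException {
        URL source = new URL("http", "test.com", 7925, "/services/index.html");
        System.out.println("port = " + source.getPort());                // explicit port from the URL
        System.out.println("default port = " + source.getDefaultPort()); // default port for the scheme
        System.out.println("HTTP handler = " + usesHttpHandler(source));
    }
}
```

If the handler is reported as HTTP even with port 7925, the handler is being selected from the scheme, and the next thing worth checking is the status code the server actually returns.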

Answer

bcody · Nov 9, 2010

The response to my HTTP request returned with a status code 404, which resulted in a FileNotFoundException when I called getInputStream(). I still wanted to read the response body, so I had to use a different method: HttpURLConnection#getErrorStream().

Here's a JavaDoc snippet of getErrorStream():

Returns the error stream if the connection failed but the server sent useful data nonetheless. The typical example is when an HTTP server responds with a 404, which will cause a FileNotFoundException to be thrown in connect, but the server sent an HTML help page with suggestions as to what to do.

Usage example:

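import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.commons.io.IOUtils; //Apache Commons IO

public static String httpGet(String url) {
    HttpURLConnection con = null;
    InputStream is = null;
    try {
        con = (HttpURLConnection) new URL(url).openConnection();
        con.connect();

        //4xx: client error, 5xx: server error. See: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html.
        boolean isError = con.getResponseCode() >= 400;
        //In HTTP error cases, HttpURLConnection only gives you the response body via #getErrorStream().
        is = isError ? con.getErrorStream() : con.getInputStream();
        if (is == null) {
            return ""; //getErrorStream() returns null when the server sent no error body.
        }

        //The charset is a parameter of Content-Type (e.g. "text/html; charset=ISO-8859-1").
        //Content-Encoding names a compression scheme such as gzip, not a charset.
        return IOUtils.toString(is, charsetOf(con.getContentType()));
    } catch (Exception e) {
        throw new IllegalStateException(e);
    } finally {
        //Note: Closing the InputStream manually may be unnecessary, depending on the implementation of HttpURLConnection#disconnect(). Sun/Oracle's implementation does close it for you in said method.
        if (is != null) {
            try {
                is.close();
            } catch (IOException e) {
                throw new IllegalStateException(e);
            }
        }
        if (con != null) {
            con.disconnect();
        }
    }
}

//Extracts the charset parameter from a Content-Type value, falling back to UTF-8.
private static String charsetOf(String contentType) {
    if (contentType != null) {
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase().startsWith("charset=")) {
                return p.substring("charset=".length()).replace("\"", "").trim();
            }
        }
    }
    return "UTF-8";
}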
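
To see the behaviour without touching a real server, here is a minimal sketch that only needs the JDK. It starts a throwaway local server on a free (non-standard) port using the JDK's built-in com.sun.net.httpserver.HttpServer, has it answer every request with a 404 and a short body, and then compares getInputStream() with getErrorStream(). The class name ErrorStreamDemo, its method names, and the "no such page" body are all made up for illustration; it is not meant as a replacement for the method above.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from the posts above.
class ErrorStreamDemo {

    // Starts a local server on a free port that answers every request with 404 and a short body.
    static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            byte[] body = "no such page".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(404, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // Reports whether getInputStream() fails with FileNotFoundException for this URL.
    static boolean inputStreamThrowsFileNotFound(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        try (InputStream in = con.getInputStream()) {
            return false;
        } catch (FileNotFoundException e) {
            return true;
        } finally {
            con.disconnect();
        }
    }

    // Reads the body of an error response (status >= 400) through getErrorStream().
    static String readErrorBody(URL url) throws IOException {
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        try {
            InputStream err = con.getResponseCode() >= 400 ? con.getErrorStream() : null;
            if (err == null) {
                return ""; // not an error response, or the server sent no body
            }
            try (InputStream in = err) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            URL url = new URL("http", "127.0.0.1", server.getAddress().getPort(), "/services/index.html");
            System.out.println("FileNotFoundException: " + inputStreamThrowsFileNotFound(url));
            System.out.println("Error body: " + readErrorBody(url));
        } finally {
            server.stop(0);
        }
    }
}
```

With the Sun/Oracle and OpenJDK implementation of HttpURLConnection, I would expect the first call to report the FileNotFoundException and the second to return the body the server sent with the 404. That points at the status code, not the port number, as the trigger for the exception.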