Apache HttpClient response content length returns -1

PeterMoron · Sep 10, 2013

Why does the following code print -1? It seems that the request failed.

public static void main(String[] args)
{
    DefaultHttpClient httpClient = new DefaultHttpClient();
    HttpGet httpGet = new HttpGet("http://www.google.de");

    HttpResponse response;
    try
    {
        response = httpClient.execute(httpGet);
        HttpEntity entity = response.getEntity();
        EntityUtils.consume(entity);

        // Prints -1
        System.out.println(entity.getContentLength());
    }
    catch (ClientProtocolException e)
    {
        e.printStackTrace();
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
    finally
    {
        httpGet.releaseConnection();
    }
}

Also, is it possible to get the response as a String?

Answer

Sotirios Delimanolis · Sep 10, 2013

Try running

Header[] headers = response.getAllHeaders();
for (Header header : headers) {
    System.out.println(header);
}

It will print

Date: Tue, 10 Sep 2013 19:10:04 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=dad7e2356ddb3b7a:FF=0:TM=1378840204:LM=1378840204:S=vQcLzVPbOOTxfvL4; expires=Thu, 10-Sep-2015 19:10:04 GMT; path=/; domain=.google.de
Set-Cookie: NID=67=S11HcqAV454IGRGMRo-AJpxAPxClJeRs4DRkAJQ5vI3YBh4anN3qS0EVeiYX_4XDTGN-mY86xTBoJ3Ncca7eNSdtGjcaG31pbCOuqsZEQMWwKn-7-6Dnizx395snehdA; expires=Wed, 12-Mar-2014 19:10:04 GMT; path=/; domain=.google.de; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked

This is not a problem, and the request did not fail: the page you requested simply doesn't provide a Content-Length header in its response. As such, HttpEntity#getContentLength() returns -1, which means the length is not known in advance.

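EntityUtils has a number of methods, some of which return a String. For example, EntityUtils.toString(entity) reads the whole body and returns it as a String; call it in place of EntityUtils.consume(entity), since the content can only be read once.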


Running curl more recently produces

> curl --head http://www.google.de
HTTP/1.1 200 OK
Date: Fri, 03 Apr 2020 15:38:18 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2020-04-03-15; expires=Sun, 03-May-2020 15:38:18 GMT; path=/; domain=.google.de; Secure
Set-Cookie: NID=201=H8GdKY8_vE5Ehy6qSkmQru13HqdGEj2tvZUFqvTDAVBxFoL4POI0swPtfI45v1TBjrJuAAfbcNMUddniIf9HHituCAFwUqmUFMDwxDYK5qUlcWiB1A64OcGp6PTT6LKur2r_3z-ToSvLf8RZhKWdny6E8SaArMpkaOqUEWp4aoQ; expires=Sat, 03-Oct-2020 15:38:18 GMT; path=/; domain=.google.de; HttpOnly
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding

The headers contain a Transfer-Encoding value of chunked. With chunked encoding, the response body is sent as a series of "chunks", each preceded by its length in hexadecimal, and a chunk of length zero marks the end. An HTTP client uses those lengths to read the entire response.
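To make the chunk framing more concrete, here is a minimal sketch in plain Java with no HttpClient dependency. The class ChunkedBodyDecoder and its methods are hypothetical names made up for this illustration; HttpClient does this decoding internally, so you would not normally write it yourself. The sketch assumes ASCII sample data, ignores chunk extensions, and discards trailer headers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Apache HttpClient.
class ChunkedBodyDecoder {

    // Reads one CRLF-terminated line and returns it without the CRLF.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\r') {
                if (in.read() != '\n') {
                    throw new IOException("Expected LF after CR");
                }
                return line.toString("US-ASCII");
            }
            line.write(b);
        }
        throw new EOFException("Stream ended inside a line");
    }

    // Decodes a chunked body: <hex size> CRLF <data> CRLF ... 0 CRLF [trailers] CRLF
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            // Drop any chunk extension that follows a ';'.
            int semicolon = sizeLine.indexOf(';');
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) {
                break; // the zero-length chunk terminates the body
            }
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(chunk, read, size - read);
                if (n == -1) {
                    throw new EOFException("Stream ended inside a chunk");
                }
                read += n;
            }
            body.write(chunk, 0, size);
            if (!readLine(in).isEmpty()) {
                throw new IOException("Missing CRLF after chunk data");
            }
        }
        // Skip optional trailer headers up to the terminating empty line.
        while (!readLine(in).isEmpty()) {
            // trailers are discarded in this sketch
        }
        return body.toByteArray();
    }

    // Convenience wrapper for ASCII sample data.
    static String decodeString(String wire) {
        try {
            byte[] raw = wire.getBytes(StandardCharsets.US_ASCII);
            return new String(decode(new ByteArrayInputStream(raw)), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two chunks of 4 and 5 bytes, followed by the terminating chunk.
        String body = decodeString("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n");
        System.out.println(body + " (" + body.length() + " bytes)");
    }
}
```

Running main on the sample data prints Wikipedia (9 bytes). The total of 9 bytes is only known once the last chunk has been read, which is why a chunked response cannot report its length up front.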

The HTTP Specification states that the Content-Length header should not be present when Transfer-Encoding has a value of chunked and MUST be ignored if it is.
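
The rule above can be sketched as a small decision function. This is only an illustration under stated assumptions, not HttpClient's actual implementation: the class BodyLengthResolver, its resolve method, and the use of a plain Map for headers are all hypothetical. It returns -1 for "length unknown", in the same spirit as HttpEntity#getContentLength().

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of Apache HttpClient.
class BodyLengthResolver {

    static final long UNKNOWN = -1L;

    // Returns the declared body length, or -1 if it is not known in advance.
    static long resolve(Map<String, String> headers) {
        // Header names are case-insensitive, so copy into a case-insensitive map.
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        // With a chunked transfer encoding, any Content-Length is ignored.
        String transferEncoding = h.get("Transfer-Encoding");
        if (transferEncoding != null
                && transferEncoding.toLowerCase(Locale.ROOT).contains("chunked")) {
            return UNKNOWN;
        }

        String contentLength = h.get("Content-Length");
        if (contentLength == null) {
            return UNKNOWN;
        }
        try {
            long length = Long.parseLong(contentLength.trim());
            return length >= 0 ? length : UNKNOWN;
        } catch (NumberFormatException e) {
            // A malformed value is treated as unknown in this sketch.
            return UNKNOWN;
        }
    }
}
```

For the headers shown above (Transfer-Encoding: chunked and no Content-Length), resolve returns -1, and it would still return -1 if a Content-Length header were also present.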