How to prevent hangs on SocketInputStream.socketRead0 in Java?

Piotr Müller · Feb 28, 2015 · Viewed 40.3k times

Performing millions of HTTP requests with different Java libraries leaves me with threads hung in:

java.net.SocketInputStream.socketRead0()

which is a native function.

I tried to set up Apache HttpClient and RequestConfig with timeouts on (I hope) everything possible, but I still get (probably infinite) hangs in socketRead0. How can I get rid of them?

The hang ratio is about 1 per 10,000 requests (to 10,000 different hosts), and a hang can probably last forever (I've confirmed that a thread was still hung after 10 hours).

JDK 1.8 on Windows 7.

My HttpClient factory:

    SocketConfig socketConfig = SocketConfig.custom()
            .setSoKeepAlive(false)
            .setSoLinger(1)
            .setSoReuseAddress(true)
            .setSoTimeout(5000)
            .setTcpNoDelay(true).build();

    HttpClientBuilder builder = HttpClientBuilder.create();
    builder.disableAutomaticRetries();
    builder.disableContentCompression();
    builder.disableCookieManagement();
    builder.disableRedirectHandling();
    builder.setConnectionReuseStrategy(new NoConnectionReuseStrategy());
    builder.setDefaultSocketConfig(socketConfig);

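    // Build from the configured builder; a fresh HttpClientBuilder.create()
    // would silently discard every setting applied above.
    return builder.build();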

My RequestConfig factory:

    HttpGet request = new HttpGet(url);

    RequestConfig config = RequestConfig.custom()
            .setCircularRedirectsAllowed(false)
            .setConnectionRequestTimeout(8000)
            .setConnectTimeout(4000)
            .setMaxRedirects(1)
            .setRedirectsEnabled(true)
            .setSocketTimeout(5000)
            .setStaleConnectionCheckEnabled(true).build();
    request.setConfig(config);

    // Return the request that carries the config, not a new unconfigured one.
    return request;

OpenJDK socketRead0 source

Note: I do have a "trick": I can schedule .getConnectionManager().shutdown() in another thread and cancel that Future if the request finishes properly. However, the method is deprecated, and it kills the whole HttpClient, not just that single request.
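
The same watchdog idea can be scoped to a single connection instead of the whole client. Below is a minimal sketch at the plain-socket level, using only the JDK: a scheduled task closes the socket once a hard deadline passes, which makes a blocked read() throw a SocketException. The class name HardDeadlineRead, the helper readWithDeadline, and the 500 ms deadline are hypothetical names and values chosen for illustration; the silent local server only exists to simulate a peer that never answers.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class HardDeadlineRead {

    // Hypothetical helper: reads one byte, but never waits longer than deadlineMillis.
    // Returns the byte read (or -1 at end of stream); throws IOException if the
    // watchdog had to close the socket while the read was still blocked.
    static int readWithDeadline(Socket socket, ScheduledExecutorService watchdog,
                                long deadlineMillis) throws IOException {
        ScheduledFuture<?> killer = watchdog.schedule(() -> {
            try {
                // Closing the socket from another thread unblocks a pending read().
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if close itself fails.
            }
        }, deadlineMillis, TimeUnit.MILLISECONDS);
        try {
            return socket.getInputStream().read();
        } finally {
            // The read finished (or failed) on its own, so the watchdog is no longer needed.
            killer.cancel(false);
        }
    }

    // Demo against a local server that accepts the TCP handshake and never sends data.
    static String demo() throws IOException {
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                        silentServer.getLocalPort())) {
            // No SO_TIMEOUT is set here, so only the watchdog can end this read.
            readWithDeadline(client, watchdog, 500);
            return "read returned";
        } catch (IOException e) {
            return "aborted";
        } finally {
            watchdog.shutdownNow();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

With Apache HttpClient 4.x the analogous per-request step would presumably be calling abort() on the HttpGet from the watchdog thread rather than closing a raw socket. Note that the sketch closes the connection for good; it is a last-resort deadline, not a replacement for SO_TIMEOUT.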

Answer

Trevor Robinson · Sep 18, 2015

Though this question mentions Windows, I have the same problem on Linux. It appears there is a flaw in the way the JVM implements blocking socket timeouts.

To summarize, the timeout for blocking sockets is implemented by calling poll on Linux (and select on Windows) to determine whether data is available before calling recv. However, at least on Linux, both calls can spuriously indicate that data is available when it is not, which leads to recv blocking indefinitely.

From poll(2) man page BUGS section:

See the discussion of spurious readiness notifications under the BUGS section of select(2).

From select(2) man page BUGS section:

Under Linux, select() may report a socket file descriptor as "ready for reading", while nevertheless a subsequent read blocks. This could for example happen when data has arrived but upon examination has wrong checksum and is discarded. There may be other circumstances in which a file descriptor is spuriously reported as ready. Thus it may be safer to use O_NONBLOCK on sockets that should not block.
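
In Java, the closest equivalent of the O_NONBLOCK advice is a non-blocking SocketChannel driven by a Selector. The following is a minimal sketch of that approach, not what HttpClient does internally: the class NonBlockingRead, the helper readWithTimeout, and the 300 ms timeout are hypothetical, and the silent local server only simulates a peer that never sends data. Because the channel is non-blocking, read() returns 0 rather than blocking, so a spurious readiness notification costs one extra loop iteration instead of an indefinite hang.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;

public class NonBlockingRead {

    // Hypothetical helper: waits up to timeoutMillis for data on a non-blocking channel.
    // Returns the number of bytes read, or -1 at end of stream;
    // throws SocketTimeoutException if nothing arrives before the deadline.
    static int readWithTimeout(SocketChannel channel, ByteBuffer buffer, long timeoutMillis)
            throws IOException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        // One selector per call keeps the sketch short; real code would reuse it.
        try (Selector selector = Selector.open()) {
            channel.register(selector, SelectionKey.OP_READ);
            while (true) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    throw new SocketTimeoutException("no data within " + timeoutMillis + " ms");
                }
                selector.select(remainingMillis);
                selector.selectedKeys().clear();
                // Non-blocking read: returns 0 immediately if no data is actually there.
                int n = channel.read(buffer);
                if (n != 0) {
                    return n;
                }
            }
        }
    }

    // Demo against a local server that accepts the TCP handshake and never sends data.
    static String demo() throws IOException {
        try (ServerSocket silentServer = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             SocketChannel channel = SocketChannel.open(new InetSocketAddress(
                     InetAddress.getLoopbackAddress(), silentServer.getLocalPort()))) {
            channel.configureBlocking(false); // required before registering with a selector
            readWithTimeout(channel, ByteBuffer.allocate(1024), 300);
            return "read returned";
        } catch (SocketTimeoutException e) {
            return "timed out";
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

The deadline is checked against System.nanoTime() on every iteration, so the total wait is bounded even if select() wakes up repeatedly without data.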

The Apache HTTP Client code is a bit hard to follow, but it appears that connection expiration is only set for HTTP keep-alive connections (which you've disabled) and is indefinite unless the server specifies otherwise. Therefore, as pointed out by oleg, the Connection eviction policy approach won't work in your case and can't be relied upon in general.