javax.net.ssl.SSLException: Read error: ssl=0x9524b800: I/O error during system call, Connection reset by peer

Rickster · May 29, 2015

Over the last couple of weeks our clients have started seeing hundreds of these "SSLException: Connection reset by peer" errors, and I can't figure out why.

  1. We're using Retrofit with OkHttp, with no special configuration:

    public class OkHttpClientProvider implements IOkHttpClientProvider {
    
        OkHttpClient okHttpClient;
    
        public OkHttpClientProvider() {
            this.okHttpClient = createClient();
        }
    
        public OkHttpClient getOkHttpClient() {
            return this.okHttpClient;
        }
    
        private OkHttpClient createClient() {
            return new OkHttpClient();
        }
    }
    

The client provider above is a singleton. The RestAdapter is built with this injected client (we use Dagger):

    // Build the adapter on top of the shared OkHttpClient from the provider.
    RestAdapter.Builder restAdapterBuilder = new RestAdapter.Builder()
            .setConverter(converter)
            .setEndpoint(networkRequestDetails.getServerUrl())
            .setClient(new OkClient(okHttpClientProvider.getOkHttpClient()))
            .setErrorHandler(new NetworkSynchronousErrorHandler(eventBus));

Based on Stack Overflow answers, here is what I've found out so far:

  1. The keep-alive duration on the server is 180 seconds; OkHttp defaults to 300 seconds

  2. The server returns "Connection: close" in its response headers, but the client request sends "Connection: Keep-Alive"

  3. The server supports TLS 1.0 / 1.1 / 1.2 and uses OpenSSL

  4. Our servers recently moved to another hosting provider in a different geography, so I don't know whether these are DNS failures or not

  5. We've tried tweaking settings such as keep-alive and reconfiguring OpenSSL on the server, but the Android client keeps getting this error

  6. It happens immediately when you use the app to post something or pull to refresh. There is no delay and the request doesn't even seem to reach the network before the exception is thrown, which would imply the connection is already broken. Trying multiple times somehow "fixes it" and we get a success, but it happens again later

  7. We've invalidated our DNS entries on the server to see whether this is what caused it, but that hasn't helped

  8. It mostly happens on LTE, but I've seen it on Wi-Fi as well
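
Items 1 and 6 may be connected. As a rough sketch, assuming the server really drops idle connections after 180 seconds and the client pool keeps them for up to 300 seconds, the Java below checks whether a pooled connection sits in the window where the client would still reuse it although the server has probably closed it. The class and method names are hypothetical and are not part of OkHttp.

```java
final class KeepAliveWindow {
    // Values reported in item 1 of the list above, in seconds.
    static final long SERVER_KEEP_ALIVE_SECONDS = 180;
    static final long CLIENT_KEEP_ALIVE_SECONDS = 300;

    /**
     * Returns true when a connection has been idle long enough for the server
     * to have closed it, but not long enough for the client pool to evict it.
     */
    static boolean isLikelyStale(long idleSeconds, long serverKeepAlive, long clientKeepAlive) {
        return idleSeconds >= serverKeepAlive && idleSeconds < clientKeepAlive;
    }

    /**
     * Hypothetical rule of thumb: keep the client's keep-alive below the
     * server's by a safety margin so the client gives up the connection first.
     */
    static long suggestedClientKeepAlive(long serverKeepAlive, long safetyMarginSeconds) {
        return Math.max(0, serverKeepAlive - safetyMarginSeconds);
    }
}
```

With the reported values, a connection that has been idle for 200 seconds falls inside that window, which would be consistent with a failure that shows up immediately on the next request.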

I don't want to disable keep-alive because most modern clients leave it enabled. We're also using OkHttp 2.4, and the problem occurs on post-Ice Cream Sandwich devices, so I was hoping OkHttp would take care of these underlying networking issues. The iOS client (which uses AFNetworking 2.0) also gets these exceptions, but close to 100 times less often. I'm struggling to find new things to try at this point. Any help or ideas?

Update: adding the full stack trace through OkHttp

      retrofit.RetrofitError: Read error: ssl=0x9dd07200: I/O error during system call, Connection reset by peer
              at retrofit.RestAdapter$RestHandler.invokeRequest(RestAdapter.java:390)
              at retrofit.RestAdapter$RestHandler.invoke(RestAdapter.java:240)
              at java.lang.reflect.Proxy.invoke(Proxy.java:397)
              at $Proxy15.getAccessTokenUsingResourceOwnerPasswordCredentials(Unknown Source)
              at com.company.droid.repository.network.NetworkRepository.getAccessTokenUsingResourceOwnerPasswordCredentials(NetworkRepository.java:76)
              at com.company.droid.ui.login.LoginTask.doInBackground(LoginTask.java:88)
              at com.company.droid.ui.login.LoginTask.doInBackground(LoginTask.java:23)
              at android.os.AsyncTask$2.call(AsyncTask.java:292)
              at java.util.concurrent.FutureTask.run(FutureTask.java:237)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
              at java.lang.Thread.run(Thread.java:818)
       Caused by: javax.net.ssl.SSLException: Read error: ssl=0x9dd07200: I/O error during system call, Connection reset by peer
              at com.android.org.conscrypt.NativeCrypto.SSL_read(Native Method)
              at com.android.org.conscrypt.OpenSSLSocketImpl$SSLInputStream.read(OpenSSLSocketImpl.java:699)
              at okio.Okio$2.read(Okio.java:137)
              at okio.AsyncTimeout$2.read(AsyncTimeout.java:211)
              at okio.RealBufferedSource.indexOf(RealBufferedSource.java:306)
              at okio.RealBufferedSource.indexOf(RealBufferedSource.java:300)
              at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:196)
              at com.squareup.okhttp.internal.http.HttpConnection.readResponse(HttpConnection.java:191)
              at com.squareup.okhttp.internal.http.HttpTransport.readResponseHeaders(HttpTransport.java:80)
              at com.squareup.okhttp.internal.http.HttpEngine.readNetworkResponse(HttpEngine.java:917)
              at com.squareup.okhttp.internal.http.HttpEngine.readResponse(HttpEngine.java:793)
              at com.squareup.okhttp.internal.huc.HttpURLConnectionImpl.execute(HttpURLConnectionImpl.java:439)
              at com.squareup.okhttp.internal.huc.HttpURLConnectionImpl.getResponse(HttpURLConnectionImpl.java:384)
              at com.squareup.okhttp.internal.huc.HttpURLConnectionImpl.getResponseCode(HttpURLConnectionImpl.java:497)
              at com.squareup.okhttp.internal.huc.DelegatingHttpsURLConnection.getResponseCode(DelegatingHttpsURLConnection.java:105)
              at com.squareup.okhttp.internal.huc.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:25)
              at retrofit.client.UrlConnectionClient.readResponse(UrlConnectionClient.java:73)
              at retrofit.client.UrlConnectionClient.execute(UrlConnectionClient.java:38)
              at retrofit.RestAdapter$RestHandler.invokeRequest(RestAdapter.java:321)
              at retrofit.RestAdapter$RestHandler.invoke(RestAdapter.java:240)
              at java.lang.reflect.Proxy.invoke(Proxy.java:397)
              at $Proxy15.getAccessTokenUsingResourceOwnerPasswordCredentials(Unknown Source)
              at com.company.droid.repository.network.NetworkRepository.getAccessTokenUsingResourceOwnerPasswordCredentials(NetworkRepository.java:76)
              at com.company.droid.ui.login.LoginTask.doInBackground(LoginTask.java:88)
              at com.company.droid.ui.login.LoginTask.doInBackground(LoginTask.java:23)
              at android.os.AsyncTask$2.call(AsyncTask.java:292)
              at java.util.concurrent.FutureTask.run(FutureTask.java:237)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
              at java.lang.Thread.run(Thread.java:818)

Answer

Devendra Vaja · Jul 10, 2015

I recently faced this issue while working on some legacy code. After googling, I found that the issue is reported everywhere but without any concrete resolution. I broke the exception message into its parts and analyzed each one below.

Analysis:

  1. SSLException: the exception was raised in the SSL (Secure Sockets Layer) code, whose API lives in the javax.net.ssl package of the JDK (OpenJDK, Oracle JDK, Android SDK).
  2. Read error: ssl=# I/O error during system call: an error occurred while reading from the secure socket, inside a call into the native system libraries or drivers. Note that every platform (Solaris, Windows, and so on) has its own socket library that the SSL layer uses. Windows uses the Winsock library.
  3. Connection reset by peer: this message is reported by the system library (Solaris reports ECONNRESET, Windows reports WSAECONNRESET). It means the socket used for the data transfer is no longer usable because the existing connection was forcibly closed by the remote host. A new secure connection has to be established between the client and the host.
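
The three parts above can also be checked in code. The sketch below is one possible approach and not something this answer prescribes: it walks the cause chain (Retrofit wraps the SSLException in a RetrofitError) looking for the reset message, then retries the call so that a new connection can be established. ResetRetry, isConnectionReset and callWithRetry are hypothetical names, and matching on the message text is an assumption based on the stack trace in the question.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class ResetRetry {
    /** Walks the cause chain looking for an I/O error that mentions a connection reset. */
    static boolean isConnectionReset(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            String message = t.getMessage();
            // SSLException extends IOException, so this also covers plain socket resets.
            if (t instanceof IOException && message != null && message.contains("Connection reset")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the request and retries only when the failure looks like a reset.
     * Only safe for idempotent requests: a retried POST may be applied twice.
     */
    static <T> T callWithRetry(Callable<T> request, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (Exception e) {
                if (!isConnectionReset(e)) {
                    throw e; // unrelated failure, do not hide it
                }
                last = e; // reset: loop again so a fresh connection is used
            }
        }
        throw last;
    }
}
```

A caller would wrap the network call in a Callable and pass a small attempt limit such as 2 or 3. Any other kind of failure is rethrown unchanged on the first attempt.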

Reason:

Having understood the issue, I looked for the reasons behind the connection reset and came up with the following:

  • The peer application on the remote host stopped suddenly, the host was rebooted, the host or remote network interface was disabled, or the remote host used a hard close.
  • This error may also result if a connection was broken because keep-alive activity detected a failure while one or more operations were in progress. Operations that were in progress fail with "Network dropped connection on reset" (WSAENETRESET on Windows), and subsequent operations fail with "Connection reset by peer" (WSAECONNRESET on Windows).
  • If the target server is protected by a firewall, which is true in most cases, the Time to Live (TTL) or timeout associated with the port forcibly closes an idle connection once that timeout elapses. This is the case of interest here.
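
The "hard close" case in the first bullet can be reproduced locally. This is a minimal sketch over plain TCP on the loopback interface, with no SSL involved: the accepting side sets SO_LINGER to zero so that close() aborts the connection instead of shutting it down gracefully, and the client's next read is expected to fail with a SocketException. HardCloseDemo is a hypothetical name, and the exact exception message depends on the platform.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

final class HardCloseDemo {
    /** Returns true if the client observed a reset after the peer's abortive close. */
    static boolean readAfterHardClose() throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(5000); // avoid hanging forever if no reset arrives

            Socket peer = server.accept();
            peer.setSoLinger(true, 0); // linger 0: close() sends RST instead of FIN
            peer.close();

            try {
                client.getInputStream().read();
                return false; // a graceful close would return -1 here instead
            } catch (SocketException reset) {
                return true; // typically reported as "Connection reset"
            }
        }
    }
}
```

When the same thing happens underneath an SSL socket, the SSL layer reports the failed read as the SSLException shown in the question.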

Resolution:

  1. Server-side events such as a sudden service stop, a reboot, or a disabled network interface cannot be handled by any means.
  2. On the server side, configure the firewall for the given port with a higher Time to Live (TTL) or timeout value, such as 3600 seconds.
  3. Clients can "try" to keep the network active to avoid or reduce Connection reset by peer errors.
  4. Normally, ongoing network traffic keeps the connection alive and the exception is not seen frequently. A strong Wi-Fi connection has the lowest chance of Connection reset by peer.
  5. On 2G, 3G, and 4G mobile networks, packet data delivery is intermittent and depends on network availability, so traffic may not reset the TTL timer on the server side, which results in Connection reset by peer.
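
Item 3 could look roughly like the following, assuming the firewall idle timeout is known (for example the 3600 seconds from item 2) and that the app supplies some cheap request to use as a ping. IdleHeartbeat and the choice of half the timeout as the interval are illustrative, not a recommendation taken from any library.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class IdleHeartbeat {
    /** Picks an interval comfortably below the idle timeout (here: half of it). */
    static long heartbeatIntervalMillis(long idleTimeoutMillis) {
        if (idleTimeoutMillis <= 0) {
            throw new IllegalArgumentException("idle timeout must be positive");
        }
        return Math.max(1, idleTimeoutMillis / 2);
    }

    /**
     * Schedules the caller-supplied ping at that interval. The ping itself is
     * an assumption: it should be a lightweight request on the same connection.
     */
    static ScheduledFuture<?> start(ScheduledExecutorService scheduler,
                                    Runnable ping,
                                    long idleTimeoutMillis) {
        long interval = heartbeatIntervalMillis(idleTimeoutMillis);
        return scheduler.scheduleAtFixedRate(ping, interval, interval, TimeUnit.MILLISECONDS);
    }
}
```

The returned ScheduledFuture can be cancelled when the app goes to the background. Periodic traffic costs battery and data on mobile, so this is a trade-off rather than a free fix.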

Here are the settings suggested on various forums to resolve the issue:

  • ConnectionTimeout: used only while establishing the connection. If the host is slow to accept the connection, a higher value makes the client wait longer for it.