Timeouts connecting to a Postgres database on Amazon RDS from Azure

Cleber Goncalves picture Cleber Goncalves · Mar 27, 2014 · Viewed 17.1k times · Source

I get the following exception in my application after leaving a database connection idle for some amount of time:

... An I/O error occured while sending to the backend.; nested exception is org.postgresql.util.PSQLException: An I/O error occured while sending to the backend.] with root cause

    java.net.SocketException: Operation timed out
        at java.net.SocketInputStream.socketRead0(Native Method)

The same issue happens in psql AND I don't have issues connecting to a local database, so I'm pretty sure the problem is on RDS.

psql=> select 'ok';
SSL SYSCALL error: Operation timed out
psql=> select 'ok';
SSL SYSCALL error: EOF detected
The connection to the server was lost. Attempting reset: Succeeded.

I found this other question which suggests a work around that improved the situation (timeouts now take a lot longer) but didn't fix it.

I'm using Spring Boot with JDBC (tomcat connection pooling) and JDBCTemplate.

Is there a work around or a fix? Perhaps forcing the connection pool to test and reconnect? How do I do that in this environment?

EDIT: This is my connection string

jdbc:postgresql://myhost.c2estvxozjm3.eu-west-1.rds.amazonaws.com/dashboard?tcpKeepAlive=true

SOLUTION:

Edited the RDS server side TCP_KeepAlive parameters as suggested in the selected answer. The parameters I'm using are:

tcp_keepalives_count      5
tcp_keepalives_idle     200
tcp_keepalives_interval 200

Answer

Craig Ringer picture Craig Ringer · Mar 28, 2014

It looks like something - maybe a NAT router on your end, maybe something on AWS's end - is connection tracking, and is forgetting about connections after a while.

I suggest enabling TCP keepalives. You might be able to enable them server side in the AWS RDS configuration; if not, you can request them client-side in the JDBC driver.

TCP keepalives are a lot better than a validation/test query, because they're much lower overhead, and they don't result in unnecessary log spam in the server query logs.