Redis IOException: "Existing connection forcibly closed by remote host" using ServiceStack C# client

Bernardo · Oct 29, 2012 · Viewed 8.6k times

We have the following setup:

Redis 2.6 on Ubuntu Linux 12.04 LTS on a Rackspace Cloud 8GB instance with the following settings:

daemonize yes
pidfile /var/run/redis_6379.pid

port 6379

timeout 300

loglevel notice
logfile /var/log/redis_6379.log

databases 16

save 900 1
save 300 10
save 60 10000

rdbcompression yes
dbfilename dump.rdb
dir /var/redis/6379

requirepass PASSWORD

maxclients 10000

maxmemory 7gb
maxmemory-policy allkeys-lru
maxmemory-samples 3

appendonly no

slowlog-log-slower-than 10000
slowlog-max-len 128

activerehashing yes

Our app servers are hosted in Rackspace Managed and connect to Redis via its public IP (to avoid having to set up Rackspace Connect, which is a royal PITA), and we provide some security by requiring a password for the Redis connection. I manually increased the Unix file descriptor limit to 10240, so the maximum of 10k connections should leave enough headroom. As you can see from the settings file above, I limit memory usage to 7GB to leave some RAM headroom as well.

We use the ServiceStack C# Redis Driver. We use the following web.config settings:

<RedisConfig suffix="">
  <Primary password="PASSWORD" host="HOST" port="6379"  maxReadPoolSize="50" maxWritePoolSize="50"/>
</RedisConfig>  

We have a PooledRedisClientManager singleton, created once per AppPool as follows:

private static PooledRedisClientManager _clientManager;
public static PooledRedisClientManager ClientManager
{
    get
    {
        if (_clientManager == null)
        {
            try
            {
                var poolConfig = new RedisClientManagerConfig
                {
                    MaxReadPoolSize = RedisConfig.Config.Primary.MaxReadPoolSize,
                    MaxWritePoolSize = RedisConfig.Config.Primary.MaxWritePoolSize,
                };

                _clientManager = new PooledRedisClientManager(new List<string>() { RedisConfig.Config.Primary.ToHost() }, null, poolConfig);
            }
            catch (Exception e)
            {
                log.Fatal("Could not spin up Redis", e);
                CacheFailed = DateTime.Now;
            }
        }
        return _clientManager;
    }
}

And we acquire a connection and do put/get operations as follows:

    using (var client = ClientManager.GetClient())
    {
        client.Set<T>(region + key, value);
    }

The code seems to mostly work. Given that we have ~20 AppPools, each with 50-100 read and 50-100 write clients, we expect 2000-4000 connections to the Redis server at most. However, we keep seeing the following exception in our error logs: usually a couple hundred of them bunched together, then nothing for an hour, and so on, ad nauseam.

System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   --- End of inner exception stack trace ---
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.IO.BufferedStream.ReadByte()
   at ServiceStack.Redis.RedisNativeClient.ReadLine() in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient_Utils.cs:line 85
   at ServiceStack.Redis.RedisNativeClient.SendExpectData(Byte[][] cmdWithBinaryArgs) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient_Utils.cs:line 355
   at ServiceStack.Redis.RedisNativeClient.GetBytes(String key) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient.cs:line 404
   at ServiceStack.Redis.RedisClient.GetValue(String key) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisClient.cs:line 185
   at ServiceStack.Redis.RedisClient.Get[T](String key) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisClient.ICacheClient.cs:line 32
   at DataPeaks.NoSQL.RedisCacheClient.Get[T](String key) in c:\dev\base\branches\currentversion\DataPeaks\DataPeaks.NoSQL\RedisCacheClient.cs:line 96

We have experimented with a Redis server timeout of 0 (i.e. no connection timeout), a timeout of 24 hours, and values in between, without luck. Googling and searching Stack Overflow have brought no real answers; everything seems to point to us doing the right thing, with the code at least.

Our feeling is that we get regular, sustained network latency issues between Rackspace Hosted and Rackspace Cloud, which cause a block of TCP connections to go stale. We could possibly solve that by implementing client-side connection timeouts, and the question would then be whether we'd need server-side timeouts as well. But that's just a feeling, and we're not 100% sure we're on the right track.

Ideas?

Edit: I occasionally see the following error as well:

ServiceStack.Redis.RedisException: Unable to Connect: sPort: 65025 ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Send(IList`1 buffers, SocketFlags socketFlags)
   at ServiceStack.Redis.RedisNativeClient.FlushSendBuffer() in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient_Utils.cs:line 273
   at ServiceStack.Redis.RedisNativeClient.SendCommand(Byte[][] cmdWithBinaryArgs) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient_Utils.cs:line 203
   --- End of inner exception stack trace ---
   at ServiceStack.Redis.RedisNativeClient.CreateConnectionError() in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient_Utils.cs:line 165
   at ServiceStack.Redis.RedisNativeClient.SendExpectData(Byte[][] cmdWithBinaryArgs) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient_Utils.cs:line 355
   at ServiceStack.Redis.RedisNativeClient.GetBytes(String key) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisNativeClient.cs:line 404
   at ServiceStack.Redis.RedisClient.GetValue(String key) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisClient.cs:line 185
   at ServiceStack.Redis.RedisClient.Get[T](String key) in C:\src\ServiceStack.Redis\src\ServiceStack.Redis\RedisClient.ICacheClient.cs:line 32
   at DataPeaks.NoSQL.RedisCacheClient.Get[T](String key) in c:\dev\base\branches\currentversion\DataPeaks\DataPeaks.NoSQL\RedisCacheClient.cs:line 96

I imagine this is a direct result of having server-side connection timeouts that aren't handled on the client. It's looking like we really need to be handling client-side connection timeouts.
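
One way to handle dropped connections on the client side is to retry an operation when it fails with a transport-level error. The following is only a minimal sketch of that idea: `TransientRetry` is a hypothetical helper, not part of ServiceStack.Redis, and the attempt count and delay are arbitrary. It treats an exception as transient if an `IOException` or `SocketException` appears anywhere in its `InnerException` chain, which matches the shape of both traces above.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Threading;

// Hypothetical helper (an assumption for illustration, not a ServiceStack.Redis API).
public static class TransientRetry
{
    // Runs 'operation', retrying up to 'maxAttempts' times in total when it
    // fails with a transport-level error. Other exceptions propagate at once.
    public static T Execute<T>(Func<T> operation, int maxAttempts, TimeSpan delay)
    {
        if (operation == null) throw new ArgumentNullException("operation");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception e)
            {
                // Rethrow non-transport errors and the final failed attempt.
                if (attempt >= maxAttempts || !IsTransportError(e)) throw;
            }

            // Brief pause before the next attempt.
            Thread.Sleep(delay);
        }
    }

    // True if an IOException or SocketException occurs anywhere in the
    // InnerException chain (e.g. RedisException ---> SocketException).
    private static bool IsTransportError(Exception e)
    {
        for (var cur = e; cur != null; cur = cur.InnerException)
        {
            if (cur is IOException || cur is SocketException) return true;
        }
        return false;
    }
}

// Possible usage with the ClientManager property shown earlier:
//
//   var value = TransientRetry.Execute(() =>
//   {
//       using (var client = ClientManager.GetClient())
//       {
//           return client.Get<string>(region + key);
//       }
//   }, 3, TimeSpan.FromMilliseconds(50));
```

Acquiring the client inside the retried delegate means each attempt asks the pool for a client again. Whether the pool then hands back a healthy connection depends on the driver version, so this would need testing against the actual setup.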

Answer

Bernardo · Nov 20, 2013

We think we found the root cause after carefully reading through the Redis documentation and finding this beauty (http://redis.io/topics/persistence):

RDB needs to fork() often in order to persist on disk using a child process.
Fork() can be time consuming if the dataset is big, and may result in Redis
to stop serving clients for some millisecond or even for one second if the
dataset is very big and the CPU performance not great. AOF also needs to fork()
but you can tune how often you want to rewrite your logs without any trade-off
on durability.

We turned RDB persistence off, and haven't seen those connection drops since.
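
For reference, turning RDB snapshotting off in the redis.conf from the question would look roughly like the fragment below. This is a sketch based on the stock redis.conf directives, not a copy of the exact change we made.

```
# Remove (or comment out) the three snapshot triggers:
# save 900 1
# save 300 10
# save 60 10000

# An empty save directive clears any configured save points,
# so Redis no longer forks for background RDB snapshots.
save ""
```

The same change can be applied to a running instance with `CONFIG SET save ""`. Note that with `appendonly no` also set, as in the question, Redis then keeps the data in memory only: fine for a pure cache, but everything is lost on restart.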