General network error after a night of inactivity

Vilx- picture Vilx- · Jul 9, 2010 · Viewed 14.2k times · Source

For some time now our flagship application has been having mysterious errors. The error message is the generic

[DBNETLIB][ConnectionWrite (send()).]General network error. Check your network documentation.

This is reliably reproduced by leaving the app open for the night and resuming work in the morning. Since it's a backend server app this is a normal scenario.

The funny thing is - we've migrated from SQL Server 7 to 2000 to 2008 and the issue is present on all of them. But what seems to matter is the OS on which we run the app. On WinXP it works fine, on Vista/7 it fails. So the problem is at the client end.

The results of Google on the error message cover a very wide spectrum of different causes (since this is a very generic error) and none of the scenarios found there are similar to ours.

So perhaps someone around here will know what the problem is in our case?

Answer

Chris Bednarski picture Chris Bednarski · Jul 9, 2010

You should be able to reproduce this error condition on demand by:
1. Opening a database connection (in your client application)
2. Unplugging the network cable
3. Plugging network cable back in (wait until the network connection is restored)
4. Using the previously opened connection to query the database

As far as I can tell from experience, client side ADO code is not able to consistently determine if an underlying network connection is actually valid or not. Checking if the database connection is open (in the client code) returns true. However, performing any operations on that connection results in a General network error.

The connection pool appears to be able to determine when a connection goes 'bad' so it never returns a bad connection to the application. It simply opens a new connection instead.

So, if a database connection is kept alive for a long time (used or unused) by the application, the underlying TCP/IP connectivity can get broken.

The bottom line is that database connections should be closed and returned back to the connection pool when not in use.

Edit

Also, depending on the number of clients connecting to the db, not using the connection pool can cause another issue. You may hit the maximum number of sockets open on the server side. This is from memory. Once a connection is closed on the client side, the connection on the server goes into a TIME_WAIT state. By default, the server socket takes about 4 minutes to close, so it is not available to other clients during that time. The bottom line is that there is a limited number of available sockets on the server. Keeping too many connections open can create a problem.

One project I worked on easily hit this socket limit with around 120 users. A new 'feature' was added that absolutely hammered the server, and after a few hours of using the app, things would suddenly slow to a crawl for everyone. SQL server was not closing enough sockets in time for new connection requests. Although there are 65K sockets altogether, only the first 5000 are made available to the ADO (this is a default registry setting thing, so can be changed).

The number of sockets in TIME_WAIT state would slowly build up until the OS would not allocate any more. So clients had to wait until server side sockets closed and a new connection could then be created.