We are coming across an interesting issue. Here is what our setup looks like:
Initially we were using SignalR 0.5.3 when we started observing that the Windows service's connection to the SignalR server drops out. The frequency ranges from every few minutes to every few hours. It reconnects in most cases but occasionally fails to, so the Windows service loses its connection once every couple of days. There is no set pattern to it, and it is not related to server restarts, backups, etc. We added logging to the Windows service to monitor the StateChanged event on the client connection and found that the event fires when it disconnects and reconnects, but not when it fails to reconnect.
Then we came across this thread: client constantly reconnecting
and decided to upgrade everything to SignalR 1.0.1 (we had to do it anyway at some point). The Windows service was also upgraded to .NET Framework 4.5 (from Framework 2.0) and now references the new Microsoft.AspNet.SignalR.Client.dll. This also allowed us (using a newly added connection property) to determine that the Windows service was in fact using the ServerSentEvents protocol. The same Windows service installed on a Windows Server 2012 machine uses the WebSockets protocol. This is in line with this thread: SignalR .NET Client doesn't support WebSockets on Windows 7
However, the behaviour of the service on the Windows Server 2008 R2 server did not change. It still disconnects and reconnects, and loses its connection once in a while. Due to a few limitations, we cannot use Windows Server 2012 for the Windows service and are stuck with older OSs. This isn't to say that running the Windows service over WebSockets would solve all our problems (we haven't tested that thoroughly).
The third thing we tried was to get the source code from GitHub, compile it, and upgrade the services (the SignalR server and the clients) with that build, to make sure we had the latest code with any potential bug fixes.
But it did not help. We are now at a point where we feel we have exhausted our options. Suggestions would be greatly appreciated. Thanks.
=====================================
EDIT: MORE INFORMATION:
Okay, now we have some more information. We added some code to the Windows service (SignalR client) to log in to the SignalR server every 30 minutes (to test the connection).
Here is what happens on the client side every 30 minutes:
WriteEvent(Now(), "INFO", "PING", "Performing logon procedure with SiteCode = " & msSiteCode & ".")
trans.Invoke("login", New String() {msSiteCode, "", "SERVER", "", ""})
where trans is the client-side proxy for the server-side class inheriting from Hub, and WriteEvent simply writes a trace line to a log file.
The client side also has an isLoggedIn method, as follows:
Private Sub isLoggedIn(ByVal bLoggedIn As String)
    If bLoggedIn Then
        WriteEvent(Now(), "INFO", "", "SignalR Server: Authenticated")
    Else
        WriteEvent(Now(), "ERROR", "", "SignalR Server: Authentication failed")
    End If
End Sub
On the server side we have the login method:
Public Sub login(ByVal sAccount As String, _
                 ByVal sCompanyCode As String, _
                 ByVal sClientId As String, _
                 ByVal sPassword As String, _
                 ByVal sModuleCode As String)
    Try
        'Some code omitted that validates the user and sets bValidated.
        If bValidated Then
            'Update user in cache
            ConnectionCache.Instance.UpdateCache(userId, Context.ConnectionId, UserCredential.Connection_Status.Connected)
            Clients.Caller.isLoggedIn(True)
            Dim connectionId As String = ConnectionCache.Instance.FindConnectionId(userId)
            LogEvent("Successful login for connectionid: " & connectionId & ". Context. User: " & userId, _
                     EventLogEntryType.Information)
        Else
            Clients.Caller.isLoggedIn(False, results)
        End If
    Catch ex As Exception
        LogEvent("Login: " & ex.Message, EventLogEntryType.Error)
    End Try
End Sub
If we look at the client log file, every 30 minutes we get the following log entries:
So we know that the login server-side method is being called, and the isLoggedIn client-side method is also called.
However, at some point the isLoggedIn client-side method stops being called, even though the server-side method still runs. From then on, every 30 minutes we get just one entry:
In addition, the log event:
LogEvent("Successful login for connectionid: " & connectionId & ". Context. User: " & userId, EventLogEntryType.Information)
in the server-side login method gets written to the server-side log. So Clients.Caller.isLoggedIn(True) is executed as expected, but we never see the call arrive on the client side.
So what we seem to be looking at is this: the client can always reach the server and call the server-side (login) function, but from some point on the server's calls to the client-side (isLoggedIn) function no longer arrive.
Also, this could be specific to .NET clients, as I am pretty sure we have not seen it happen with our HTML5/JavaScript clients.
In the end, we just created a simple "PINGING" function. This gets called every 15 minutes. The logic is as follows:
So while we gave up on trying to figure out the cause, we have a workaround that handles the loss of the "server-to-client" connection when it happens. Note that this is in addition to the built-in reconnection logic in SignalR.
We also keep logs, and on average this (the client not getting a PING back from the server) happens about once a day.
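For anyone wanting to build a similar check, here is a minimal sketch of the bookkeeping part only. It is not our exact code. It assumes the client records when it sends a ping and when the server's reply arrives, and treats the connection as stale once a ping has gone unanswered for longer than a timeout. PingWatchdog and its members are hypothetical names and are not part of SignalR.

```vbnet
Imports System

' Hypothetical helper (not a SignalR type): remembers when the last ping was
' sent and when the last reply came back, and reports whether the latest ping
' has gone unanswered for longer than the allowed timeout.
Public Class PingWatchdog
    Private ReadOnly _timeout As TimeSpan
    Private ReadOnly _lock As New Object()
    Private _lastPingSent As DateTime?
    Private _lastReplyReceived As DateTime?

    Public Sub New(ByVal timeout As TimeSpan)
        _timeout = timeout
    End Sub

    ' Call just before invoking the server-side ping method.
    Public Sub PingSent(ByVal sentAt As DateTime)
        SyncLock _lock
            _lastPingSent = sentAt
        End SyncLock
    End Sub

    ' Call from the client-side method that the server invokes in reply.
    Public Sub ReplyReceived(ByVal receivedAt As DateTime)
        SyncLock _lock
            _lastReplyReceived = receivedAt
        End SyncLock
    End Sub

    ' True when the latest ping has no reply and the timeout has elapsed.
    Public Function NeedsRestart(ByVal currentTime As DateTime) As Boolean
        SyncLock _lock
            ' Nothing sent yet, so there is nothing to judge.
            If Not _lastPingSent.HasValue Then Return False
            ' A reply at or after the latest ping means the link is alive.
            If _lastReplyReceived.HasValue AndAlso _lastReplyReceived.Value >= _lastPingSent.Value Then Return False
            Return (currentTime - _lastPingSent.Value) > _timeout
        End SyncLock
    End Function
End Class
```

In the service, PingSent would be called right before the trans.Invoke call to a server-side ping method (the method name is up to you), ReplyReceived from the client-side callback that the server invokes in response, and a timer would check NeedsRestart and, when it returns True, stop the connection and start it again.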