Java heartbeat design

Tata2 · Nov 23, 2015

I need to implement a heartbeat system in my Java project (3-5 clients with 1 server), but I have some questions.

1) Do I need 2 sockets per client: 1 for the heartbeat and 1 to receive the normal messages for my software?

2) I saw that in specific cases, when a client is lagging, the client doesn't receive a message. How can I avoid this?

3) If a client disconnects, how do I re-establish the connection with it without creating a new socket?

Answer

Miguel Cunha · Nov 23, 2015

So, you have a "central server" which needs to provide a heartbeat mechanism to the clients. Well, part of the solution is simple since you have only 1 server, which simplifies things a LOT: you don't need to deal with data replication, data synchronization mechanisms, server failover, and so on. You just expect that your server never fails, and if it does fail, it's a fatal error.

My suggestion is to implement a system based on notifications (polling is bad and ugly): instead of having the server poll the clients, you have the clients report their state to the server every X seconds. This reduces the overall load on your system and follows the design principle of "Tell, don't ask". It also allows you to have a different reporting interval for each individual client.

There is one more question: what data do you want to transmit? Simply whether the client is alive? Runtime data of the client, for example the % of its job done if the client is downloading a file? Environment status, such as CPU load, memory usage, network status? Define that first.

As for the Java implementation, you should run a thread on each of your clients (implementing the Runnable interface). It should look something like this (simplified for the sake of brevity):

import java.util.HashMap;
import java.util.Map;

public class HeartbeatAgent implements Runnable {

    private static final int DEFAULT_SAMPLING_PERIOD = 5; // seconds
    private static final String DEFAULT_NAME = "HeartbeatAgent";
    private final Map<Integer, Object> values; // <id, value>

    public HeartbeatAgent() {
        values = new HashMap<Integer, Object>();
    }

    private void collect() {
        // Here you should collect the data you want to send
        // and store it in the map.
    }

    public void sendData() {
        // Here you should send the data to the server. Use REST/SOAP/multicast
        // messages, whatever you want/need/are forced to.
    }

    @Override
    public void run() {
        System.out.println("Running " + DEFAULT_NAME);
        try {
            // Placeholder stop condition: loop until the thread is interrupted.
            // Replace it with whatever condition you want to stop on.
            while (!Thread.currentThread().isInterrupted()) {
                System.out.println("Thread: " + DEFAULT_NAME + ", " + "I'm alive");

                this.collect();
                this.sendData();
                // Let the thread sleep for a while.
                Thread.sleep(DEFAULT_SAMPLING_PERIOD * 1000L);
            }
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can see the interruption.
            Thread.currentThread().interrupt();
            System.out.println("Thread " + DEFAULT_NAME + " interrupted.");
        }
        System.out.println("Thread " + DEFAULT_NAME + " exiting.");
    }
}
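
To actually run an agent like this one, you start it on its own thread and interrupt that thread when the client shuts down (the sleep inside the loop reacts to the interrupt). Below is a minimal sketch of that; the AgentRunner class and its method names are hypothetical helpers, not part of any library, and it works with any Runnable.

```java
// Hypothetical helper for starting and stopping a heartbeat agent thread.
class AgentRunner {

    // Starts the given agent on a daemon thread so it does not keep the JVM alive.
    static Thread start(Runnable agent) {
        Thread thread = new Thread(agent, "heartbeat");
        thread.setDaemon(true);
        thread.start();
        return thread;
    }

    // Interrupts the agent thread and waits up to timeoutMillis for it to finish.
    // Returns true if the thread has terminated by then.
    static boolean stop(Thread thread, long timeoutMillis) {
        thread.interrupt();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            // The caller itself was interrupted while waiting; keep the flag set.
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```

On a client this would be used roughly as `Thread t = AgentRunner.start(new HeartbeatAgent());` at startup and `AgentRunner.stop(t, 2000);` at shutdown.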

You should write a server that handles the requests made and is "smart" enough to call a time-out after X seconds without "news" from client Y.
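
One way to sketch that server-side time-out logic is to remember when each client last reported and periodically ask which clients have been silent for too long. The HeartbeatMonitor class below is a hypothetical illustration (its names and the use of String client ids are my assumptions); it leaves out the networking and takes the current time as a parameter so the logic is easy to test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical server-side helper: tracks the last heartbeat of each client.
class HeartbeatMonitor {

    private final long timeoutMillis;
    // <client id, time of the last heartbeat in milliseconds>
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Call this whenever a heartbeat arrives from a client.
    void recordHeartbeat(String clientId, long nowMillis) {
        lastSeen.put(clientId, nowMillis);
    }

    // Returns the clients whose last heartbeat is older than the time-out.
    List<String> timedOutClients(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

In a real server you would pass `System.currentTimeMillis()` as the time, call `recordHeartbeat` from the code that receives client messages, and call `timedOutClients` from a periodic task.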

This is still not ideal, since it collects data and sends it with the same sampling period. Usually you want to collect data at short intervals (CPU usage every 5 seconds, for instance) but only report it every 30 seconds.
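
A rough sketch of how the two periods could be decoupled, assuming a ScheduledExecutorService with one task that buffers samples and another that drains the buffer and sends it. The BufferedHeartbeatAgent class, the free-memory metric, and the send method are all placeholders of mine, not taken from any framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical agent that collects often but reports rarely.
class BufferedHeartbeatAgent {

    private final List<Double> samples = new ArrayList<>();
    private final ScheduledExecutorService scheduler =
            Executors.newScheduledThreadPool(2);

    // Takes one sample and buffers it. Free memory is only a placeholder metric.
    synchronized void collect() {
        samples.add((double) Runtime.getRuntime().freeMemory());
    }

    // Drains the buffer and returns the batch that should go to the server.
    synchronized List<Double> report() {
        List<Double> batch = new ArrayList<>(samples);
        samples.clear();
        return batch;
    }

    // Placeholder: replace with the real transport to your server.
    void send(List<Double> batch) {
        System.out.println("Reporting " + batch.size() + " samples");
    }

    // Schedules collection and reporting with independent periods.
    void start(long collectSeconds, long reportSeconds) {
        scheduler.scheduleAtFixedRate(this::collect, 0, collectSeconds, TimeUnit.SECONDS);
        scheduler.scheduleAtFixedRate(() -> send(report()),
                reportSeconds, reportSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

With the numbers from the example above, `agent.start(5, 30)` would sample every 5 seconds and send one batch every 30 seconds.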

If you want to look at the code of a good library that does this (it's what we've been using in our project at my company), take a look at the JCatascopia framework code (just look at the Agent and Server folders, ignore the others).

There's a lot to say about this topic; these are just the basics. Feel free to ask!