What is the best technology solution (framework/approach) for putting a request queue in front of a REST service, so that I can increase the number of instances of the REST service for higher availability, with the request queue forming a service/transaction boundary for the service client?
There are a couple of issues here, depending on your goals.
First, it only promotes availability of the resources on the back end. Consider if you have 5 servers handling queue requests on the back end. If one of those servers goes down, then the queued request should fall back in to the queue, and be redelivered to one of the remaining 4 servers.
However, while those back end servers are processing, the front end servers are holding on to the actual, initiating requests. If one of those front end servers fails, then those connections are lost completely, and it will be up to the original client to resubmit the request.
The premise perhaps is that simpler front end systems are at a lower risk for failure, and that's certainly true for software related failure. But network cards, power supplies, hard drives, etc. are pretty agnostic to such false hopes of man and punish all equally. So, consider this when talking about overall availability.
As to design, the back end is a simple process waiting upon a JMS message queue, and processing each message as it comes. There are a multitude of examples of this available, and any JMS server will suit at a high level. All you need is to ensure that the message handling is transactional, so that if processing of a message fails, the message remains in the queue and can be redelivered to another message handler.
Your JMS queue's primary requirement is being clusterable. The JMS server itself is a single point of failure in the system. Lose the JMS server, and your system is pretty much dead in the water, so you'll need to be able to cluster the server and have the consumers and producers handle failover appropriately. Again, the details are JMS server specific, but most servers support it and it's pretty routine in the JMS world.
The front end is where things get a little trickier, since the front end servers are the bridge from the synchronous world of the REST request to the asynchronous world of the back end processors. A REST request typically follows an RPC pattern of consuming the request payload from the socket, holding the connection open, processing the request, and delivering the results back down the originating socket.
To manifest this hand off, you should take a look at the Asynchronous Servlet handling that Servlet 3.0 introduced, which is available in Tomcat 7, the latest Jetty (not sure what version), Glassfish 3.x, and others.
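The redelivery behaviour described above can be sketched without a JMS server. This is only an in-memory illustration under stated assumptions: `RedeliveringWorker` and `processOne` are hypothetical names, a plain `java.util.Queue` stands in for the JMS queue, and putting the message back on failure stands in for a rolled-back transacted session.

```java
import java.util.Optional;
import java.util.Queue;
import java.util.function.Function;

/** In-memory stand-in for a transacted consumer (hypothetical, not a JMS API). */
class RedeliveringWorker {
    /**
     * Takes one message and runs the handler on it.
     * On success the message is consumed and the result is returned.
     * If the handler throws, the message is put back on the queue so that
     * another handler can pick it up, and an empty Optional is returned.
     */
    public static Optional<String> processOne(Queue<String> queue,
                                              Function<String, String> handler) {
        String message = queue.poll();
        if (message == null) {
            return Optional.empty();          // nothing waiting
        }
        try {
            return Optional.of(handler.apply(message));   // "commit"
        } catch (RuntimeException e) {
            queue.add(message);                           // "rollback": redeliver later
            return Optional.empty();
        }
    }
}
```

With a real JMS server the put-back step is not written by hand; the server redelivers the message when the transaction is rolled back or the consumer dies before committing.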
In this case, when the request arrives, you convert the nominally synchronous Servlet call into an asynchronous one using HttpServletRequest.startAsync(ServletRequest request, ServletResponse response).
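This returns an AsyncContext, and once started, allows the server to free up the processing thread. You then do several things: generate a unique ID for the request, store the AsyncContext in a map keyed by that ID, and submit a message for the request to the JMS queue.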
At this point, the initial processing is done, and you simply return from doGet (or service, or whatever). Since you have not called AsyncContext.complete(), the server will not close out the connection to the client. Since you have the AsyncContext stored in the map by the ID, it's held for safe keeping for the time being.
Now, when you submitted the request to the JMS queue, it contained: the ID of the request (that you generated), any parameters for the request, and the identification of the actual server making the request. This last bit is important, as the results of the processing need to return to their origin. The origin is identified by the request ID and the server ID.
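The hand-off on the front end can be sketched as follows. This is a minimal illustration, not a servlet: `FrontEnd`, `BackendRequest`, and `submit` are hypothetical names, a `CompletableFuture` stands in for the AsyncContext returned by startAsync, and a plain `Queue` stands in for the JMS request queue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical front end hand-off; CompletableFuture stands in for the AsyncContext. */
class FrontEnd {
    /** Payload placed on the request queue (a JMS message in the real system). */
    public static final class BackendRequest {
        public final String serverId;
        public final String requestId;
        public final long startedAtMillis;
        public final Map<String, String> params;

        public BackendRequest(String serverId, String requestId,
                              long startedAtMillis, Map<String, String> params) {
            this.serverId = serverId;
            this.requestId = requestId;
            this.startedAtMillis = startedAtMillis;
            this.params = params;
        }
    }

    /** Pending requests, keyed by the generated request ID. */
    public final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();
    private final String serverId;
    private final Queue<BackendRequest> requestQueue;

    public FrontEnd(String serverId, Queue<BackendRequest> requestQueue) {
        this.serverId = serverId;
        this.requestQueue = requestQueue;
    }

    /** Registers the waiting context, enqueues the work, and returns the request ID. */
    public String submit(Map<String, String> params, CompletableFuture<String> context) {
        String requestId = UUID.randomUUID().toString();
        // Store the context before enqueueing, so a fast reply always finds it.
        pending.put(requestId, context);
        requestQueue.add(new BackendRequest(serverId, requestId,
                System.currentTimeMillis(), new HashMap<>(params)));
        return requestId;
    }
}
```

In a real servlet, `submit` would be called from doGet right after startAsync, and doGet would then simply return without completing the context.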
When your front end server started up, it also started a thread whose job it is to listen to a JMS response queue. When it sets up its JMS connection, it can set up a filter such as "Give me only messages for a ServerID of ABC123". Or, you could create a unique queue for each front end server, and the back end server uses the server ID to determine the queue to return the reply to.
When the back end processors consume the message, they take the request ID and parameters, perform the work, and then put the result on to the JMS response queue. When a processor puts the result back, it adds the originating ServerID and the original Request ID as properties of the message.
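The addressing and filtering can be illustrated in memory. The names below (`ReplyRouting`, `Reply`, `replyTo`, `receiveFor`) are hypothetical, and the sketch is single-threaded. In JMS itself, the back end would typically set the two values with Message.setStringProperty, and the front end would pass a message selector such as "ServerID = 'ABC123'" to Session.createConsumer(Destination, String); here a simple scan of the queue stands in for that selector.

```java
import java.util.Iterator;
import java.util.Optional;
import java.util.Queue;

/** Hypothetical in-memory model of replies addressed by server ID and request ID. */
class ReplyRouting {
    public static final class Reply {
        public final String serverId;
        public final String requestId;
        public final String body;

        public Reply(String serverId, String requestId, String body) {
            this.serverId = serverId;
            this.requestId = requestId;
            this.body = body;
        }
    }

    /** Back end side: stamp the result with the origin copied from the request. */
    public static Reply replyTo(String serverId, String requestId, String result) {
        return new Reply(serverId, requestId, result);
    }

    /** Front end side: remove and return the first reply addressed to this server, if any. */
    public static Optional<Reply> receiveFor(String serverId, Queue<Reply> responseQueue) {
        for (Iterator<Reply> it = responseQueue.iterator(); it.hasNext(); ) {
            Reply reply = it.next();
            if (reply.serverId.equals(serverId)) {
                it.remove();                 // consumed by this server only
                return Optional.of(reply);
            }
        }
        return Optional.empty();             // replies for other servers are left alone
    }
}
```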
So, if you got the request originally for Front End Server ABC123, the back end processor will address the results back to that server. Then, that listener thread will be notified when it gets a message. The listener thread's task is to take that message and put it on to an internal queue within the front end server.
This internal queue is backed by a thread pool whose job is to send the result payloads back to the original connection. It does this by extracting the original request ID from the message, looking up the AsyncContext from that internal map discussed earlier, and then sending results down to the HttpServletResponse associated with the AsyncContext. At the end, it calls AsyncContext.complete() (or a similar method) to tell the server that you're done and to allow it to release the connection.
For housekeeping, you should have another thread on the front end server whose job it is to detect when requests have been waiting in the map for too long. Part of the original message should have been the time the request started. This thread can wake up every second, scan the map for requests, and for any that have been there too long (say 30 seconds), it can put the request on to another internal queue, consumed by a collection of handlers designed to inform the client that the request timed out.
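The lookup-and-complete step might look like the sketch below. It is an assumption-laden illustration: `ReplyDispatcher` and `deliver` are hypothetical names, and completing a `CompletableFuture` stands in for writing to the HttpServletResponse and calling AsyncContext.complete().

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical dispatcher; CompletableFuture stands in for the AsyncContext. */
class ReplyDispatcher {
    private final Map<String, CompletableFuture<String>> pending;

    public ReplyDispatcher(Map<String, CompletableFuture<String>> pending) {
        this.pending = pending;
    }

    /**
     * Looks up the waiting context by request ID and completes it with the body.
     * Returns true if a waiting request was completed, false if no request
     * with that ID is pending.
     */
    public boolean deliver(String requestId, String body) {
        // Remove rather than get, so each request is completed at most once.
        CompletableFuture<String> context = pending.remove(requestId);
        if (context == null) {
            return false;
        }
        return context.complete(body);
    }
}
```

In the design described here, `deliver` would be run by the thread pool behind the internal queue, for example by submitting one task per reply to an ExecutorService, so a slow client only ties up one worker.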
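The scan performed by the housekeeping thread can be sketched as a single function. `TimeoutSweeper` and `sweep` are hypothetical names, and the map here holds only start times by request ID; a real front end would keep the start time alongside the AsyncContext.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical sweep over pending requests; not tied to any servlet API. */
class TimeoutSweeper {
    /**
     * Removes every request that has waited longer than maxAgeMillis and
     * returns the expired request IDs, ready to be handed to the timeout handlers.
     */
    public static List<String> sweep(ConcurrentMap<String, Long> startedAtById,
                                     long nowMillis, long maxAgeMillis) {
        List<String> expired = new ArrayList<>();
        // ConcurrentMap iterators tolerate removal during iteration.
        for (Map.Entry<String, Long> entry : startedAtById.entrySet()) {
            boolean tooOld = nowMillis - entry.getValue() > maxAgeMillis;
            // remove(key, value) only succeeds if the entry is still present,
            // so a request answered in the meantime is not reported as expired.
            if (tooOld && startedAtById.remove(entry.getKey(), entry.getValue())) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }
}
```

One way to run this every second is ScheduledExecutorService.scheduleAtFixedRate with a period of 1 and TimeUnit.SECONDS, passing System.currentTimeMillis() as `nowMillis` and 30000 as `maxAgeMillis`.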
You want these internal queues so that the main processing logic isn't stuck waiting on the client to consume the data. It could be a slow connection or something, so you don't want to block all of the other pending requests to handle them one by one.
Finally, you'll need to account for the fact that you may well get a message from the response queue for a request that no longer exists in your internal map. For one, the request may have timed out, so it should not be there any longer. For another, that front end server may have stopped and been restarted, so its internal map of pending requests will simply be empty. At this point, if you detect you have a reply for a request that no longer exists, you should simply discard it (well, log it, then discard it).
You can't reuse these requests; there's really no such thing as a load balancer going back to the client. If the client is allowing you to make callbacks via published end points, then, sure, you can just have another JMS message handler make those requests. But that's not a REST kind of thing; REST at this level of discussion is more client/server/RPC.
As to which frameworks support Asynchronous Servlets at a higher level than a raw Servlet (such as Jersey for JAX-RS or something like that), I can't say. I don't know what frameworks are supporting it at that level. Seems like this is a feature of Jersey 2.0, which is not out yet. There may well be others; you'll have to look around. Also, don't fixate on Servlet 3.0. Servlet 3.0 is simply a standardization of techniques used in individual containers for some time (Jetty notably), so you may want to look at container specific options outside of just Servlet 3.0.
But the concepts are the same. The big takeaways are the response queue listener with the filtered JMS connection, the internal request map to the AsyncContext, and the internal queues and thread pools to do the actual work within the application.