I have a question about WebSphere and cannot find anything in the documentation...
What happens when the WebContainer thread pool is fully used and a new request is received? I am talking about the situation where all threads are in use and the pool has reached its maximum size, meaning that no new thread can be created to process the request.
Will the request:
- fail immediately, with a response containing some kind of error?
- be "queued" by WAS for a given period and processed when one of the threads returns to the pool, with some kind of error/timeout still possible if the wait is too long?
- be "queued" by WAS indefinitely, so that a timeout can only happen on the user side (web browser/app)?
The exact behavior is probably undocumented so that the details can be changed between releases in order to improve behavior. You can likely infer the behavior by looking at javacores or by gleaning information from various documents, for example the BoundedBuffer section of this IBM WebSphere Application Server Performance Cookbook document:
The thread pool request buffer is essentially a backlog in front of the thread pool. If the thread pool is at its maximum size and all of the threads are dispatched, then work will queue in the requestBuffer. The maximum size of the requestBuffer is equal to the thread pool maximum size; however, if the unit of work is executed on the thread pool with a blocking mode of EXPAND_WHEN_QUEUE_IS_FULL_ERROR_AT_LIMIT or EXPAND_WHEN_QUEUE_IS_FULL_WAIT_AT_LIMIT, then the maximum size is ThreadPoolMaxSize * 10. When the requestBuffer fills up, then WSVR0629I is issued (although only the first time this happens per JVM run per thread pool). When the requestBuffer is full, work will either wait or throw a ThreadPoolQueueIsFullException, depending on how the unit of work is executed.
In practice, this means that once maxThreads threads are busy executing work, up to maxThreads additional requests are queued in a bounded buffer. When that buffer is also full, the socket thread blocks until it can queue the work, so further incoming requests are held back until a thread becomes available and frees space in the bounded buffer.
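
To make the queue-then-block behavior concrete, here is a minimal sketch that models it with the standard java.util.concurrent classes. This is an analogy only, not WebSphere's actual implementation: the class name BoundedBufferSketch, the simulate method, and the waitAtLimit handler are hypothetical names invented for illustration, and the gate latch exists only to make the demo deterministic. The pool has maxThreads workers and a buffer of maxThreads slots, and a submitter that finds both full waits for a free slot instead of failing.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of "maxThreads workers + maxThreads-slot buffer".
// It only imitates the behavior described above; it is not WAS code.
final class BoundedBufferSketch {

    // Returns {completed requests, submissions that had to wait for buffer space}.
    static int[] simulate(int maxThreads, int requests) {
        AtomicInteger completed = new AtomicInteger();
        AtomicInteger blockedSubmits = new AtomicInteger();
        // Workers stay "busy" until this gate opens (demo device only).
        CountDownLatch gate = new CountDownLatch(1);

        // Called when all workers are busy and the buffer is full:
        // block the submitting thread until a slot frees up.
        RejectedExecutionHandler waitAtLimit = (task, executor) -> {
            blockedSubmits.incrementAndGet();
            gate.countDown(); // let the busy workers finish so space appears
            try {
                executor.getQueue().put(task);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException(e);
            }
        };

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(maxThreads), waitAtLimit);

        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    try {
                        gate.await(); // simulate a long-running request
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            gate.countDown(); // open the gate if the buffer never filled
            pool.shutdown();
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {completed.get(), blockedSubmits.get()};
    }

    public static void main(String[] args) {
        int[] r = simulate(2, 5);
        System.out.println("completed=" + r[0] + ", blocked submits=" + r[1]);
    }
}
```

With maxThreads set to 2, the first two requests occupy the workers, the next two sit in the buffer, and the fifth makes the submitting thread wait, which mirrors the blocked socket thread in the answer. No request is dropped in this model; whether a real client gives up first depends on timeouts outside the pool.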