How can the timeout be increased so that the request does not time out before the response is processed?
Tomcat settings in Spring Boot:
server.tomcat.max-connections=2000
server.tomcat.max-threads=200
server.connection-timeout=1200000
Requests were injected with constantUsersPerSec(20) during (15), i.e. 300 users over the course of 15 seconds, and all requests were served, as can be seen in the Gatling plot below (blue).
scn.inject(
  constantUsersPerSec(20) during (15)
)
This is due to max-connections = 2000, which served the 300 requests using 200 worker threads.
The controller is written in Spring MVC and returns a DeferredResult, which enables asynchronous request processing: the response is resumed once the result has been produced.
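For reference, a minimal sketch of the kind of controller in question; the endpoint path, the process() method and the payload handling are placeholders of mine, not the actual code:

import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.async.DeferredResult;
import org.springframework.web.multipart.MultipartFile;

@RestController
public class UploadController {

    // Assumed business-logic pool; the question mentions forkjoin.commonPool.
    private final ExecutorService businessPool = ForkJoinPool.commonPool();

    @PostMapping("/upload")
    public DeferredResult<ResponseEntity<String>> upload(@RequestParam("file") MultipartFile file) {
        DeferredResult<ResponseEntity<String>> result = new DeferredResult<>();

        // By the time this method runs, the multipart body has already been read
        // on a Tomcat worker thread.
        byte[] content;
        try {
            content = file.getBytes();
        } catch (IOException e) {
            result.setErrorResult(e);
            return result;
        }

        // Business logic runs off the servlet thread; the response is resumed
        // via setResult once it completes.
        CompletableFuture
            .supplyAsync(() -> process(content), businessPool)
            .whenComplete((body, ex) -> {
                if (ex != null) {
                    result.setErrorResult(ex);
                } else {
                    result.setResult(ResponseEntity.ok(body));
                }
            });
        return result;
    }

    // Placeholder for the (fast) business logic described in the question.
    private String process(byte[] content) {
        return "received " + content.length + " bytes";
    }
}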
But even though server.connection-timeout is set to the high value of 1200000, there are a lot of 503s towards the end (red):
> status.find.in(200,304,201,202,203,204,205,206,207,208,209), but actually found 503    78 (100.0%)
gatling.conf is also set for increased timeouts:
timeOut {
  simulation = 8640000 # Absolute timeout, in seconds, of a simulation
}
ahc {
  #keepAlive = true # Allow pooling HTTP connections (keep-alive header automatically added)
  connectTimeout = 600000 # Timeout when establishing a connection
  handshakeTimeout = 600000 # Timeout when performing TLS handshake
  pooledConnectionIdleTimeout = 600000 # Timeout when a connection stays unused in the pool
  readTimeout = 600000 # Timeout when a used connection stays idle
  #maxRetry = 2 # Number of times that a request should be tried again
  requestTimeout = 600000
}
As per the comment from Rcordoval:
Check this property: spring.mvc.async.request-timeout= # Amount of time before asynchronous request handling times out
This setting, together with the Gatling configuration above, resolves the timeouts:
spring.mvc.async.request-timeout=1200000
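The same timeout can also be set programmatically through Spring MVC's async support; a minimal sketch (the class name is mine, the value mirrors the property above):

import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.AsyncSupportConfigurer;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class AsyncTimeoutConfig implements WebMvcConfigurer {

    @Override
    public void configureAsyncSupport(AsyncSupportConfigurer configurer) {
        // Equivalent of spring.mvc.async.request-timeout=1200000 (milliseconds):
        // async requests (e.g. DeferredResult) not completed within this window time out.
        configurer.setDefaultTimeout(1_200_000);
    }
}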
The root cause, however, is that when requests arrive in large numbers, all worker threads (200) get occupied reading the uploads from the open connections (2000); the controller takes a MultipartFile as argument and returns a DeferredResult.
I think DeferredResult shines when the request-serving logic is fast and the business logic is slow (running on ForkJoinPool.commonPool()). It does not quite fit MultipartFile uploads (blocking and slow), and even less so when file sizes are big, since responses are then not resumed quickly (as can be seen in the responses-per-second chart above, responses only start resuming after a few seconds, because there are 2000 open connections but only 200 workers). Increasing the number of workers would undercut the advantages of async processing anyway.
In this case, request processing (the blocking upload) was slow and the business logic was fast, so responses were getting ready, but all the worker threads (200) were so busy serving more and more incoming requests that responses were not getting resumed and timed out as a result.
This probably makes a case for having separate pools for serving requests and for resuming responses in async processing with DeferredResult?
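A hypothetical sketch of that idea: a dedicated executor for Spring MVC async processing plus a separate pool for completing DeferredResults (all names and pool sizes are assumptions of mine). Note that the multipart body is still read on a Tomcat worker thread before the controller runs, so this only separates the business-logic/resume side, not the blocking upload itself:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.web.servlet.config.annotation.AsyncSupportConfigurer;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;

@Configuration
public class SeparatePoolsConfig implements WebMvcConfigurer {

    // Pool used by Spring MVC for async processing (e.g. Callable return values).
    @Bean
    public ThreadPoolTaskExecutor mvcAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setThreadNamePrefix("mvc-async-");
        executor.setCorePoolSize(50);
        executor.setMaxPoolSize(50);
        executor.initialize();
        return executor;
    }

    // Separate pool the controller can use to run business logic and call
    // DeferredResult.setResult, so completion does not compete with Tomcat
    // workers that are busy reading uploads.
    @Bean
    public ExecutorService resultCompletionPool() {
        return Executors.newFixedThreadPool(50);
    }

    @Override
    public void configureAsyncSupport(AsyncSupportConfigurer configurer) {
        configurer.setTaskExecutor(mvcAsyncExecutor());
    }
}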