I have a Python REST service and I want to serve it over HTTP/2. My current server setup is nginx -> Gunicorn. In other words, nginx (listening on port 443, with port 80 redirecting to 443) runs as a reverse proxy and forwards requests to Gunicorn (port 8000, no SSL). nginx is running in HTTP/2 mode, which I can verify in Chrome by inspecting the 'protocol' column after sending a simple GET to the server. However, Gunicorn reports that the requests it receives are HTTP/1.0. Also, I couldn't find Gunicorn in this list:
https://github.com/http2/http2-spec/wiki/Implementations
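So, my questions are:
1. Is it possible to serve a Python (Flask) application with HTTP/2?
2. In my case (one reverse proxy server and one serving the actual API), which server has to support HTTP2?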
The reason I want to use HTTP/2 is that in some cases I need to perform thousands of requests at once, and I was interested to see whether the multiplexed requests feature of HTTP/2 can speed things up. With HTTP/1.0 and Python Requests as the client, each request takes ~80 ms, which is unacceptable. The other solution would be to bulk/batch my REST resources and send several of them in a single request. Yes, this idea sounds just fine, but I am really interested to see if HTTP/2 could speed things up.
Finally, I should mention that for the client side I use Python Requests with the Hyper http2 adapter.
Is it possible to serve a Python (Flask) application with HTTP/2?
Yes. Judging by the information you provide, you are already doing it just fine.
In my case (one reverse proxy server and one serving the actual API), which server has to support HTTP2?
Now I'm going to tread on thin ice and give opinions.
The way HTTP/2 has been deployed so far is by having an edge server that talks HTTP/2 (like ShimmerCat or nginx). That server terminates TLS and HTTP/2, and from there on uses HTTP/1.0, HTTP/1.1 or FastCGI to talk to the inner application.
Can an edge server, at least theoretically, talk HTTP/2 to a web application? Yes, but HTTP/2 is complex, and for inner applications it doesn't pay off very well.
That's because most web application frameworks are built for handling requests for content, and that's done well enough with HTTP/1 or FastCGI. Although there are exceptions, web applications have little use for the subtleties of HTTP/2: multiplexing, prioritization, the myriad security precautions, and so on.
The resulting separation of concerns is in my opinion a good thing.
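To see this split from the application's side, here is a minimal sketch of a WSGI app that reports the protocol of the hop that reached it. It assumes the application server fills in `SERVER_PROTOCOL` from the request line it received from the proxy; `call_app` is a hypothetical helper that invokes the app with a hand-built environ so no server is needed. As far as I know, nginx talks HTTP/1.0 to upstreams unless `proxy_http_version` is set, which would explain what Gunicorn logs.

```python
def app(environ, start_response):
    """Minimal WSGI app that echoes the protocol of the hop that reached it."""
    # SERVER_PROTOCOL describes the proxy -> application server connection,
    # not the browser -> proxy connection (which may well be HTTP/2).
    protocol = environ.get("SERVER_PROTOCOL", "unknown")
    body = f"inner hop protocol: {protocol}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(protocol):
    """Hypothetical helper: call the app directly with a fake environ."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {
        "REQUEST_METHOD": "GET",
        "PATH_INFO": "/",
        "SERVER_PROTOCOL": protocol,
    }
    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


if __name__ == "__main__":
    print(call_app("HTTP/1.0"))
```

Behind the setup in the question, the browser would still show HTTP/2 in its 'protocol' column while this app sees whatever the proxy used for the inner hop.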
Your 80 ms response time may have little to do with the HTTP protocol you are using, but if those 80 ms are mostly spent waiting for input/output, then of course running things in parallel is a good thing.
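A rough sketch of the parallelism point, independent of the HTTP version: the calls below are simulated with a sleep, so `fake_request`, `SIMULATED_LATENCY` and the worker count are all assumptions for illustration, not measurements of your service. In real code the sleep would be replaced by the actual client call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY = 0.08  # stand-in for the ~80 ms round trip


def fake_request(resource_id):
    """Hypothetical placeholder for one HTTP call; it only sleeps."""
    time.sleep(SIMULATED_LATENCY)
    return resource_id


def fetch_all(resource_ids, max_workers=20):
    """Run the calls concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_request, resource_ids))


if __name__ == "__main__":
    ids = list(range(100))
    start = time.perf_counter()
    results = fetch_all(ids)
    elapsed = time.perf_counter() - start
    print(f"{len(results)} calls in {elapsed:.2f}s")
```

If the time is mostly spent waiting, the total should drop roughly in proportion to the number of workers; if the server is CPU-bound, it will not.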
Gunicorn will use a thread or a process to handle each request (unless you have gone the extra mile to configure a greenlet-based worker such as gevent or eventlet), so consider whether letting Gunicorn handle thousands of simultaneous tasks is viable in your case.
If the content of your requests allows it, maybe you can create temporary files and serve them with an HTTP/2 edge server.
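For reference, a greenlet-based setup might look roughly like the Gunicorn config file below. The values are illustrative assumptions rather than tuned recommendations, and the `gevent` worker class needs the gevent package installed.

```python
# gunicorn.conf.py (illustrative values, not tuned recommendations)
import multiprocessing

# The plain-HTTP port that nginx forwards to, as in the question
bind = "127.0.0.1:8000"

# One worker process per core is only an assumed starting point
workers = multiprocessing.cpu_count()

# Greenlet-based worker; the default worker class is "sync"
worker_class = "gevent"

# Maximum simultaneous clients per worker (used by the async worker types)
worker_connections = 1000
```

It would be started with something like `gunicorn -c gunicorn.conf.py app:app`, where `app:app` is a placeholder for your own module and application object.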