Using http.client
in Python 3.3+ (or any other builtin python HTTP client library), how can I read a chunked HTTP response exactly one HTTP chunk at a time?
I'm extending an existing test fixture (written in python using http.client
) for a server which writes its response using HTTP's chunked transfer encoding. For the sake of simplicity, let's say that I'd like to be able to print a message whenever an HTTP chunk is received by the client.
My code follows a fairly standard pattern for reading a large response:
conn = http.client.HTTPConnection(...)
conn.request(...)
response = conn.getresponse()
resbody = []
while True:
chunk = response.read(1024)
if len(chunk):
resbody.append(chunk)
else:
break
conn.close();
But this reads 1024 byte chunks regardless of whether or not the server is sending 10 byte chunks or 10MiB chunks.
What I'm looking for would be something like the following:
while True:
chunk = response.readchunk()
if len(chunk):
resbody.append(chunk)
else
break
If this is not possible with http.client
, is it possible with another builtin http client library? If it's not possible with a builtin client lib, is it possible with pip
installable module?
Update:
The benefit of chunked transfer encoding is to allow the transmission of dynamically generated content. Whether a HTTP library lets you read individual chunks or not is a separate issue (see RFC 2616 - Section 3.6.1).
I can see how what you are trying to do would be useful, but the standard python http client libraries don't do what you want without some hackery (see http.client and httplib).
What you are trying to do may be fine for use in your test fixture, but in the wild there are no guarantees. It is possible for the chunking of the data read by your client to be be different from the chunking of the data sent by your server. E.g. the data could have been "re-chunked" by a proxy server before it arrived (see RFC 2616 - Section 3.2 - Framing Techniques).
The trick is to tell the response object that it isn't chunked (resp.chunked = False
) so that it returns the raw bytes. This allows you to parse the size and data of each chunk as it is returned.
import http.client
conn = http.client.HTTPConnection("localhost")
conn.request('GET', "/")
resp = conn.getresponse()
resp.chunked = False
def get_chunk_size():
size_str = resp.read(2)
while size_str[-2:] != b"\r\n":
size_str += resp.read(1)
return int(size_str[:-2], 16)
def get_chunk_data(chunk_size):
data = resp.read(chunk_size)
resp.read(2)
return data
respbody = ""
while True:
chunk_size = get_chunk_size()
if (chunk_size == 0):
break
else:
chunk_data = get_chunk_data(chunk_size)
print("Chunk Received: " + chunk_data.decode())
respbody += chunk_data.decode()
conn.close()
print(respbody)