I'm streaming a large document through a Spring MVC controller running on Apache Tomcat/6.0.18.
Because it's large and will (eventually) be dynamically generated, I decided to use chunked Transfer-Encoding.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.inject.Inject;
import javax.servlet.http.HttpServletResponse;
import org.apache.commons.httpclient.ChunkedOutputStream;
import org.apache.commons.net.io.CopyStreamException;
import org.apache.commons.net.io.Util;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
@Controller
public class QueryController {

    @Inject
    QueryService queryService;

    @RequestMapping(value = "/stream")
    public void hellostreamer(HttpServletResponse response) throws CopyStreamException, IOException {
        response.setHeader("Transfer-Encoding", "chunked");
        response.setHeader("Content-type", "text/xml");
        InputStream filestream = new FileInputStream("/lotsrecs.xml");
        ChunkedOutputStream chunkStream = new ChunkedOutputStream(response.getOutputStream());
        Util.copyStream(filestream,chunkStream);
        chunkStream.close();
        chunkStream.finish();
    }
}
However, when I open this in Firefox I get this:
XML Parsing Error: syntax error
Location: http://localhost:8082/streaming-mockup-1.0-SNAPSHOT/stream
Line Number 1, Column 1:
800
^
Rather than reading the chunk sizes as metadata about the stream, it's reading them as part of the stream!
Using Live HTTP headers, I can see that the Transfer-Encoding header is being received:
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
Content-Type: text/xml
Date: Thu, 11 Aug 2011 18:08:07 GMT
So I'm at a loss for why the chunk sizes are not being interpreted correctly. If I make the request using wget, I also see the chunk size characters inside the returned document, so somehow they're not being encoded correctly. Anyone have an idea why?
Looking at the transmission with Wireshark, the "800" recurs throughout the stream. Note that 0x800 = 2048, which is the default chunk size used by the class ChunkedOutputStream:
GET /streaming-mockup-1.0-SNAPSHOT/stream HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: localhost:8082
Connection: Keep-Alive
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
Content-Type: text/xml
Date: Thu, 11 Aug 2011 18:47:24 GMT
Connection: close
800
<records>
<REC>
<FUID>412286284WOS1</FUID>
<UID>WOS:000292284100013</UID>
<static_data>
<summary>
<EWUID uid="WOS:000292284100013" year="2011">
If I just copy to the output stream directly, without creating a ChunkedOutputStream, I don't see the chunk sizes at all:
GET /streaming-mockup-1.0-SNAPSHOT/stream HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Host: localhost:8082
Connection: Keep-Alive
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
Content-Type: text/xml
Date: Thu, 11 Aug 2011 18:51:05 GMT
Connection: close
<records>
<REC>
<FUID>412286284WOS1</FUID>
<UID>WOS:000292284100013</UID>
<static_data>
<summary>
So how do I know if this is chunked? If it were, wouldn't I see the chunk sizes?
Are you sure you need to construct a ChunkedOutputStream yourself?
My understanding (untainted by practice) is that ServletResponse.getOutputStream() should handle the chunking for you if appropriate (say, if the client is not HTTP 1.0, and so forth). If that is true, the reply that actually gets sent will be chunked encoding inside chunked encoding, and the browser of course knows about only one of these layers.
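To make the "chunked inside chunked" idea concrete, here is a minimal, self-contained sketch in plain Java. It does not use Tomcat or commons-httpclient; the class ChunkedFramingDemo and its encode/decode helpers are hypothetical names invented for this illustration, and the decoder ignores trailers and chunk extensions. It frames a body twice with 2048-byte chunks and then strips only one layer, the way a client would:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedFramingDemo {

    /** Wraps body in chunked framing: hex size, CRLF, data, CRLF, then a final zero-size chunk. */
    public static byte[] encode(byte[] body, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < body.length; off += chunkSize) {
            int len = Math.min(chunkSize, body.length - off);
            writeAscii(out, Integer.toHexString(len) + "\r\n");
            out.write(body, off, len);
            writeAscii(out, "\r\n");
        }
        writeAscii(out, "0\r\n\r\n");
        return out.toByteArray();
    }

    /** Removes exactly one layer of chunked framing (no trailers, no chunk extensions). */
    public static byte[] decode(byte[] framed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int pos = 0;
        while (true) {
            int eol = pos;
            while (framed[eol] != '\r') {
                eol++;
            }
            String hex = new String(framed, pos, eol - pos, StandardCharsets.US_ASCII);
            int len = Integer.parseInt(hex, 16);
            if (len == 0) {
                break; // last chunk
            }
            pos = eol + 2;          // skip the CRLF after the size line
            out.write(framed, pos, len);
            pos += len + 2;         // skip the data and its trailing CRLF
        }
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        // Build a document larger than one 2048-byte chunk.
        StringBuilder xml = new StringBuilder("<records>");
        for (int i = 0; i < 500; i++) {
            xml.append("<REC/>");
        }
        xml.append("</records>");
        byte[] body = xml.toString().getBytes(StandardCharsets.US_ASCII);

        byte[] applicationLayer = encode(body, 2048);           // what a hand-made chunked stream writes
        byte[] onTheWire = encode(applicationLayer, 2048);      // what a container would add on top
        byte[] whatClientSees = decode(onTheWire);              // the client strips only one layer

        // The first bytes of the "document" are now a chunk size, not XML.
        System.out.println(new String(whatClientSees, 0, 3, StandardCharsets.US_ASCII)); // prints "800"
    }
}
```

If this is what is happening, the fix is simply to write to response.getOutputStream() directly and leave the framing (and the Transfer-Encoding header) to the container.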
Have you tried to run the server somewhere across the network and inspect the transaction with Wireshark?
Update: note the request line in your wget capture:
GET /streaming-mockup-1.0-SNAPSHOT/stream HTTP/1.0
HTTP/1.0 clients are not required to understand chunked encoding at all (naturally enough, as that encoding was only invented for 1.1).
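One way to see the effect of the request version is to talk to a server over a raw socket, so that nothing decodes the framing for you. The sketch below uses the JDK's built-in com.sun.net.httpserver.HttpServer as a stand-in for Tomcat (an assumption: Tomcat's behaviour is not tested here), and the class ChunkingByVersionDemo and its helper methods are hypothetical names made up for this example. Passing 0 as the response length tells that server the length is unknown, so it has to choose the framing itself:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class ChunkingByVersionDemo {

    /** Starts a throwaway server on a free loopback port that streams a body of unknown length. */
    public static HttpServer startServer() throws IOException {
        HttpServer server = HttpServer.create(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), 0);
        server.createContext("/stream", exchange -> {
            byte[] body = "<records/>".getBytes(StandardCharsets.US_ASCII);
            exchange.getResponseHeaders().set("Content-Type", "text/xml");
            exchange.sendResponseHeaders(200, 0); // 0 = length unknown, server picks the framing
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    /** Sends a bare GET with the given HTTP version and returns the raw, undecoded reply. */
    public static String rawGet(int port, String version) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            socket.setSoTimeout(2000);
            String request = "GET /stream HTTP/" + version + "\r\n"
                    + "Host: localhost\r\n"
                    + "Connection: close\r\n\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[4096];
            try {
                for (int n; (n = in.read(buf)) != -1; ) {
                    reply.write(buf, 0, n);
                }
            } catch (SocketTimeoutException e) {
                // The server kept the connection open; use whatever arrived so far.
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        }
    }

    /** True if the header section of a raw reply announces chunked transfer encoding. */
    public static boolean isChunked(String rawReply) {
        String headers = rawReply.split("\r\n\r\n", 2)[0].toLowerCase();
        return headers.contains("transfer-encoding: chunked");
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startServer();
        try {
            int port = server.getAddress().getPort();
            System.out.println("HTTP/1.1 reply chunked? " + isChunked(rawGet(port, "1.1")));
            System.out.println("HTTP/1.0 reply chunked? " + isChunked(rawGet(port, "1.0")));
            System.out.println(rawGet(port, "1.1")); // chunk sizes are visible in the raw bytes
        } finally {
            server.stop(0);
        }
    }
}
```

As far as I know, the JDK server chunks the reply for the 1.1 request and falls back to "send until the connection closes" for the 1.0 request. That would also explain a capture in which no chunk sizes are visible when the request came from an HTTP/1.0 client such as wget: there may simply be no chunking on the wire, whatever header was set by hand.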