Apple has included HTTP adaptive bitrate streaming in iPhone OS 3.0; in particular, Safari handles it automatically.
I'd like to play with this in a low-cost manner, but I expect it'll require a custom HTTP server in the worst case, and interesting PHP/etc. scripting in the best case.
But first I need to know what the protocol differences are, or whether there's a standard. HTTP is reasonably simple as a protocol, but adaptive bitrate means the file sizes differ and the chunk boundaries fall in different places at different bitrates. For instance, does the client tell the server anything special about the stream as it's downloading, or is it all handled on the server side?
Eliminating buffering pauses for the end user is very attractive for both live and pre-recorded video streams, and doing both over HTTP is even better given that many networks and governments limit non-port-80 traffic.
Where is this best implemented - as part of the HTTP server itself, in a server module, in a script...?
What changes would be required on the client side if one were to implement this in an application?
It looks like Apple has submitted an IETF draft, and some people are already working on segmenters:
HTTP Live Streaming - draft-pantos-http-live-streaming-01
http://tools.ietf.org/id/draft-pantos-http-live-streaming-01.txt
iPhone HTTP Streaming with FFMpeg and an Open Source Segmenter
http://www.ioncannon.net/programming/452/iphone-http-streaming-with-ffmpeg-and-an-open-source-segmenter/
Looks like the HTTP server really is just a dumb static file server. Poking around the example website provided by Akamai gives me enough info to get started with static content streaming.
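If that's right, all the server needs to do is serve the files with the right MIME types. Here's a rough sketch of such a dumb server in Python; the application/vnd.apple.mpegurl and video/MP2T types are my reading of the draft linked above, so verify them before relying on this:

import http.server
import socketserver

class HLSRequestHandler(http.server.SimpleHTTPRequestHandler):
    # Serve playlists and transport stream segments with the expected MIME
    # types (assumed, see above); everything else falls through to the
    # default guesses.
    extensions_map = dict(http.server.SimpleHTTPRequestHandler.extensions_map)
    extensions_map.update({
        '.m3u8': 'application/vnd.apple.mpegurl',  # assumed playlist type
        '.ts': 'video/MP2T',                       # assumed segment type
    })

if __name__ == '__main__':
    # Serve the current directory (the one holding the .m3u8 and .ts files).
    # Port 8080 for testing; port 80 needs privileges but avoids the
    # filtering problem mentioned above.
    with socketserver.TCPServer(('', 8080), HLSRequestHandler) as httpd:
        httpd.serve_forever()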
The whitepaper (http://www.akamai.com/dl/akamai/iphone_wp.pdf) provides information about the transport stream encoding, so the .ts streams are straightforward.
The encoder (or a separate segmenter process) will produce H.264/AAC content in a sequence of small content segments, in MPEG-2 TS format (.ts). There is also an M3U8 index file that references the segments; in the case of live content the M3U8 is continuously updated to reflect the latest content.
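For the live case I'd guess that means the segmenter (or a small script watching its output) keeps rewriting that index over a sliding window of the newest segments, bumping the media sequence number and leaving off the end-of-list marker. A rough Python sketch under those assumptions (the window size and tag usage are my guesses from the draft):

def write_live_index(path, all_segments, segment_duration=10, window=3):
    """all_segments: ordered list of the .ts filenames produced so far."""
    first = max(0, len(all_segments) - window)  # oldest segment still listed
    lines = ['#EXTM3U',
             '#EXT-X-TARGETDURATION:%d' % segment_duration,
             '#EXT-X-MEDIA-SEQUENCE:%d' % first]
    for name in all_segments[first:]:
        lines.append('#EXTINF:%d,' % segment_duration)
        lines.append(name)
    # No #EXT-X-ENDLIST while the stream is still running, so the client
    # keeps re-fetching the index to pick up new segments.
    with open(path, 'w') as f:
        f.write('\n'.join(lines) + '\n')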
H.264 encoding should be single-pass Baseline Profile with frame re-ordering (B-frames) disabled. Key frames are suggested every 5 seconds, at an interval that ideally divides the chosen segment length evenly.
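Based on the ffmpeg-plus-segmenter article linked above, the encode step might be scripted roughly like this; the flags (Baseline Profile, B-frames off, fixed GOP) are my guesses at matching those recommendations and should be checked against your ffmpeg build, the bit rates are placeholders, and the resulting .ts still has to be run through a segmenter:

import subprocess

def encode_to_ts(src, dst, video_bitrate='800k', fps=30, keyframe_secs=5):
    """Encode src to one MPEG-2 TS file for a segmenter to split into chunks."""
    gop = fps * keyframe_secs  # key frame every 5 seconds at the given fps
    cmd = [
        'ffmpeg', '-y', '-i', src,
        '-c:v', 'libx264', '-profile:v', 'baseline',  # single-pass Baseline
        '-bf', '0',                                   # disable frame re-ordering
        '-g', str(gop), '-keyint_min', str(gop),      # fixed key frame interval
        '-b:v', video_bitrate,
        '-c:a', 'aac', '-b:a', '64k',
        '-f', 'mpegts', dst,
    ]
    subprocess.run(cmd, check=True)

# One encode per bitrate tier, e.g.:
encode_to_ts('source.mov', 'hi.ts', '800k')
encode_to_ts('source.mov', 'med.ts', '480k')
encode_to_ts('source.mov', 'lo.ts', '128k')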
The website provides an M3U8 file, which is simply an M3U playlist encoded in UTF-8.
That file then links to an M3U8 file for each bitrate. I assume they must all have cuts at the same positions (every 2 or 10 seconds, for instance) so that switching can be seamless. It appears to be completely client-driven - the client decides how to measure bandwidth and which version to fetch. Generating these playlists looks scriptable; see the sketch after the listings below.
The contents of the main file are:
#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=860000
hi/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=512000
med/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=160000
lo/prog_index.m3u8
Then each of the other files looks like this:
hi/prog_index.m3u8
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10,
fileSequence0.ts
#EXTINF:10,
fileSequence1.ts
#EXTINF:10,
fileSequence2.ts
#EXTINF:10,
fileSequence3.ts
#EXTINF:1,
fileSequence4.ts
#EXT-X-ENDLIST
med/prog_index.m3u8
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10,
fileSequence0.ts
#EXTINF:10,
fileSequence1.ts
#EXTINF:10,
fileSequence2.ts
#EXTINF:10,
fileSequence3.ts
#EXTINF:1,
fileSequence4.ts
#EXT-X-ENDLIST
lo/prog_index.m3u8
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10,
fileSequence0.ts
#EXTINF:10,
fileSequence1.ts
#EXTINF:10,
fileSequence2.ts
#EXTINF:10,
fileSequence3.ts
#EXTINF:1,
fileSequence4.ts
#EXT-X-ENDLIST
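Since these are just text files, generating them for pre-recorded content should be easy to script. Here's a minimal Python sketch that reproduces the layout above; the segment names, durations, and bandwidths are copied straight from the example:

import os

def write_media_playlist(path, segments, target_duration=10, media_sequence=0):
    """segments: list of (filename, duration_in_seconds) tuples."""
    lines = ['#EXTM3U',
             '#EXT-X-TARGETDURATION:%d' % target_duration,
             '#EXT-X-MEDIA-SEQUENCE:%d' % media_sequence]
    for name, duration in segments:
        lines.append('#EXTINF:%d,' % duration)
        lines.append(name)
    lines.append('#EXT-X-ENDLIST')  # present because this is not a live stream
    with open(path, 'w') as f:
        f.write('\n'.join(lines) + '\n')

def write_master_playlist(path, variants):
    """variants: list of (bandwidth_in_bits_per_second, playlist_uri) tuples."""
    lines = ['#EXTM3U']
    for bandwidth, uri in variants:
        lines.append('#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=%d' % bandwidth)
        lines.append(uri)
    with open(path, 'w') as f:
        f.write('\n'.join(lines) + '\n')

segments = [('fileSequence%d.ts' % i, 10) for i in range(4)] + [('fileSequence4.ts', 1)]
for name in ('hi', 'med', 'lo'):
    os.makedirs(name, exist_ok=True)
    write_media_playlist('%s/prog_index.m3u8' % name, segments)
write_master_playlist('content1.m3u8', [(860000, 'hi/prog_index.m3u8'),
                                        (512000, 'med/prog_index.m3u8'),
                                        (160000, 'lo/prog_index.m3u8')])

For a live stream you'd leave out #EXT-X-ENDLIST and keep rewriting the media playlists over a sliding window, as sketched earlier.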
This works with the HTML5 video tag:
<video width="640" height="480">
<source src="content1/content1.m3u8" />
</video>
There are still a lot of unanswered questions, but this is probably enough to get started.