How does HTTP Adaptive Bitrate Streaming work on the iPhone?

Adam Davis · Jul 1, 2009

Apple has included HTTP Adaptive Bitrate Streaming in iPhone OS 3.0; in particular, Safari handles it automatically.

I'd like to play with this in a low cost manner, but I expect it'll require a custom HTTP server in the worst case, and interesting PHP/etc scripting in the best case.

But first I need to know what the protocol differences are, or what the standard is. HTTP is a reasonably simple protocol, but adaptive bitrate means the file sizes differ, the chunk locations differ between bitrates, and so on. For instance, does the client tell the server anything special about the stream as it's downloading, or is it all handled on the server side?

Eliminating buffering pauses for the end user is very attractive for both live and pre-recorded video streams, and doing both over HTTP is even better given that many networks and governments are limiting non-port-80 traffic.

  • What are the technical details for HTTP adaptive bitrate streaming, especially Apple's implementation?
  • Where is this best implemented - part of the HTTP server itself, part of a mod, in a script...?
  • What changes are required for the client side, if one were to implement this in an application?

Answer

Adam Davis · Jul 1, 2009

Update

Looks like Apple made an IETF draft proposal, and some people are already working on segmenters:

HTTP Live Streaming - draft-pantos-http-live-streaming-01
http://tools.ietf.org/id/draft-pantos-http-live-streaming-01.txt

iPhone HTTP Streaming with FFMpeg and an Open Source Segmenter
http://www.ioncannon.net/programming/452/iphone-http-streaming-with-ffmpeg-and-an-open-source-segmenter/


It looks like the web server simply acts as a dumb HTTP server that hands out static files. Poking around the example website provided by Akamai gives me enough info to get started with static content streaming.

http://iphone.akamai.com/
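
To play with this cheaply, a stock static file server should be enough. Here is a minimal sketch in Python, assuming the playlists and segments sit in the current directory. The class name HLSHandler and the serve() helper are made up for illustration, and the two MIME types are the ones I believe are usually recommended for .m3u8 and .ts files, so check them against the draft before relying on them.

```python
import http.server


class HLSHandler(http.server.SimpleHTTPRequestHandler):
    """Plain static file handler that also knows the two HLS file types."""

    # Start from the stock mapping and add the playlist and segment types.
    # Assumption: these are the content types the client expects.
    extensions_map = {
        **http.server.SimpleHTTPRequestHandler.extensions_map,
        ".m3u8": "application/vnd.apple.mpegurl",
        ".ts": "video/MP2T",
    }


def serve(port=8000):
    """Serve the current directory until interrupted (hypothetical helper)."""
    with http.server.ThreadingHTTPServer(("", port), HLSHandler) as httpd:
        httpd.serve_forever()

# To try it: run serve() from the directory that holds the .m3u8 and .ts files.
```

Nothing here is adaptive; the server never learns which bitrate the client picked, it just answers GET requests.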

The whitepaper ( http://www.akamai.com/dl/akamai/iphone_wp.pdf ) provides information about the transport stream encoding, so the .ts streams are straightforward.

The encoder (or a separate segmenter process) will produce H.264/AAC content in a sequence of small content segments, in MPEG-2 TS format (.ts). There is also an M3U8 index file that references the segments; in the case of live content the M3U8 is continuously updated to reflect the latest content.
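
For the live case, the "continuously updated" index could be produced by something like the sketch below. It is only my reading of the draft: the function names and the window size of 3 are arbitrary choices for illustration, and the tag names are the ones that show up in the Akamai index files further down. The idea is that the index lists only the newest few segments, the media sequence number says which segment the list starts at, and the end marker is left out while the stream is still running.

```python
def sliding_window(all_segments, window=3):
    """Return the newest `window` segments and the sequence number of the first."""
    first = max(0, len(all_segments) - window)
    return all_segments[first:], first


def render_live_playlist(segments, first_sequence, target_duration, finished=False):
    """Build the index text for a list of (duration_seconds, file_name) pairs."""
    lines = [
        "#EXTM3U",
        f"#EXT-X-TARGETDURATION:{target_duration}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    for duration, name in segments:
        lines.append(f"#EXTINF:{duration},")
        lines.append(name)
    if finished:
        # Only written once no more segments will be added.
        lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


# Example: five 10-second segments exist so far, the index exposes the last three.
produced = [(10, f"fileSequence{i}.ts") for i in range(5)]
window, first = sliding_window(produced, window=3)
index_text = render_live_playlist(window, first, target_duration=10)
```

A segmenter would rewrite the index file with this text each time a new segment is finished.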

H.264 Encoding should be single-pass Baseline Profile, frame re-ordering disabled. Key frames are suggested every 5 seconds, ideally an even divisor of the chosen segment length.

The website provides an M3U8 file, which is simply an M3U playlist, but in the UTF-8 character encoding format.

That file then links to an M3U8 file for each bitrate. I assume they must all have cuts at the same positions (every 2 or 10 seconds, for instance) so that switching can be seamless. It appears to be completely client driven - the client decides how to measure bandwidth and which version it's going to get.

The contents of the main file are:

#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=860000
hi/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=512000
med/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=160000
lo/prog_index.m3u8
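
Since the choice appears to be client driven, a client would have to read this file and pick an entry itself. A rough sketch of that step follows. The selection rule (highest bandwidth that fits under the measured throughput, otherwise the lowest) is my own guess, not how the iPhone is known to do it, and the function names are hypothetical. The attribute parsing is naive and would break on quoted values that contain commas.

```python
MASTER = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=860000
hi/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=512000
med/prog_index.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1, BANDWIDTH=160000
lo/prog_index.m3u8
"""


def parse_master_playlist(text):
    """Return a list of (bandwidth_bits_per_second, playlist_uri) tuples."""
    variants = []
    pending_bandwidth = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            attrs = {}
            # Naive split: fine for the sample above, not for quoted values.
            for pair in line.split(":", 1)[1].split(","):
                key, _, value = pair.partition("=")
                attrs[key.strip()] = value.strip()
            pending_bandwidth = int(attrs["BANDWIDTH"])
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            # The URI line right after the tag belongs to that tag.
            variants.append((pending_bandwidth, line))
            pending_bandwidth = None
    return variants


def choose_variant(variants, measured_bps):
    """Pick the best variant that fits, falling back to the smallest one."""
    affordable = [v for v in variants if v[0] <= measured_bps]
    if affordable:
        return max(affordable)[1]
    return min(variants)[1]


variants = parse_master_playlist(MASTER)
choice = choose_variant(variants, measured_bps=600000)  # med/prog_index.m3u8
```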

Each of the other files then lists its segments:

hi/prog_index.m3u8

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10, 
fileSequence0.ts
#EXTINF:10, 
fileSequence1.ts
#EXTINF:10, 
fileSequence2.ts
#EXTINF:10, 
fileSequence3.ts
#EXTINF:1,  
fileSequence4.ts
#EXT-X-ENDLIST

med/prog_index.m3u8

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10, 
fileSequence0.ts
#EXTINF:10, 
fileSequence1.ts
#EXTINF:10, 
fileSequence2.ts
#EXTINF:10, 
fileSequence3.ts
#EXTINF:1,  
fileSequence4.ts
#EXT-X-ENDLIST

lo/prog_index.m3u8

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10, 
fileSequence0.ts
#EXTINF:10, 
fileSequence1.ts
#EXTINF:10, 
fileSequence2.ts
#EXTINF:10, 
fileSequence3.ts
#EXTINF:1,  
fileSequence4.ts
#EXT-X-ENDLIST
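
Reading one of these per-bitrate files on the client side could look roughly like this. It is a sketch that only handles the tags visible above; the function name and the example.com base URL are placeholders. Segment names are resolved relative to the URL of the index file, which is how the relative paths in the samples seem to be meant.

```python
from urllib.parse import urljoin

MEDIA = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10, 
fileSequence0.ts
#EXTINF:10, 
fileSequence1.ts
#EXTINF:10, 
fileSequence2.ts
#EXTINF:10, 
fileSequence3.ts
#EXTINF:1,  
fileSequence4.ts
#EXT-X-ENDLIST
"""


def parse_media_playlist(text, playlist_url):
    """Return ([(duration_seconds, absolute_segment_url), ...], ended_flag)."""
    segments = []
    duration = None
    ended = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            # The duration is everything between the colon and the first comma.
            duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line == "#EXT-X-ENDLIST":
            ended = True
        elif line and not line.startswith("#") and duration is not None:
            segments.append((duration, urljoin(playlist_url, line)))
            duration = None
    return segments, ended


# Placeholder URL; only its directory part matters for resolving segment names.
segments, ended = parse_media_playlist(MEDIA, "http://example.com/hi/prog_index.m3u8")
total_seconds = sum(d for d, _ in segments)
```

If the end marker is missing, a client would presumably have to re-fetch the index periodically to discover new segments.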

This works with the HTML 5 video tag:

<video width="640" height="480">
   <source src="content1/content1.m3u8" />
</video>

There are still a lot of unanswered questions, but this is probably enough to get started.