HTML5 & Web Audio API: Streaming microphone data from browser to server. Ideal transports and data compression

IyadAssaf · Dec 23, 2013

I am looking to take the audio input from the browser and stream it to multiple listeners. The intended use is for music, so the quality must be MP3 standard or thereabouts.

I have attempted two ways, both yielding unsuccessful results:

WebRTC

  • Streaming audio directly between browsers works fine, but from what I have seen the audio quality is not customisable. (It uses the Opus audio codec, but does not seem to expose any controls.)
  • Does anyone have any insight into how to increase the audio quality in WebRTC streams?

Websockets

  • The issue is the transport from the browser to the server. The PCM audio data I am acquiring via the method below has proven too large to stream continuously to the server via websockets. The stream works perfectly on high-speed connections, but on slower wifi it is unusable.

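    // Prefixed APIs, as exposed by Chrome when this was written
    var context = new webkitAudioContext()
    navigator.webkitGetUserMedia({audio: true}, gotStream, function (err)
    {
        console.error('getUserMedia failed', err)
    })
    
    function gotStream (stream)
    {
        var source = context.createMediaStreamSource(stream)
        // 2048-frame buffer, 2 input channels, 2 output channels
        var proc = context.createScriptProcessor(2048, 2, 2)
    
        source.connect(proc)
        // Chrome only fires onaudioprocess while the node is connected to a destination
        proc.connect(context.destination)
        proc.onaudioprocess = function(event)
        {
            // Float32Array of samples for the first (left) channel only
            var audio_data = event.inputBuffer.getChannelData(0)
            console.log(audio_data)
            // send audio_data to server
        }
    }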
    

So the main question is, is there any way to compress the PCM data in order to make it easier to stream to the server? Or perhaps there is an easier way to go about this?
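For a sense of scale: assuming a 44.1 kHz sample rate (the real value is `context.sampleRate`), one channel of Float32 samples is roughly 176 KB per second. The sketch below shows one cheap reduction, converting the samples to 16-bit integers, which halves the payload. It is not compression in the MP3 or Opus sense, and `floatTo16BitPCM` is a hypothetical helper name, not part of the Web Audio API.

```javascript
// Hypothetical helper: convert Web Audio Float32 samples in [-1, 1]
// to 16-bit signed PCM, halving the number of bytes per sample.
function floatTo16BitPCM (float32Samples)
{
    var out = new Int16Array(float32Samples.length)
    for (var i = 0; i < float32Samples.length; i++)
    {
        // Clamp first, since processed audio can overshoot [-1, 1]
        var s = Math.max(-1, Math.min(1, float32Samples[i]))
        // Negative values scale towards -32768, positive values towards 32767
        out[i] = s < 0 ? s * 0x8000 : s * 0x7FFF
    }
    return out
}
```

Inside `onaudioprocess`, this could be called on `event.inputBuffer.getChannelData(0)` and the resulting `.buffer` sent as a binary websocket message.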

Answer

cwilso · Dec 23, 2013

There are lots of ways to compress PCM data, sure, but realistically, your best bet is to get WebRTC to work properly. WebRTC is designed to do this - adaptively stream media - although you don't define what you mean by "multiple" listeners (there's a huge difference between 3 listeners and 300,000 simultaneous listeners).
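One approach sometimes used to influence Opus quality in WebRTC is to edit the session description (SDP) before it is applied. The sketch below is a minimal illustration, not an official WebRTC control: `setOpusParams` is a hypothetical helper, and whether a given browser honours the values is an assumption that needs testing. The parameter names (`stereo`, `maxaveragebitrate`) are the Opus format parameters defined in RFC 7587.

```javascript
// Hypothetical helper: merge Opus format parameters into the fmtp line
// of an SDP string. Existing parameters are kept unless overridden.
function setOpusParams (sdp, params)
{
    // Find the payload type assigned to Opus, e.g. "a=rtpmap:111 opus/48000/2"
    var rtpmap = sdp.match(/a=rtpmap:(\d+) opus\/48000\/2/i)
    if (!rtpmap) { return sdp } // no Opus in this SDP, leave it untouched

    var pt = rtpmap[1]
    var fmtpRe = new RegExp('a=fmtp:' + pt + ' ([^\\r\\n]*)')
    var existing = sdp.match(fmtpRe)

    // Start from whatever parameters are already present
    var merged = {}
    if (existing)
    {
        existing[1].split(';').forEach(function (pair)
        {
            var kv = pair.split('=')
            merged[kv[0].trim()] = kv[1]
        })
    }
    Object.keys(params).forEach(function (k) { merged[k] = params[k] })

    var line = 'a=fmtp:' + pt + ' ' + Object.keys(merged).map(function (k)
    {
        return k + '=' + merged[k]
    }).join(';')

    if (existing) { return sdp.replace(fmtpRe, line) }
    // No fmtp line yet: add one directly after the rtpmap line
    return sdp.replace(rtpmap[0], rtpmap[0] + '\r\n' + line)
}
```

For music, a call such as `setOpusParams(sdp, {stereo: 1, maxaveragebitrate: 128000})` would request stereo at up to 128 kbit/s. The result would then replace the original SDP before it is passed to `setLocalDescription` or sent to the remote peer.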