converting .wav file to .ogg in javascript

himura · Jul 8, 2013 · Viewed 10.6k times

I'm trying to capture the user's audio input in the browser. I have it working with WAV, but the files are really big. A friend of mine told me that OGG files are much smaller. Does anyone know how to convert WAV to OGG? I also have the raw data buffer, so I don't strictly need to convert an existing file; I just need an OGG encoder.
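For a sense of scale, the size of an uncompressed 16-bit PCM WAV follows directly from its layout: a 44-byte header plus two bytes per sample per channel. The function name and parameters below are only illustrative, not part of RecorderJS.

```javascript
// Rough size of a 16-bit PCM WAV file, assuming a plain 44-byte header.
// wavByteLength is a hypothetical helper written for this illustration.
function wavByteLength(seconds, sampleRate, channels) {
  var headerBytes = 44;
  var bytesPerSample = 2; // 16-bit samples
  var frames = Math.round(seconds * sampleRate);
  return headerBytes + frames * channels * bytesPerSample;
}

// One minute of stereo audio at 44.1 kHz:
console.log(wavByteLength(60, 44100, 2)); // 10584044 bytes, roughly 10 MB
```

So every minute of stereo recording at that rate costs about 10 MB, which is why a compressed format is attractive here.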

Here's the WAV encoder from Matt Diamond's RecorderJS:

function encodeWAV(samples){
  var buffer = new ArrayBuffer(44 + samples.length * 2);
  var view = new DataView(buffer);

  /* RIFF identifier */
  writeString(view, 0, 'RIFF');
  /* RIFF chunk size: total file length minus the 8 bytes already written */
  view.setUint32(4, 36 + samples.length * 2, true);
  /* RIFF type */
  writeString(view, 8, 'WAVE');
  /* format chunk identifier */
  writeString(view, 12, 'fmt ');
  /* format chunk length */
  view.setUint32(16, 16, true);
  /* sample format (raw) */
  view.setUint16(20, 1, true);
  /* channel count */
  view.setUint16(22, 2, true);
  /* sample rate */
  view.setUint32(24, sampleRate, true);
  /* byte rate (sample rate * block align) */
  view.setUint32(28, sampleRate * 4, true);
  /* block align (channel count * bytes per sample) */
  view.setUint16(32, 4, true);
  /* bits per sample */
  view.setUint16(34, 16, true);
  /* data chunk identifier */
  writeString(view, 36, 'data');
  /* data chunk length */
  view.setUint32(40, samples.length * 2, true);

  floatTo16BitPCM(view, 44, samples);

  return view;
}

Is there one for OGG?
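The encodeWAV excerpt relies on two helpers, writeString and floatTo16BitPCM, plus a sampleRate variable, none of which are shown in the post. A minimal sketch of what such helpers could look like follows; it is written for illustration and assumes the samples are floats in the range -1 to 1, so it should not be read as the exact RecorderJS source.

```javascript
// Sketch of the helpers encodeWAV expects (assumed, not quoted from RecorderJS).

// Write an ASCII string into the DataView one byte at a time.
function writeString(view, offset, string) {
  for (var i = 0; i < string.length; i++) {
    view.setUint8(offset + i, string.charCodeAt(i));
  }
}

// Convert float samples (assumed range -1..1) to little-endian 16-bit PCM.
function floatTo16BitPCM(output, offset, input) {
  for (var i = 0; i < input.length; i++, offset += 2) {
    // Clamp first so out-of-range values do not wrap around.
    var s = Math.max(-1, Math.min(1, input[i]));
    output.setInt16(offset, s < 0 ? s * 0x8000 : s * 0x7FFF, true);
  }
}
```

With these in scope, and sampleRate set to the rate the audio was captured at, encodeWAV returns a DataView holding the complete file.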

Answer

jtrick · Jul 16, 2013

To those who down-voted this post: It's really not productive to down-vote questions without taking the time to offer some kind of insight into why the question is somehow 'bad'. I think this question has merit, and the poster clearly has spent some time trying to solve the issue on their own. The Web Audio spec is actually intended to allow exactly this kind of functionality, but is just not close to fulfilling that purpose yet:

This specification describes a high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is of an audio routing graph, where a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized Assembly / C / C++ code), but direct JavaScript processing and synthesis is also supported.

Here's a statement from the current W3C audio spec draft, which makes the following points:

  • While processing audio in JavaScript, it is extremely challenging to get reliable, glitch-free audio while achieving a reasonably low-latency, especially under heavy processor load.
  • JavaScript is very much slower than heavily optimized C++ code and is not able to take advantage of SSE optimizations and multi-threading which is critical for getting good performance on today's processors. Optimized native code can be on the order of twenty times faster for processing FFTs as compared with JavaScript. It is not efficient enough for heavy-duty processing of audio such as convolution and 3D spatialization of large numbers of audio sources.
  • setInterval() and XHR handling will steal time from the audio processing. In a reasonably complex game, some JavaScript resources will be needed for game physics and graphics. This creates challenges because audio rendering is deadline driven (to avoid glitches and get low enough latency). JavaScript does not run in a real-time processing thread and thus can be pre-empted by many other threads running on the system.
  • Garbage Collection (and autorelease pools on Mac OS X) can cause unpredictable delay on a JavaScript thread.
  • Multiple JavaScript contexts can be running on the main thread, stealing time from the context doing the processing.
  • Other code (other than JavaScript) such as page rendering runs on the main thread.
  • Locks can be taken and memory is allocated on the JavaScript thread. This can cause additional thread preemption.
  • The problems are even more difficult with today's generation of mobile devices which have processors with relatively poor performance and power consumption / battery-life issues.

ECMAScript (JS) is really fast for a lot of things, and is getting faster all the time, depending on which engine is running the code. For something as intensive as audio processing, however, you would be much better off using a low-level tool that's compiled and optimized for the task. I'm currently using ffmpeg on the server side to accomplish something similar.
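As a rough illustration of the server-side approach, here is a sketch that shells out to ffmpeg from Node.js. It assumes Node.js on the server and an ffmpeg build with libvorbis available on the PATH; the function names, the file paths and the quality value are placeholders, and this is not necessarily the setup described above.

```javascript
// Sketch only: assumes Node.js and an ffmpeg binary with libvorbis on PATH.
var execFile = require('child_process').execFile;

// Build the ffmpeg arguments for a WAV -> Ogg Vorbis conversion.
// quality is libvorbis's variable-bitrate scale (higher means larger files).
function buildOggArgs(inputPath, outputPath, quality) {
  return ['-y', '-i', inputPath, '-c:a', 'libvorbis', '-q:a', String(quality), outputPath];
}

// Run the conversion; callback receives an Error, or null on success.
function convertWavToOgg(inputPath, outputPath, callback) {
  execFile('ffmpeg', buildOggArgs(inputPath, outputPath, 4), function (err) {
    callback(err || null);
  });
}
```

Passing the arguments as an array to execFile, rather than building a shell string, avoids quoting problems when file names come from user uploads.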

I know that it is really inefficient to have to send a WAV file across an internet connection just to obtain a more compact .ogg file, but that's the current state of things with the Web Audio API. To do any client-side processing with a native tool, the user would have to explicitly grant access to the local file system and execution privileges for the program that makes the conversion. Hopefully someone will address this glaring problem soon. Good luck.
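For the upload step, a minimal sketch might look like the following. It assumes the DataView comes from an encoder like encodeWAV in the question, and that the server exposes some endpoint that accepts a raw audio/wav body; uploadUrl and both function names are hypothetical.

```javascript
// Wrap the encoder's DataView in a Blob so it can be sent as a request body.
function wavViewToBlob(view) {
  return new Blob([view], { type: 'audio/wav' });
}

// POST the WAV to the server. uploadUrl is a placeholder for whatever
// endpoint performs the conversion; onDone receives an Error or null.
function uploadWav(blob, uploadUrl, onDone) {
  var xhr = new XMLHttpRequest();
  xhr.open('POST', uploadUrl, true);
  xhr.setRequestHeader('Content-Type', blob.type);
  xhr.onload = function () {
    var ok = xhr.status >= 200 && xhr.status < 300;
    onDone(ok ? null : new Error('Upload failed with status ' + xhr.status));
  };
  xhr.onerror = function () {
    onDone(new Error('Network error during upload'));
  };
  xhr.send(blob);
}
```

In the browser this could be called as uploadWav(wavViewToBlob(view), uploadUrl, callback) once recording stops.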

Edit: You could also use Google's Native Client if you don't mind limiting your users to Chrome. It seems like very promising technology that loads in a sandbox and achieves speeds nearly as good as natively executed code. I'm assuming that there will be similar implementations in other browsers at some point.