Python: PortAudio + Opus encoding/decoding

Nolhian · Jul 18, 2013

I'm capturing audio from my mic with PyAudio and trying to encode/decode it with the Opus codec. I'm using the bindings to libopus made by SvartalF ( https://github.com/svartalf/python-opus ).

Here is my code:

import pyaudio
from opus import encoder, decoder

def streaming(p):
    chunk = 960
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 48000
    streamin = p.open(format = FORMAT,
            channels = CHANNELS, 
            rate = RATE, 
            input = True,
            input_device_index = 7,
            frames_per_buffer = chunk)
    streamout = p.open(format = FORMAT,
            channels = CHANNELS, 
            rate = 48000, 
            output = True,
            output_device_index = p.get_default_input_device_info()["index"],
            frames_per_buffer = chunk)
    enc = encoder.Encoder(RATE,CHANNELS,'voip')
    dec = decoder.Decoder(RATE,CHANNELS)
    data = []
    for i in xrange(100):
        data.append(streamin.read(chunk*2))
    streamout.write(''.join(data))
    encdata = []
    for x in data:
        encdata.append(enc.encode(x,chunk))
    print "DATA LENGTH :", len(''.join(data))
    print "ENCDATA LENGTH :", len(''.join(encdata))
    decdata = ''
    for x in encdata:
        decdata += dec.decode(x,chunk)
    print "DECDATA LENGTH :", len(decdata)
    streamout.write(decdata)
    streamin.close()
    streamout.close()


p = pyaudio.PyAudio()
streaming(p)
p.terminate()

I have to pass chunk*2 instead of chunk in data.append(streamin.read(chunk*2)); otherwise DECDATA LENGTH == DATA LENGTH*2, and I don't know why.

Output:

DATA LENGTH :    384000
ENCDATA LENGTH : 12865
DECDATA LENGTH : 384000

Without encoding/decoding, the first streamout.write(''.join(data)) works perfectly. With encoding/decoding, streamout.write(decdata) more or less works, but there is a lot of crackling mixed in.

What am I doing wrong here?

Answer

evilbungle · Oct 20, 2013

This appears to be caused by a bug in python-opus in the decode methods.

According to the Opus API, opus_decode returns the number of samples decoded. The Python bindings assume it will completely fill the result buffer they pass in, so silence is appended to each set of decoded samples. This silence causes crackling at low frame sizes and a stutter at higher frame sizes. While the documentation doesn't say anything about it, it appears that the returned number is per channel.
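
As a rough sanity check, the byte counts printed in the question fit this explanation. The sketch below assumes (inferred from those printed lengths, not checked against the library source) that each decode call returns a buffer twice the size of the audio it actually decoded. The constant and function names are made up for the illustration.

```python
BYTES_PER_SAMPLE = 2   # paInt16 is 16-bit PCM
CHANNELS = 1
FRAME_SIZE = 960       # "chunk" in the question: samples per channel per Opus frame
READS = 100            # number of read/encode/decode rounds in the question


def pcm_bytes(samples_per_channel, channels=CHANNELS):
    """Size in bytes of a block of 16-bit PCM."""
    return samples_per_channel * channels * BYTES_PER_SAMPLE


# The question reads chunk*2 frames per call.
captured = READS * pcm_bytes(FRAME_SIZE * 2)

# Each decode call yields FRAME_SIZE samples per channel of real audio.
decoded_audio = READS * pcm_bytes(FRAME_SIZE)

# Assumption: the binding returns a buffer twice that size, the rest being silence.
returned = READS * pcm_bytes(FRAME_SIZE) * 2

print(captured, decoded_audio, returned)  # 384000 192000 384000
```

Under that assumption only half of the 384000 decoded bytes are real audio. Reading chunk frames instead of chunk*2 would capture 192000 bytes and still get 384000 back, which matches the DECDATA LENGTH == DATA LENGTH*2 observation in the question.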

Changing line 150 of opus/api/decoder.py to the following fixes it for me:

    return array.array('h', pcm[:result*channels]).tostring()
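
For anyone on Python 3: array.tostring() was removed in Python 3.9, and tobytes() is the equivalent call. Below is a minimal standalone sketch of the same trimming idea. It does not call libopus; the pcm list is only a hypothetical stand-in for the oversized output buffer, and trim_decoded is a made-up helper name, not part of python-opus.

```python
import array


def trim_decoded(pcm, result, channels):
    """Return only the samples the decoder reported writing, as raw bytes.

    `result` is the per-channel sample count returned by the decode call.
    """
    return array.array('h', pcm[:result * channels]).tobytes()


# Hypothetical stand-in for the output buffer: 960 real samples
# followed by 960 untouched (zero) slots.
frame_size, channels = 960, 1
pcm = [1000] * (frame_size * channels) + [0] * (frame_size * channels)

untrimmed = array.array('h', pcm).tobytes()
trimmed = trim_decoded(pcm, frame_size, channels)

print(len(untrimmed), len(trimmed))  # 3840 1920
```

The untrimmed version carries a block of zeros after every frame, which is the silence described above.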

The decode_float method probably needs the same change if you need to use that.
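
With the decoder trimmed, the chunk*2 workaround in the question should no longer be needed. As far as I can tell it also means only the first 960 samples of each 1920-sample read are encoded, since the encoder is told the frame size is chunk. If you do end up with larger buffers, here is a small sketch for cutting raw PCM bytes into frames of exactly the size the codec expects. split_frames is a hypothetical helper, not part of PyAudio or python-opus.

```python
def split_frames(pcm_bytes, frame_size, channels=1, sample_width=2):
    """Yield consecutive frames of exactly `frame_size` samples per channel.

    A trailing partial frame is dropped, because Opus only accepts
    specific frame durations (960 samples is 20 ms at 48 kHz).
    """
    step = frame_size * channels * sample_width
    usable = len(pcm_bytes) - len(pcm_bytes) % step
    for start in range(0, usable, step):
        yield pcm_bytes[start:start + step]


# Example: one 1920-sample mono read becomes two 960-sample frames.
buffer_1920 = bytes(1920 * 2)
frames = list(split_frames(buffer_1920, 960))
print(len(frames), len(frames[0]))  # 2 1920
```

Each yielded frame could then be passed to enc.encode(frame, chunk) the same way the question's loop does.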