I'm capturing audio from my mic with PyAudio and trying to encode/decode it with the Opus codec. I'm using the libopus bindings written by SvartalF ( https://github.com/svartalf/python-opus ).
Here is my code:
import pyaudio
from opus import encoder, decoder

def streaming(p):
    chunk = 960
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 48000
    streamin = p.open(format = FORMAT,
                      channels = CHANNELS,
                      rate = RATE,
                      input = True,
                      input_device_index = 7,
                      frames_per_buffer = chunk)
    streamout = p.open(format = FORMAT,
                       channels = CHANNELS,
                       rate = 48000,
                       output = True,
                       output_device_index = p.get_default_input_device_info()["index"],
                       frames_per_buffer = chunk)
    enc = encoder.Encoder(RATE,CHANNELS,'voip')
    dec = decoder.Decoder(RATE,CHANNELS)
    data = []
    for i in xrange(100):
        data.append(streamin.read(chunk*2))
    streamout.write(''.join(data))
    encdata = []
    for x in data:
        encdata.append(enc.encode(x,chunk))
    print "DATA LENGTH :", len(''.join(data))
    print "ENCDATA LENGTH :", len(''.join(encdata))
    decdata = ''
    for x in encdata:
        decdata += dec.decode(x,chunk)
    print "DECDATA LENGTH :", len(decdata)
    streamout.write(decdata)
    streamin.close()
    streamout.close()

p = pyaudio.PyAudio()
streaming(p)
p.terminate()
I have to pass chunk*2 instead of chunk in data.append(streamin.read(chunk*2)); otherwise DECDATA LENGTH == DATA LENGTH*2, and I don't know why.
Output:
DATA LENGTH : 384000
ENCDATA LENGTH : 12865
DECDATA LENGTH : 384000
Without encoding/decoding, the first streamout.write(''.join(data)) works perfectly. With encoding/decoding, streamout.write(decdata) mostly works but has a lot of crackling mixed in.
What am I doing wrong here?
This appears to be caused by a bug in the decode methods of python-opus.
According to the Opus API, opus_decode returns the number of samples decoded. The Python bindings assume that the call completely fills the result buffer they pass in, so silence is appended to each set of decoded samples. This silence causes the crackling at small frame sizes and a stutter at larger frame sizes. The documentation doesn't say so explicitly, but the returned number appears to be per channel.
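To illustrate the effect, here is a minimal, self-contained sketch that does not call libopus at all. The fake_decode function is a hypothetical stand-in for opus_decode, and the buffer size (twice the samples of one frame) is an assumption chosen to match the doubled DECDATA LENGTH reported in the question, not something taken from the library source. It is written for Python 3, where array.tobytes() replaces tostring().

```python
import array
import ctypes

FRAME_SIZE = 960   # samples per channel, as in the question
CHANNELS = 1


def fake_decode(pcm, frame_size, channels):
    """Hypothetical stand-in for opus_decode.

    Writes frame_size * channels samples into pcm and returns the
    number of samples decoded per channel.
    """
    for i in range(frame_size * channels):
        pcm[i] = 1000  # non-zero marker so trailing silence is easy to spot
    return frame_size


# Assumption: the buffer has room for twice the samples of one frame,
# so the second half is never written and stays zero (silence).
pcm_size = FRAME_SIZE * CHANNELS * ctypes.sizeof(ctypes.c_int16)
pcm = (ctypes.c_int16 * pcm_size)()

result = fake_decode(pcm, FRAME_SIZE, CHANNELS)

# Converting the whole buffer keeps the untouched tail.
whole = array.array('h', pcm).tobytes()

# Trimming to the returned sample count keeps only real audio.
trimmed = array.array('h', pcm[:result * CHANNELS]).tobytes()

print(len(whole), len(trimmed))
```

Under these assumptions each untrimmed packet is 3840 bytes, half of it zeros, and 100 such packets add up to the 384000 bytes of DECDATA LENGTH shown in the question.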
Changing line 150 of opus/api/decoder.py to the following fixes it for me:
return array.array('h', pcm[:result*channels]).tostring()
The decode_float method probably needs the same change if you use it.
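If patching the installed library is not convenient, the same trimming could be done on the caller's side. The sketch below is only an illustration: trim_decoded is a hypothetical helper, not part of python-opus, and it assumes every packet decodes to exactly frame_size samples per channel, with 2 bytes per sample for decode and 4 bytes per sample for decode_float. It is written for Python 3.

```python
def trim_decoded(pcm_bytes, frame_size, channels, sample_width=2):
    """Keep only the bytes that hold frame_size samples per channel.

    Hypothetical helper: sample_width is 2 for 16-bit PCM and is
    assumed to be 4 for float PCM.
    """
    expected = frame_size * channels * sample_width
    return pcm_bytes[:expected]


# Simulated decoder output: one frame of audio followed by silence.
audio = b'\x01' * 1920
padded = audio + b'\x00' * 1920

clean = trim_decoded(padded, frame_size=960, channels=1)
print(len(padded), len(clean))
```

In the question's loop this would amount to wrapping each dec.decode(x, chunk) result as trim_decoded(dec.decode(x, chunk), chunk, CHANNELS) before appending it to decdata.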