Channels and Sample rates using python and pyaudio

Richard · Sep 14, 2015 · Viewed 11.5k times

I am trying to record and play back some audio using Python and PyAudio. I am using a microphone connected to a Raspberry Pi, with a C-Media Electronics, Inc. CM108 Audio Controller set as the default device.

This device only records in mono.

    0 - USB PnP Sound Device: USB Audio (hw:0,0)
    {'defaultSampleRate': 44100.0, 'defaultLowOutputLatency': 0.011609977324263039, 'defaultLowInputLatency': 0.011609977324263039, 'maxInputChannels': 1L, 'structVersion': 2L, 'hostApi': 0L, 'index': 0, 'defaultHighOutputLatency': 0.046439909297052155, 'maxOutputChannels': 2L, 'name': u'USB PnP Sound Device: USB Audio (hw:0,0)', 'defaultHighInputLatency': 0.046439909297052155}

Recording Code

    import pyaudio, wave, sys
    CHUNK = 8192
    FORMAT = pyaudio.paInt16
    CHANNELS = 1
    RATE = 44100
    RECORD_SECONDS = 10
    WAVE_OUTPUT_FILENAME = 'Audio_.wav'
    p = pyaudio.PyAudio()
    stream = p.open(format=FORMAT,
             channels = CHANNELS,
             rate = RATE,
             input = True,
             input_device_index = 0,
             frames_per_buffer = CHUNK)
    print("* recording")
    frames = []
    for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
        data = stream.read(CHUNK)
        frames.append(data)
    print("* done recording")
    stream.stop_stream()    # Stop audio recording
    stream.close()          # Close the audio stream
    p.terminate()           # Shut down the audio system

    wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(p.get_sample_size(FORMAT))
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))
    wf.close()

I can play this fine using aplay, which reports the following:

    Playing WAVE 'Audio_.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Mono

But when I try to play it from Python with PyAudio using the code below, my problems begin.

Playback Code

    import pyaudio
    import wave
    import sys
    import time
    output_device_index = 0
    CHUNK = 1024
    if len(sys.argv) < 2:
        print("Plays a wave file.\n\nUsage: %s filename.wav" 
        % sys.argv[0])
        sys.exit(-1)

    wf = wave.open(sys.argv[1], 'rb')

    # instantiate PyAudio (1)
    p = pyaudio.PyAudio()
    def callback(in_data, frame_count, time_info, status):
        data = wf.readframes(frame_count)
        return (data, pyaudio.paContinue)

    stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
            channels=wf.getnchannels(),
            rate=wf.getframerate(),
            output=True,
            output_device_index=output_device_index,
            stream_callback=callback)

    data = wf.readframes(CHUNK)
    while stream.is_active():
        time.sleep(0.1)

    stream.stop_stream()
    stream.close()
    wf.close()
    p.terminate()

The audio that I get is played at the wrong sample rate, so I sound like something from Alvin and the Chipmunks, and it has lots of humming / buzzing. I think this is because the C-Media USB sound card cannot play a mono stream natively.

When using aplay, specifying plughw:0,0 fixes this. I have set my .asoundrc as follows so I don't have to specify this when using aplay anymore.

    pcm.plug0 {
        type plug
        slave {
            pcm "hw:0,0"
        }
    }

But this doesn't help when using Python to play the audio file. Please can someone point me in the right direction?

Answer

Bas van Dijk · Oct 19, 2015

You can try setting the number of channels in the output stream to two. You would then have to duplicate every sample (every 2 bytes for 16-bit audio) so that the same data goes to both channels.

Assuming your sample width is 2 (16-bit audio), the stream you get from your WAV file (as a string of bytes) will look like this:

    B1a B1b B2a B2b B3a B3b ... etc

What you need to stream down is this (provided you want output on both channels):

    B1a B1b B1a B1b B2a B2b B2a B2b B3a B3b B3a B3b ... etc
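The duplication described above could be sketched roughly as follows. The helper name `mono_to_stereo` and its `width` parameter are made up for this illustration and are not part of PyAudio; the sketch assumes the input is a byte string of whole mono samples, such as the one returned by `wf.readframes()`.

```python
def mono_to_stereo(data, width=2):
    """Repeat every `width`-byte mono sample so it lands on both channels.

    Hypothetical helper for illustration; `width` is the sample width in
    bytes (2 for 16-bit audio).
    """
    if len(data) % width:
        raise ValueError("data length is not a whole number of samples")
    out = bytearray()
    for i in range(0, len(data), width):
        sample = data[i:i + width]
        out += sample  # left channel
        out += sample  # right channel
    return bytes(out)
```

In the playback callback from the question, this would presumably be applied to the result of `wf.readframes(frame_count)` before returning it, with the stream opened with `channels=2` instead of `wf.getnchannels()`.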

If you send the original mono stream to a stereo device unchanged, it will play at twice the pitch: the even-numbered samples go to the left channel and the odd-numbered samples go to the right, so each channel receives only half of the samples.
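As a rough check of the "twice the pitch" reasoning, the arithmetic can be worked through with the question's settings (16-bit samples at 44100 Hz, a nominal 10-second recording). The function name `playback_seconds` is invented for this sketch.

```python
def playback_seconds(n_bytes, width, channels, rate):
    """Duration of raw PCM data when interpreted with the given layout."""
    frames = n_bytes // (width * channels)  # one frame = one sample per channel
    return frames / float(rate)

# Nominal 10 s of mono 16-bit audio at 44100 Hz
mono_bytes = 10 * 44100 * 2

as_mono = playback_seconds(mono_bytes, 2, 1, 44100)    # 10.0 seconds
as_stereo = playback_seconds(mono_bytes, 2, 2, 44100)  # 5.0 seconds
```

The same bytes last half as long when read as stereo frames, which is why the unmodified mono data sounds sped up on a two-channel stream.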