This is my first post on Stack. So far this site has been very helpful, but I am a novice and need a clear explanation of my problem, which is related to pitch-shifting audio in Python. I currently have the following modules installed: numpy, scipy, pygame, and the scikits "samplerate" API.
My goal is to take a stereo file and play it back at a different pitch in as few steps as possible. Currently, I load the file into an array using pygame.sndarray, then apply a samplerate conversion using scikits.samplerate.resample, then convert the output back to a sound object for playback using pygame. The problem is garbage audio comes out of my speakers. Surely I'm missing a few steps (in addition to not knowing anything about math and audio).
Thanks.
import time, numpy, pygame.mixer, pygame.sndarray
from scikits.samplerate import resample
pygame.mixer.init(44100,-16,2,4096)
# choose a file and make a sound object
sound_file = "tone.wav"
sound = pygame.mixer.Sound(sound_file)
# load the sound into an array
snd_array = pygame.sndarray.array(sound)
# resample. args: (target array, ratio, mode), outputs ratio * target array.
# this outputs a bunch of garbage and I don't know why.
snd_resample = resample(snd_array, 1.5, "sinc_fastest")
# take the resampled array, make it an object and stop playing after 2 seconds.
snd_out = pygame.sndarray.make_sound(snd_resample)
snd_out.play()
time.sleep(2)
Your problem is that pygame works with numpy.int16 arrays, but the call to resample returns a numpy.float32 array:
>>> snd_array.dtype
dtype('int16')
>>> snd_resample.dtype
dtype('float32')
You can convert the result of resample to numpy.int16 using astype:
>>> snd_resample = resample(snd_array, 1.5, "sinc_fastest").astype(snd_array.dtype)
With this modification, your Python script plays the tone.wav file nicely, at a lower pitch and a lower speed: a ratio of 1.5 produces 1.5 times as many samples, which are still played back at 44100 Hz.
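One caveat with a bare astype: it truncates toward zero, and it does not guard against values outside the int16 range, which sinc interpolation can produce by overshooting near loud peaks. Below is a minimal sketch of a more defensive conversion, assuming you want rounding and clipping before the cast. The helper name to_int16 and the sample values are made up for illustration; the small array only stands in for the float32 output of resample.

```python
import numpy as np

def to_int16(samples):
    """Round float samples and clip them to the int16 range before casting."""
    info = np.iinfo(np.int16)
    # Round to the nearest integer, then limit to [-32768, 32767].
    clipped = np.clip(np.rint(samples), info.min, info.max)
    return clipped.astype(np.int16)

# Hypothetical stand-in for the float32 array returned by resample();
# two of the values deliberately fall outside the int16 range.
resampled = np.array([[0.4, -1.6], [40000.0, -40000.0]], dtype=np.float32)

converted = to_int16(resampled)
print(converted.dtype)     # int16
print(converted.tolist())  # [[0, -2], [32767, -32768]]
```

In the script from the question, this would replace the astype call, for example snd_resample = to_int16(resample(snd_array, 1.5, "sinc_fastest")), before passing the array to pygame.sndarray.make_sound.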