How to split male and female voices from an audio file (in C++ or Java)

manisha chawla · Mar 3, 2009

I want to differentiate between the male and female voices in an audio file and separate them. As output, I want the two voices separated. Can you please help me out, and can the coding be done in Java or C++?

Answer

thomasrutter · Mar 3, 2009

This is potentially a very complicated question, and it is similar to writing your own speech recognition (or identification) algorithm.

You would start by converting the audio into the frequency domain, which is done using a Fast Fourier Transform.
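To make that first step concrete, here is a minimal Java sketch that turns one slice of samples into a list of amplitudes per frequency bin. It uses a naive O(n²) DFT instead of a real FFT so that it stays self-contained; the result is the same, only slower, and for real audio you would swap in an FFT library. The class name `SpectrumSketch`, its method names, and the Hann window are illustrative assumptions, not something the approach above prescribes.

```java
final class SpectrumSketch {

    /**
     * Magnitude spectrum of one frame, bins 0..n/2.
     * Naive DFT with a Hann window (assumed choice) to reduce spectral leakage.
     */
    static double[] magnitudeSpectrum(double[] frame) {
        int n = frame.length;
        double[] mags = new double[n / 2 + 1];
        for (int k = 0; k <= n / 2; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double window = 0.5 - 0.5 * Math.cos(2 * Math.PI * t / (n - 1));
                double angle = 2 * Math.PI * k * t / n;
                re += window * frame[t] * Math.cos(angle);
                im -= window * frame[t] * Math.sin(angle);
            }
            mags[k] = Math.hypot(re, im);
        }
        return mags;
    }

    /** Centre frequency in Hz of bin k for a frame of the given length. */
    static double binToHz(int k, int frameLength, double sampleRate) {
        return k * sampleRate / frameLength;
    }

    /** Index of the loudest bin, ignoring bin 0 (the DC offset). */
    static int peakBin(double[] mags) {
        int best = 1;
        for (int k = 2; k < mags.length; k++) {
            if (mags[k] > mags[best]) {
                best = k;
            }
        }
        return best;
    }
}
```

With 800 samples at 8000 Hz, each bin is 10 Hz wide, so a pure 200 Hz tone should show up as a peak at or next to bin 20. Real speech will show many peaks per frame, which is where the hard part starts.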

Each FFT you take over a slice of time gives you a list of frequencies and their amplitudes. You will somehow need to detect the fundamental tone by analysing the harmonics. The 2nd and 3rd harmonics will often be the clearest. It's very hard to work out which harmonics you are looking at, especially with background noise and the natural variation between people's voices in which harmonics are loudest. You can then try to determine whether the speaker is male or female from whatever you guessed the fundamental tone to be.
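One simple way to guess the fundamental from the harmonics is a harmonic-sum score: for each candidate fundamental, add up the amplitudes at its first few multiples and keep the candidate with the highest total. The Java sketch below does that on a magnitude spectrum and then applies a single cut-off to label the speaker. This is a swapped-in illustration, not a robust detector. The names (`PitchSketch`, `estimateFundamentalHz`, `guessSpeaker`), the 70 to 400 Hz search range, the five harmonics, and the 165 Hz cut-off are all assumptions; the cut-off is based on the commonly quoted typical ranges of roughly 85 to 155 Hz for adult male voices and 165 to 255 Hz for adult female voices, and real voices overlap.

```java
final class PitchSketch {
    // All four constants are assumed values for illustration; tune them on real data.
    static final double MIN_F0_HZ = 70.0;
    static final double MAX_F0_HZ = 400.0;
    static final int HARMONICS = 5;
    static final double SPLIT_HZ = 165.0;

    /**
     * Estimates the fundamental by scoring each candidate bin with the sum of
     * the magnitudes at its first HARMONICS multiples. Works even when the
     * fundamental itself is weak, as long as its harmonics are present.
     *
     * @param mags  magnitude spectrum, index = bin number
     * @param binHz width of one bin in Hz (sampleRate / frameLength)
     * @return estimated fundamental in Hz, or NaN if the spectrum is empty
     */
    static double estimateFundamentalHz(double[] mags, double binHz) {
        int lo = Math.max(1, (int) Math.ceil(MIN_F0_HZ / binHz));
        int hi = Math.min(mags.length - 1, (int) Math.floor(MAX_F0_HZ / binHz));
        int best = -1;
        double bestScore = 0.0;
        for (int k = lo; k <= hi; k++) {
            double score = 0.0;
            for (int h = 1; h <= HARMONICS; h++) {
                int bin = k * h;
                if (bin < mags.length) {
                    score += mags[bin];
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = k;
            }
        }
        return best < 0 ? Double.NaN : best * binHz;
    }

    /** Labels a fundamental using the single assumed cut-off SPLIT_HZ. */
    static String guessSpeaker(double f0Hz) {
        if (Double.isNaN(f0Hz)) {
            return "unknown";
        }
        return f0Hz < SPLIT_HZ ? "male" : "female";
    }
}
```

In practice you would run this per frame and take a median or vote over many frames, since a single frame can easily lock on to the wrong harmonic.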

Keep in mind that during many parts of speech, such as unvoiced sibilants and plosives ('s', 't', etc.), there is no tone, just noise. Your detection will need to be pretty intelligent to cope with that.
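A crude first filter for those toneless stretches is to look at how often the waveform changes sign: noisy sounds cross zero far more often than voiced ones, and near-silent frames can be dropped by loudness. The Java sketch below combines both checks. The class name `VoicingSketch`, the method names, and both thresholds are assumptions for illustration (samples are assumed to be scaled to the range -1 to 1); the right values depend on your sample rate and recordings.

```java
final class VoicingSketch {
    // Assumed thresholds; tune them for your sample rate and recording level.
    static final double MAX_VOICED_ZCR = 0.15;
    static final double MIN_RMS = 0.01;

    /** Fraction of adjacent sample pairs whose signs differ (0.0 to 1.0). */
    static double zeroCrossingRate(double[] frame) {
        if (frame.length < 2) {
            return 0.0;
        }
        int crossings = 0;
        for (int i = 1; i < frame.length; i++) {
            if ((frame[i - 1] >= 0) != (frame[i] >= 0)) {
                crossings++;
            }
        }
        return (double) crossings / (frame.length - 1);
    }

    /** Root-mean-square level of the frame. */
    static double rms(double[] frame) {
        if (frame.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double s : frame) {
            sum += s * s;
        }
        return Math.sqrt(sum / frame.length);
    }

    /** True if the frame is loud enough and crosses zero rarely enough to carry a tone. */
    static boolean looksVoiced(double[] frame) {
        return rms(frame) >= MIN_RMS && zeroCrossingRate(frame) <= MAX_VOICED_ZCR;
    }
}
```

Only frames that pass a check like this are worth feeding into the fundamental-tone estimate; the rest should simply be skipped rather than classified.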

Hope that sets you in the right general direction.

Note: if the two voices are simultaneous and you want to separate them cleanly, then this won't help you. I don't believe anyone alive has solved such a problem.