Convert Audio File to text using System.Speech

Kushal Kalambi picture Kushal Kalambi · May 10, 2011 · Viewed 10.1k times · Source

I am looking to convert a .wav file recorded through an android phone at 16000 to text using C#; namely the System.Speech namespace. My code is mentioned below;

recognizer.SetInputToWaveFile(Server.MapPath("~/spoken.wav"));
recognizer.LoadGrammar(new DictationGrammar());
RecognitionResult result = recognizer.Recognize();
label1.Text = result.Text;

The is working perfectly with sample .wav "Hello world" file. However when i record something on teh phone and try to convert to on the pc, the converted text is no where close to what i had recoreded. Is there some way to make sure the audio file is transcribed accurately?

Answer

Michael Levy picture Michael Levy · May 10, 2011

What format is the phone's audio file recorded in? Is the file encoded? Microsoft recognizer supports PCM, ALaw, and ULaw. Make sure you are recording in a supported format. You can look at the RecognizerInfo.SupportedAudioFormats Property - http://msdn.microsoft.com/en-us/library/system.speech.recognition.recognizerinfo.supportedaudioformats(v=VS.90).aspx and check the formats your recognizer version supports.

Did you listen to the file you recorded on your phone? Is it noisy? Does it sound clear? Make sure you are feeding the recognizer the best sounding audio you can.

Since you are using a Dictation grammar, I'm assuming you're using Windows 7. Have you tried training the recognizer? My understanding is that the dictation grammar performance can be improved by training and that the standard Windows 7 speech recognition training will help its performance - http://windows.microsoft.com/en-US/windows7/Set-up-Speech-Recognition

Some other questions on StackOverflow may also give you some insights. See good Speech recognition API to start.