I am looking to convert a .wav file recorded on an Android phone at 16000 Hz to text using C#, specifically the System.Speech namespace. My code is below:
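// Requires: using System.Speech.Recognition;
SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine();
recognizer.SetInputToWaveFile(Server.MapPath("~/spoken.wav"));
recognizer.LoadGrammar(new DictationGrammar());
// Recognize() returns null when nothing could be recognized.
RecognitionResult result = recognizer.Recognize();
label1.Text = result != null ? result.Text : string.Empty;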
This works perfectly with a sample "Hello world" .wav file. However, when I record something on the phone and try to convert it on the PC, the converted text is nowhere close to what I recorded. Is there some way to make sure the audio file is transcribed accurately?
What format is the phone's audio file recorded in? Is the file encoded? The Microsoft recognizer supports PCM, ALaw, and ULaw. Make sure you are recording in a supported format. You can look at the RecognizerInfo.SupportedAudioFormats property (http://msdn.microsoft.com/en-us/library/system.speech.recognition.recognizerinfo.supportedaudioformats(v=VS.90).aspx) to check which formats your recognizer version supports.
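As a rough way to answer both questions at once, the sketch below prints the format stored in the recording's header and then lists the formats each installed recognizer reports. It assumes the phone's file really is a RIFF/WAVE container and that the path is passed as the first command-line argument; the class name WavFormatCheck and the helper PrintWavFormat are made up for illustration.

```csharp
using System;
using System.IO;
using System.Speech.AudioFormat;
using System.Speech.Recognition;
using System.Text;

static class WavFormatCheck
{
    // Hypothetical helper: prints the "fmt " chunk of a RIFF/WAVE file.
    static void PrintWavFormat(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            string riff = Encoding.ASCII.GetString(reader.ReadBytes(4));
            reader.ReadUInt32(); // overall RIFF size, not needed here
            string wave = Encoding.ASCII.GetString(reader.ReadBytes(4));
            if (riff != "RIFF" || wave != "WAVE")
            {
                Console.WriteLine("Not a RIFF/WAVE file.");
                return;
            }

            // Walk the chunks until the "fmt " chunk is found.
            while (reader.BaseStream.Position + 8 <= reader.BaseStream.Length)
            {
                string chunkId = Encoding.ASCII.GetString(reader.ReadBytes(4));
                uint chunkSize = reader.ReadUInt32();
                if (chunkId == "fmt ")
                {
                    ushort formatTag = reader.ReadUInt16(); // 1 = PCM, 6 = A-law, 7 = mu-law
                    ushort channels = reader.ReadUInt16();
                    uint sampleRate = reader.ReadUInt32();
                    reader.ReadUInt32();                    // average bytes per second
                    reader.ReadUInt16();                    // block align
                    ushort bitsPerSample = reader.ReadUInt16();
                    Console.WriteLine("File: tag {0}, {1} Hz, {2} bit, {3} channel(s)",
                        formatTag, sampleRate, bitsPerSample, channels);
                    return;
                }
                // Skip this chunk; chunks are padded to an even number of bytes.
                reader.BaseStream.Seek(chunkSize + (chunkSize & 1), SeekOrigin.Current);
            }
            Console.WriteLine("No fmt chunk found.");
        }
    }

    static void Main(string[] args)
    {
        if (args.Length > 0)
        {
            PrintWavFormat(args[0]);
        }

        // List what each installed recognizer says it accepts.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine("{0} ({1})", info.Name, info.Culture);
            foreach (SpeechAudioFormatInfo format in info.SupportedAudioFormats)
            {
                Console.WriteLine("  {0}, {1} Hz, {2} bit, {3} channel(s)",
                    format.EncodingFormat, format.SamplesPerSecond,
                    format.BitsPerSample, format.ChannelCount);
            }
        }
    }
}
```

If the file's format tag or sample rate does not appear in the recognizer's list, converting the recording before recognition is probably the first thing to try.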
Did you listen to the file you recorded on your phone? Is it noisy? Does it sound clear? Make sure you are feeding the recognizer the best sounding audio you can.
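Besides listening to the file yourself, you can ask the recognizer what it thinks of the audio. The sketch below subscribes to the AudioSignalProblemOccurred event and prints the confidence of the result. It assumes the event is raised during a synchronous Recognize() call on file input, which I have not verified; the class name AudioQualityCheck and the default path "spoken.wav" are placeholders.

```csharp
using System;
using System.Speech.Recognition;

static class AudioQualityCheck
{
    static void Main(string[] args)
    {
        // Placeholder default; pass the real recording as the first argument.
        string wavPath = args.Length > 0 ? args[0] : "spoken.wav";

        using (var recognizer = new SpeechRecognitionEngine())
        {
            recognizer.LoadGrammar(new DictationGrammar());

            // Reports problems such as TooNoisy, TooSoft, or TooFast.
            recognizer.AudioSignalProblemOccurred += (sender, e) =>
                Console.WriteLine("Signal problem: {0} at {1} (level {2})",
                    e.AudioSignalProblem, e.AudioPosition, e.AudioLevel);

            recognizer.SetInputToWaveFile(wavPath);
            RecognitionResult result = recognizer.Recognize();

            if (result == null)
            {
                Console.WriteLine("Nothing recognized.");
            }
            else
            {
                // Confidence is a relative score between 0 and 1.
                Console.WriteLine("{0} (confidence {1:F2})", result.Text, result.Confidence);
            }
        }
    }
}
```

Frequent signal problems or consistently low confidence would suggest the recording itself is the issue rather than the grammar.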
Since you are using a Dictation grammar, I'm assuming you're using Windows 7. Have you tried training the recognizer? My understanding is that the dictation grammar performance can be improved by training and that the standard Windows 7 speech recognition training will help its performance - http://windows.microsoft.com/en-US/windows7/Set-up-Speech-Recognition
Some other questions on StackOverflow may also give you some insights. See "good Speech recognition API" to start.