I am working on a college project that uses speech recognition. I am currently developing it on Windows 7, using the System.Speech API that ships with .NET, and I am writing it in C#.
The problem I am facing is that dictation recognition is not accurate enough. Also, whenever I start my application, desktop speech recognition starts automatically, which is a big nuisance: the words I speak are already not recognized clearly, and the misrecognized speech gets interpreted as commands, so actions like switching or minimizing applications are carried out.
This is a critical part of my app, so I kindly request that you suggest a good speech API other than this Microsoft blunder. It would be fine even if it only understands a simple dictation grammar.
I think desktop recognition is starting because you are using a shared desktop recognizer. You should use an in-proc recognizer scoped to your application only. You do this by instantiating a SpeechRecognitionEngine() in your application.
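In other words (a minimal sketch; both classes live in System.Speech.Recognition, and the variable names are just placeholders):

// SpeechRecognizer is the shared desktop recognizer. It also drives the
// Windows Speech Recognition UI, which is why commands like "minimize" fire.
// SpeechRecognizer sharedRecognizer = new SpeechRecognizer();

// SpeechRecognitionEngine is the in-proc recognizer. It is private to your
// application and does not start desktop speech recognition.
SpeechRecognitionEngine inprocRecognizer = new SpeechRecognitionEngine();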
Since you are using the dictation grammar and the desktop Windows recognizer, I believe it can be trained by the speaker to improve its accuracy. Go through the Windows 7 recognizer training and see if the accuracy improves.
To get started with .NET speech, there is a very good article that was published a few years ago at http://msdn.microsoft.com/en-us/magazine/cc163663.aspx. It is probably the best introductory article I've found so far. It is a little out of date, but very helpful. (The AppendResultKeyValue method was dropped after the beta.)
Here is a quick sample showing one of the simplest .NET Windows Forms apps using a dictation grammar that I could think of. It should work on Windows Vista or Windows 7. I created a form, dropped a button on it and made the button big, added a reference to System.Speech, and added the line:
using System.Speech.Recognition;
Then I added the following event handler to button1:
private void button1_Click(object sender, EventArgs e)
{
    // In-proc recognizer: private to this application, unlike the shared desktop recognizer.
    SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine();
    Grammar dictationGrammar = new DictationGrammar();
    recognizer.LoadGrammar(dictationGrammar);
    try
    {
        button1.Text = "Speak Now";
        recognizer.SetInputToDefaultAudioDevice();
        // Recognize() blocks until speech is recognized or the timeout expires;
        // it returns null if nothing was recognized.
        RecognitionResult result = recognizer.Recognize();
        button1.Text = (result != null) ? result.Text : "No speech recognized.";
    }
    catch (InvalidOperationException exception)
    {
        button1.Text = String.Format("Could not recognize input from default audio device. Is a microphone or sound card available?\r\n{0} - {1}.", exception.Source, exception.Message);
    }
    finally
    {
        recognizer.UnloadAllGrammars();
        recognizer.Dispose();
    }
}
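Note that Recognize() is synchronous, so the form freezes while the engine listens. If you want the UI to stay responsive, the same engine supports event-driven recognition. Here is a rough sketch, assuming recognizer is kept as a form-level field rather than a local variable:

recognizer.LoadGrammar(new DictationGrammar());
recognizer.SetInputToDefaultAudioDevice();
recognizer.SpeechRecognized += (s, args) =>
{
    // Marshal back to the UI thread in case the event fires on a worker thread.
    button1.BeginInvoke((Action)(() => button1.Text = args.Result.Text));
};
recognizer.RecognizeAsync(RecognizeMode.Multiple);  // keeps listening until RecognizeAsyncStop() is called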
A little more information comparing the various flavors of speech engines and APIs shipped by Microsoft can be found at What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?