Android speech recognizing and audio recording in the same time

woyaru picture woyaru · Aug 23, 2011 · Viewed 21.9k times · Source

My application records audio using MediaRecorder class in AsyncTask and also use Google API transform speech to text - Recognizer Intent - using the code from this question : How can I use speech recognition without the annoying dialog in android phones

I have tried also to record audio in Thread, but this is worse solution. It causes more problems. My problem is that my application works properly on emulator. But emulator don't supports speech reocognition because of lack of voice recognition services. And on my device my application has crash when I starts recording audio and speech reognizing - "has stopped unexpectedly". However when I have wifi turned off, application works properly like on emulator.

Recording audio requires in AndroidManifest:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

and speech recognition requiers:

<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.INTERNET" />

I suppose this is problem with single audio input? How can I resolve this problem? Google Speech Recognizer requiers to work in main UI thread, so I can't for example do it in Async Task. So I have audio recording in Async Task. I don't have idea why this causes problems.

I have connected my device to Eclipse and I have used USB debugging. And this is execption I have in LogCat:

08-23 14:50:03.528: ERROR/ActivityThread(12403): Activity go.android.Activity has leaked ServiceConnection android.speech.SpeechRecognizer$Connection@48181340 that was originally bound here
08-23 14:50:03.528: ERROR/ActivityThread(12403): android.app.ServiceConnectionLeaked: Activity go.android.Activity has leaked ServiceConnection android.speech.SpeechRecognizer$Connection@48181340 that was originally bound here
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread$PackageInfo$ServiceDispatcher.<init>(ActivityThread.java:1121)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread$PackageInfo.getServiceDispatcher(ActivityThread.java:1016)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ContextImpl.bindService(ContextImpl.java:951)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.content.ContextWrapper.bindService(ContextWrapper.java:347)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.speech.SpeechRecognizer.startListening(SpeechRecognizer.java:267)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at go.android.Activity.startRecordingAndAnimation(Activity.java:285)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at go.android.Activity.onResume(Activity.java:86)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.Instrumentation.callActivityOnResume(Instrumentation.java:1151)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.Activity.performResume(Activity.java:3823)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.performResumeActivity(ActivityThread.java:3118)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.handleResumeActivity(ActivityThread.java:3143)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.handleLaunchActivity(ActivityThread.java:2684)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.access$2300(ActivityThread.java:125)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread$H.handleMessage(ActivityThread.java:2033)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.os.Handler.dispatchMessage(Handler.java:99)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.os.Looper.loop(Looper.java:123)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at android.app.ActivityThread.main(ActivityThread.java:4627)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at java.lang.reflect.Method.invokeNative(Native Method)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at java.lang.reflect.Method.invoke(Method.java:521)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:858)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:616)
08-23 14:50:03.528: ERROR/ActivityThread(12403):     at dalvik.system.NativeStart.main(Native Method)

And after that another exception:

08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412): Failed to create session
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412): com.google.android.voicesearch.speechservice.ConnectionException: POST failed
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.SpeechServiceHttpClient.post(SpeechServiceHttpClient.java:176)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.SpeechServiceHttpClient.post(SpeechServiceHttpClient.java:88)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.ServerConnectorImpl.createTcpSession(ServerConnectorImpl.java:118)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.ServerConnectorImpl.createSession(ServerConnectorImpl.java:98)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController.runRecognitionMainLoop(RecognitionController.java:679)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController.startRecognition(RecognitionController.java:463)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController.access$200(RecognitionController.java:75)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.RecognitionController$1.handleMessage(RecognitionController.java:300)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.os.Handler.dispatchMessage(Handler.java:99)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.os.Looper.loop(Looper.java:123)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.os.HandlerThread.run(HandlerThread.java:60)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412): Caused by: java.net.SocketTimeoutException
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.harmony.luni.net.PlainSocketImpl.read(PlainSocketImpl.java:564)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.harmony.luni.net.SocketInputStream.read(SocketInputStream.java:88)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:103)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:191)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:82)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:174)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:179)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:235)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:259)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:279)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:121)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:410)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:555)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:487)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:465)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at android.net.http.AndroidHttpClient.execute(AndroidHttpClient.java:243)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     at com.google.android.voicesearch.speechservice.SpeechServiceHttpClient.post(SpeechServiceHttpClient.java:167)
08-23 14:50:08.000: ERROR/ServerConnectorImpl(12412):     ... 10 more
08-23 14:50:08.000: ERROR/RecognitionController(12412): Ignoring error 2

Answer

lsantsan picture lsantsan · Apr 17, 2014

I got a solution that is working well to have speech recognizing and audio recording. Here is the link to a simple Android project I created to show the solution's working. Also, I put some print screens inside the project to illustrate the app.

I'm gonna try to explain briefly the approach I used. I combined two features in that project: Google Speech API and Flac recording.

Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:

"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a “live” chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."

However, this API needs to receive a FLAC sound file to work properly. That makes us to go to the second part: Flac recording

I implemented Flac recording in that project through extracting and adapting some pieces of code and libraries from an open source app called AudioBoo. AudioBoo uses native code to record and play flac format.

Thus, it's possible to record a flac sound, send it to Google Speech API, get the text, and play the sound that was just recorded.

The project I created has the basic principles to make it work and can be improved for specific situations. In order to make it work in a different scenario, it's necessary to get a Google Speech API key, which is obtained by being part of Google Chromium-dev group. I left one key in that project just to show it's working, but I'll remove it eventually. If someone needs more information about it, let me know cause I'm not able to put more than 2 links in this post.