As you probably know, implementing speech-to-text is pretty easy with the Android API: you just fire the API's intent and it returns the text for you. My case is a bit different. I have a prerecorded 3GPP sound file, recorded from the user and saved on the SD card, and I want to know whether it's possible to transcribe it into text like any other speech recognition input. Does the speech-to-text API allow uploading your own sound files to be processed, or is this impossible?
The API does not allow it, but see this blog post and its comments for a potential workaround. Also make sure that your file contains high-quality audio (at least 16-bit samples at a 16 kHz sample rate) to get a better transcription.
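To make the quality check concrete, here is a minimal sketch in plain Java that tests whether a recording meets the 16-bit / 16 kHz threshold. It rests on several assumptions that are not part of the answer above: the audio has already been converted from 3GPP to PCM WAV, the file uses the canonical 44-byte header layout (real files can carry extra chunks that shift these offsets), and the names `WavQualityCheck`, `meetsMinimumQuality`, and `buildHeader` are hypothetical, made up for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

public class WavQualityCheck {
    // Thresholds taken from the advice above.
    public static final int MIN_SAMPLE_RATE_HZ = 16_000;
    public static final int MIN_BITS_PER_SAMPLE = 16;

    /** Returns true if the 4 ASCII bytes at the given offset equal the tag. */
    private static boolean hasTag(byte[] data, int offset, String tag) {
        byte[] expected = tag.getBytes(StandardCharsets.US_ASCII);
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Checks a canonical 44-byte PCM WAV header (assumption: the "fmt " chunk
     * directly follows "WAVE", so sample rate sits at byte 24 and bits per
     * sample at byte 34).
     */
    public static boolean meetsMinimumQuality(byte[] header) {
        if (header == null || header.length < 44) {
            return false;
        }
        if (!hasTag(header, 0, "RIFF") || !hasTag(header, 8, "WAVE")
                || !hasTag(header, 12, "fmt ")) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        int sampleRate = buf.getInt(24);
        int bitsPerSample = buf.getShort(34) & 0xFFFF;
        return sampleRate >= MIN_SAMPLE_RATE_HZ
                && bitsPerSample >= MIN_BITS_PER_SAMPLE;
    }

    /** Builds a mono PCM header with no audio data, for demonstration only. */
    public static byte[] buildHeader(int sampleRate, int bitsPerSample) {
        ByteBuffer buf = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        buf.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(36);                                  // chunk size (empty data)
        buf.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        buf.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(16);                                  // PCM fmt chunk size
        buf.putShort((short) 1);                         // audio format: PCM
        buf.putShort((short) 1);                         // channels: mono
        buf.putInt(sampleRate);
        buf.putInt(sampleRate * bitsPerSample / 8);      // byte rate (mono)
        buf.putShort((short) (bitsPerSample / 8));       // block align (mono)
        buf.putShort((short) bitsPerSample);
        buf.put("data".getBytes(StandardCharsets.US_ASCII));
        buf.putInt(0);                                   // data size
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(meetsMinimumQuality(buildHeader(16_000, 16)));
        System.out.println(meetsMinimumQuality(buildHeader(8_000, 16)));
    }
}
```

In practice you would read the first 44 bytes of the converted file and pass them to `meetsMinimumQuality`. A check like this only confirms the container's declared format; audio upsampled from a low-quality source will pass it without transcribing any better.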
See also: