I'm making an asynchronous request to the Google Cloud Speech API, and I do not know how to get the result of the operation:
Request POST: https://speech.googleapis.com/v1beta1/speech:asyncrecognize
Body:
{
  "config": {
    "languageCode": "pt-BR",
    "encoding": "LINEAR16",
    "sampleRate": 16000
  },
  "audio": {
    "uri": "gs://bucket/audio.flac"
  }
}
Which returns:
{ "name": "469432517" }
So, I do a POST: https://speech.googleapis.com/v1beta1/operations/469432517
Which returns:
{
  "name": "469432517",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeMetadata",
    "progressPercent": 100,
    "startTime": "2016-08-11T21:18:29.985053Z",
    "lastUpdateTime": "2016-08-11T21:18:31.888412Z"
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeResponse"
  }
}
I need to get the result of the operation: the transcribed text.
How can I do that?
You already have the result of the operation, and it is empty. The reason for the empty result is a format mismatch: your config declares "LINEAR16" (uncompressed PCM data, basically a WAV file), but the file you submitted is FLAC (a compressed format).
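One way to avoid this kind of mismatch is to derive the encoding from the file itself rather than typing it by hand. The sketch below is only an illustration: `detect_encoding` and `build_request` are hypothetical helper names, the header bytes are assumed to have been read from the start of the audio file beforehand, and it assumes the API accepts "FLAC" as an encoding value alongside "LINEAR16".

```python
import json


def detect_encoding(header: bytes) -> str:
    """Guess the encoding name from the first bytes of an audio file (hypothetical helper)."""
    if header[:4] == b"fLaC":
        # Native FLAC streams start with the marker "fLaC".
        return "FLAC"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        # WAV usually holds uncompressed PCM, but that is not guaranteed.
        return "LINEAR16"
    raise ValueError("unrecognized audio container")


def build_request(uri: str, header: bytes, sample_rate: int, language_code: str) -> dict:
    """Build a request body in the same shape as the one in the question."""
    return {
        "config": {
            "languageCode": language_code,
            "encoding": detect_encoding(header),
            "sampleRate": sample_rate,
        },
        "audio": {"uri": uri},
    }


# Example header bytes for a FLAC file; in practice, read them from the real file.
body = build_request("gs://bucket/audio.flac", b"fLaC\x00\x00\x00\x22", 16000, "pt-BR")
print(json.dumps(body, indent=2))
```

With the FLAC header above, the generated config carries "FLAC" instead of "LINEAR16", so the declared encoding matches the submitted file.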
Other possible reasons for an empty result are an incorrect sample rate, an incorrect number of channels, and so on.
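If the audio is a WAV file on your machine, these properties can be checked before uploading. The following is a rough sketch using Python's standard `wave` module; `wav_mismatches` is a hypothetical helper, and the default of one channel is an assumption that the API expects mono audio. The demo builds a small WAV in memory so that it runs without any real file.

```python
import io
import wave


def wav_mismatches(wav_file, expected_rate: int, expected_channels: int = 1) -> list:
    """List differences between a WAV file's header and the request config."""
    found = []
    with wave.open(wav_file, "rb") as w:
        if w.getframerate() != expected_rate:
            found.append(f"sample rate is {w.getframerate()}, config says {expected_rate}")
        if w.getnchannels() != expected_channels:
            found.append(f"file has {w.getnchannels()} channels, expected {expected_channels}")
        if w.getsampwidth() != 2:
            # LINEAR16 means 16-bit samples, i.e. 2 bytes per sample.
            found.append(f"sample width is {w.getsampwidth()} bytes, LINEAR16 needs 2")
    return found


# Demo: an in-memory stereo 44.1 kHz WAV, checked against a 16 kHz mono config.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x00\x00" * 2 * 100)  # 100 silent stereo frames
buf.seek(0)

problems = wav_mismatches(buf, expected_rate=16000)
for p in problems:
    print(p)
```

For a real file, pass its path instead of the in-memory buffer. An empty list means the header agrees with the config, which rules out these causes but not the others.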
Finally, a file that contains only silence will also produce an empty response.
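Once the request is fixed, the transcribed text should appear inside the "response" object of the operation. As a minimal sketch, assuming the response carries a "results" list whose items each hold "alternatives" with a "transcript" field, the text could be pulled out like this. `extract_transcripts` is a hypothetical helper, and the populated operation below is made-up sample data, not real API output.

```python
def extract_transcripts(operation: dict) -> list:
    """Return the top transcript of each result in a finished operation."""
    if not operation.get("done"):
        raise RuntimeError("operation is still running; poll again later")
    if "error" in operation:
        raise RuntimeError(f"operation failed: {operation['error']}")
    # An empty response simply has no "results" key, so default to an empty list.
    results = operation.get("response", {}).get("results", [])
    return [r["alternatives"][0]["transcript"] for r in results if r.get("alternatives")]


# The operation from the question: done, but with no results.
empty_op = {
    "name": "469432517",
    "done": True,
    "response": {
        "@type": "type.googleapis.com/google.cloud.speech.v1beta1.AsyncRecognizeResponse"
    },
}

# Hypothetical populated operation, for illustration only.
filled_op = {
    "name": "0000000000",
    "done": True,
    "response": {
        "results": [
            {"alternatives": [{"transcript": "olá mundo", "confidence": 0.9}]}
        ]
    },
}

print(extract_transcripts(empty_op))
print(extract_transcripts(filled_op))
```

An empty list from this helper corresponds to the situation in the question: the operation finished, but nothing was recognized.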