Reputation: 11
When running a speech to text api request from Google cloud services (over 60s audio so i need to use the long_running_recognize function, as well as retrieve the audio from a Cloud Storage Bucket), i properly get a text response, but i cannot iterate through the LongRunningResponse object that is returned, which renders the info inside semi useless.
When using just the "client.recognize()" function, i get a similar response to the long running response, except when i check for the results in the short form, i can iterate through the object just fine, contrary to the long response.
I run nearly identical parameters through each recognize function (a 1m40s long audio for long running, and a 30s for the short recognize, both from my cloud bucket).
short_response = client.recognize(config=config, audio=audio_uri)
subs_list = []
for result in short_response.results:
for alternative in result.alternatives:
for word in alternative.words:
if not word.start_time:
start = 0
else:
start = word.start_time.total_seconds()
end = word.end_time.total_seconds()
t = word.word
subs_list.append( ((float(start),float(end)), t) )
print(subs_list)
Above function works fine, the ".results" attribute correctly returns objects that i can further gain attributes from and iterate through. I use the for loops to create subtitles for a video. I then try a similar thing on the long_running_recognize, and get this:
long_response = client.long_running_recognize(config=config, audio=audio_uri)
#1
print(long_response.results)
#2
print(long_response.result())
Output from #1 returns error: AttributeError: 'Operation' object has no attribute 'results'. Did you mean: 'result'?
Output from #2 returns the info i need, but when checking "type(long_response.result())" i get: <class 'google.cloud.speech_v1.types.cloud_speech.LongRunningRecognizeResponse'>
Which i suppose is not an iterable object, and i cannot figure out how to apply a similar process as i do to the recognize function to gain subtitles the way i need.
Upvotes: 1
Views: 125
Reputation: 2183
It took me some time to figure out, you will have to parse the response yourself. Here is the code:
def serialize_response(response):
result_dict = {
"results": []
}
for result in response.results:
# Each result contains a list of alternatives
alternatives = [
{
"transcript": alternative.transcript,
"confidence": alternative.confidence,
"words": [{
"start_time": word.start_time.seconds,
"end_time": word.end_time.seconds,
"word": word.word,
"speaker_tag": word.speaker_tag
} for word in alternative.words]
} for alternative in result.alternatives
]
result_dict["results"].append({
"alternatives": alternatives
})
return result_dict
Upvotes: 1