dlsnfl37

Reputation: 23

json.dumps ValueError: Extra data unable to dump json outputs

I am trying to use the Watson Speech to Text API, but when I set interim_results=True I get a ValueError. Please help me :)

with open(join(dirname(__file__), './audio-file.wav'), 'rb') as audio_file:
    print(json.dumps(speech_to_text.recognize(
        audio_file, content_type='audio/wav', timestamps=True, interim_results=True, word_confidence=True), indent=2))

The error output:

However, when I set interim_results=False, I get the expected output.

The output when interim_results=False:

I think the cause is related to the multiple JSON outputs, but I don't know how to solve it, because the call I am making is json.dumps. So I cannot apply the usual fixes for json.loads ValueError cases to this one.

Upvotes: 1

Views: 624

Answers (1)

Nathan Friedly

Reputation: 8166

With interim_results=true, the service sends back multiple JSON blobs, with the expectation that you will parse them individually as they arrive. This is useful if you want to, for example, display near-realtime transcriptions.
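To see why several JSON blobs in one body produce this particular error, here is a minimal sketch using only the standard library. The `body` string is a made-up stand-in for the raw response, not actual Watson output; the point is only that a strict JSON parser stops after the first complete object and complains about whatever follows.

```python
import json

# Hypothetical stand-in for a response body that holds two JSON objects
# back to back, as happens when interim results are streamed.
body = '{"a": 1}\n{"b": 2}'

try:
    json.loads(body)
    error = None
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    error = exc

print(type(error).__name__, error)
```

The message begins with "Extra data" because the parser successfully read the first object and then found more text after it.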

If you're just doing a one-off transcription and don't need to display the text in near-real-time, I would recommend leaving interim_results set to false.

You could split the result around }\s*{ (where one JSON blob ends and the next one begins), and then parse each individual chunk as JSON (restoring the } and {s if necessary), but it wouldn't really gain you anything since the complete final results would already be there.
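If you do want to pull the blobs apart, here is a rough sketch. Note that it swaps the regex split for `json.JSONDecoder.raw_decode`, which reads one object at a time and reports where it stopped, so braces inside strings cannot confuse it. It assumes you already have the raw response body as a string; `iter_json_blobs` and `sample` are hypothetical names for illustration only.

```python
import json


def iter_json_blobs(text):
    """Yield each top-level JSON value found in a concatenated string."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Made-up sample: two objects separated by a newline.
sample = '{"results": [{"final": false}]}\n{"results": [{"final": true}]}'
blobs = list(iter_json_blobs(sample))
print(len(blobs), "blobs parsed")
```

Each yielded value is an ordinary Python object, so `json.dumps(blob, indent=2)` works on them one at a time.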

Alternatively, if you do need or want near-realtime updates, the WebSocket interface makes this a little easier because each JSON chunk arrives in its own message. Check out https://github.com/watson-developer-cloud/speech-to-text-websockets-python for an example.
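As a rough, transport-agnostic sketch of what per-message handling could look like: the function below takes one message's text, parses it, and separates interim from final transcripts. The field layout (`results`, `alternatives`, `transcript`, `final`) is my assumption about the message shape, and `handle_message` and the sample messages are hypothetical; wiring this to an actual WebSocket client is left out.

```python
import json


def handle_message(message, final_transcripts):
    """Parse one JSON message and collect transcripts marked as final."""
    payload = json.loads(message)
    # Messages without a "results" key (e.g. status messages) are skipped.
    for result in payload.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        text = alternatives[0].get("transcript", "")
        if result.get("final"):
            final_transcripts.append(text)
        else:
            print("interim:", text)


# Made-up messages in the assumed shape, one JSON object per message.
messages = [
    '{"state": "listening"}',
    '{"results": [{"alternatives": [{"transcript": "hello"}], "final": false}]}',
    '{"results": [{"alternatives": [{"transcript": "hello world"}], "final": true}]}',
]

finals = []
for message in messages:
    handle_message(message, finals)
print("final:", finals)
```

Because every message is a complete JSON document on its own, a plain `json.loads` per message is enough and no splitting is needed.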

Upvotes: 1
