Pietro Passarelli

Reputation: 41

Word timestamps seem to allow a 0-second duration for some words in the results. Is this a bug?

When using the Google Cloud Speech API, the new word-level timestamp (timecode) feature seems to allow a duration of 0 seconds for some words in the results. Here is an example:

...
{ startTime: '48.800s', endTime: '48.800s', word: 'a' },
{ startTime: '48.800s', endTime: '49.200s', word: 'kindly' },
...

Is this a bug?

To test, I used a clip from the audio archive "Arthur the Rat", "USA - General mid-western speaker (Michigan)".
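
For anyone who wants to check their own results for entries like the one above, here is a minimal Python sketch. It assumes the word list has already been loaded as dictionaries with the same startTime, endTime and word keys shown in the example; the helper names parse_offset and zero_duration_words are made up for illustration and are not part of the API.

```python
from decimal import Decimal


def parse_offset(offset: str) -> Decimal:
    """Convert an offset string such as '48.800s' to seconds.

    Decimal is used so that equal strings always compare equal,
    with no floating-point rounding involved.
    """
    if not offset.endswith("s"):
        raise ValueError(f"unexpected offset format: {offset!r}")
    return Decimal(offset[:-1])


def zero_duration_words(words):
    """Return the entries whose start and end offsets are identical."""
    return [
        w for w in words
        if parse_offset(w["startTime"]) == parse_offset(w["endTime"])
    ]


# The two entries quoted in the question.
words = [
    {"startTime": "48.800s", "endTime": "48.800s", "word": "a"},
    {"startTime": "48.800s", "endTime": "49.200s", "word": "kindly"},
]

flagged = zero_duration_words(words)
print([w["word"] for w in flagged])  # ['a']
```

This only reports which words have identical start and end offsets; it does not say anything about why the API returns them that way.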

Upvotes: 4

Views: 168

Answers (2)

NickDGreg

Reputation: 484

David Anderson's answer is correct. I just wanted to elaborate on it, because I initially thought the response only had one-second precision rather than the 100 ms precision the docs describe.

As of July 2018, a request to the Google Cloud Speech API that includes word time offsets returns a response object in which each word result in response.results has the following structure:

start_time {
  seconds: 24
  nanos: 100000000
}
end_time {
  seconds: 24
  nanos: 700000000
}
word: "of"

The nanos field lets you get the start and end times to 100 ms precision. You can obtain them like so:

print(start_time.seconds + start_time.nanos * 1e-9)
print(end_time.seconds + end_time.nanos * 1e-9)

==== Output ====

24.1
24.7
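
If you need to compare or subtract the offsets, integer milliseconds can be easier to work with than floats. The sketch below is only an illustration: the Offset dataclass is a hypothetical stand-in for the start_time / end_time objects in the response (anything with integer seconds and nanos attributes would work), and to_millis is a made-up helper name.

```python
from dataclasses import dataclass


@dataclass
class Offset:
    """Hypothetical stand-in for a start_time / end_time object."""
    seconds: int
    nanos: int


def to_millis(offset: Offset) -> int:
    """Combine seconds and nanos into whole milliseconds (integer math only)."""
    return offset.seconds * 1000 + offset.nanos // 1_000_000


# Values taken from the "of" example above.
start_time = Offset(seconds=24, nanos=100_000_000)
end_time = Offset(seconds=24, nanos=700_000_000)

duration_ms = to_millis(end_time) - to_millis(start_time)
print(to_millis(start_time), to_millis(end_time), duration_ms)  # 24100 24700 600
```

With integers, a word whose start and end offsets are equal gives a duration of exactly 0, so the check does not depend on floating-point rounding.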

Upvotes: 1

David Anderson

Reputation: 11

You can get better than one-second precision using the returned timestamp.

Take the start time from the structure containing the word and compute it as follows:

start_time.seconds + start_time.nanos * 1e-9

Upvotes: 1
