Reputation: 2311
I am synthesising speech using the Google Cloud APIs. I have the following information about the speech synthesis response.
The response from the API is a byte array. Given this information, how could I approximate or accurately compute the length of the synthesised audio?
Upvotes: 1
Views: 3097
Reputation: 163538
You don't have enough information to compute the duration of audio.
MP3 is a lossy codec, and can operate at a number of different bitrates. In fact, that bitrate can change throughout the file. Making things worse, MP3 doesn't have any inherent timestamping in its usual format. The only real way to accurately know its length is to decode it.
Alternatively, if you know the bitrate, you can divide the file size (in bits) by the bitrate (in bits per second) to get an approximate length. If you can assume a constant bitrate throughout the file, you can get the bitrate by reading the header of the first frame. See also: http://mpgedit.org/mpgedit/mpeg_format/mpeghdr.htm
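As a rough sketch of that second approach in Python, assuming the byte array is a plain MPEG Layer III stream with a constant bitrate: the function names here are my own, not part of any Google API, and the bitrate tables are the standard Layer III ones from the header layout linked above. Any ID3 tag or other non-audio bytes in the response count towards the size, so treat the result as an estimate only.

```python
# Layer III bitrate tables in kbit/s, indexed by the 4-bit bitrate index
# from the frame header (index 0 = "free", index 15 = invalid).
MPEG1_L3 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
MPEG2_L3 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160]


def first_frame_bitrate(data: bytes) -> int:
    """Return the bitrate (bit/s) declared by the first Layer III frame header."""
    for i in range(len(data) - 3):
        b0, b1, b2 = data[i], data[i + 1], data[i + 2]
        # A frame header starts with 11 set bits (the sync word).
        if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
            continue
        version = (b1 >> 3) & 0x03      # 3 = MPEG1, 2 = MPEG2, 0 = MPEG2.5, 1 = reserved
        layer = (b1 >> 1) & 0x03        # 1 = Layer III
        bitrate_index = b2 >> 4
        sample_rate_index = (b2 >> 2) & 0x03
        # Skip false sync matches and values this sketch does not handle.
        if version == 1 or layer != 1:
            continue
        if bitrate_index in (0, 15) or sample_rate_index == 3:
            continue
        table = MPEG1_L3 if version == 3 else MPEG2_L3
        return table[bitrate_index] * 1000
    raise ValueError("no MPEG Layer III frame header found")


def estimate_duration_seconds(data: bytes) -> float:
    """Approximate duration, assuming the whole buffer is constant-bitrate audio."""
    return len(data) * 8 / first_frame_bitrate(data)
```

You would call `estimate_duration_seconds(audio_bytes)` on the byte array from the response. If the stream turns out to be variable bitrate, the estimate can be well off, and decoding the audio is the only reliable option.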
Upvotes: 2