Reputation: 1316
I am using IBM Watson's speech-to-text service to generate transcripts for a few telephony audio files (8 kHz). I have tried both WAV and Opus versions of the same files, and I haven't seen any major degradation in transcript quality with Opus. I am thinking of storing only the Opus versions to reduce storage requirements and file transfer time. In general, is it better to use WAV for higher-quality transcripts? Is there any known degradation in transcript quality when using Opus?
Upvotes: 1
Views: 2614
Reputation: 795
If the bitrate is high enough, Opus should not degrade recognition accuracy. Use the lowest bitrate that does not degrade accuracy; you can find it experimentally by encoding the same files at several bitrates and computing the Word Error Rate (WER) of each transcript against a reference transcript.
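To make the bitrate comparison concrete, here is a minimal sketch of a WER calculation in plain Python. It assumes you have a trusted reference transcript for each file and uses simple lowercase, whitespace-based tokenization; real evaluations usually also normalize punctuation and numbers. The function name and the sample transcripts are made up for illustration and are not output from Watson.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / words in reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Word-level Levenshtein distance, keeping only the previous row.
    # prev[j] is the edit distance between the ref words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (word missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra word in hypothesis)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts of one file encoded at two Opus bitrates (kbps).
    reference = "please hold while i transfer your call"
    transcripts_by_bitrate = {
        8: "please hold while i transfer you call",
        16: "please hold while i transfer your call",
    }
    for kbps in sorted(transcripts_by_bitrate):
        wer = word_error_rate(reference, transcripts_by_bitrate[kbps])
        print(f"{kbps} kbps: WER = {wer:.3f}")
```

Averaged over a representative set of calls, the lowest bitrate whose WER matches the WAV baseline is the one to keep.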
Alternatively you can use FLAC, which is lossless and typically offers a compression factor of 5X compared to uncompressed wav.
Finally, keep in mind that you do not want the sampling rate to be higher than 16 kHz, since that won't help recognition and will increase storage considerably.
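As a rough back-of-the-envelope sketch of why the sampling rate matters for storage, the snippet below estimates file sizes. It assumes 16-bit mono PCM for WAV, treats the compressed stream as constant bitrate, and ignores header and container overhead; the function names and the 16 kbps figure are illustrative assumptions, not recommendations.

```python
def pcm_bytes(sample_rate_hz: int, seconds: int,
              bits_per_sample: int = 16, channels: int = 1) -> int:
    """Approximate size of uncompressed PCM audio (WAV header ignored)."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


def encoded_bytes(bitrate_kbps: int, seconds: int) -> int:
    """Approximate size of a constant-bitrate stream (container overhead ignored)."""
    return bitrate_kbps * 1000 * seconds // 8


if __name__ == "__main__":
    one_hour = 3600
    for rate in (8000, 16000, 44100):
        size_mb = pcm_bytes(rate, one_hour) / 1e6
        print(f"WAV at {rate} Hz: about {size_mb:.1f} MB per hour")
    # 16 kbps is an assumed example bitrate for the compressed version.
    print(f"16 kbps encoding: about {encoded_bytes(16, one_hour) / 1e6:.1f} MB per hour")
```

Uncompressed size grows linearly with the sampling rate, so storing telephony audio at a higher rate than it was recorded at costs space without adding information.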
Upvotes: 5
Reputation: 535
Only you know the requirements (both present and future) for your use case, so it's hard to provide a straight answer. That being said, I've personally found opus quality to be pretty great.
Here are some links about the quality of the Opus codec that you might find interesting:
Upvotes: 1