Reputation: 24423

How to count the number of spoken syllables in an audio file?

I have many audio files with clean audio that contain only spoken Mandarin Chinese. I need to estimate how many syllables are spoken in each file. Is there a tool for OS X, Windows, or Linux that can estimate this? For example:

sample01.wav 15
sample02.wav 8
sample03.wav 5
sample04.wav 1
sample05.wav 18

As there are many files, command-line or batch-capable software is preferred, e.g.:

$ application sample01.wav
15

A solution that uses speech-to-text and then counts the number of characters in the transcript would be suitable too.
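The counting step of such a pipeline is simple. Here is a minimal sketch, assuming the transcript is already available as a string from some speech-to-text tool (not shown here). The function name `count_han_characters` is made up for illustration. It counts only characters in the basic CJK Unified Ideographs block, so punctuation, digits, and Latin letters are ignored. The result is only an approximation of the syllable count, since cases such as erhua (e.g. 花儿) write one syllable with two characters.

```python
def count_han_characters(transcript: str) -> int:
    """Count characters in the basic CJK Unified Ideographs block (U+4E00..U+9FFF).

    In Mandarin, one character usually corresponds to one syllable,
    so this serves as a rough syllable estimate for a transcript.
    """
    return sum(1 for ch in transcript if "\u4e00" <= ch <= "\u9fff")


if __name__ == "__main__":
    # Hypothetical transcript; punctuation is not counted.
    print(count_han_characters("你好,世界!"))
```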

Upvotes: 5

Answers (4)

marsei

Reputation: 7751

Automatic segmentation of speech is an active research area, which means that no existing method works perfectly.

In 2009, de Jong and Wempe proposed a method to automatically detect syllables in a human speech signal using Praat. This method compares well with manual segmentation and has been employed in many third-party scientific studies. You can find a detailed description of the method in their scientific article (pdf), along with a historical perspective on previously proposed methods. The Praat script itself and a couple of tutorials can be found on a dedicated website (www - speechrate).
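To give a feel for the general idea, here is a rough sketch in Python. It is not the Praat script and not the authors' implementation. As I understand it, the method looks for syllable nuclei as intensity peaks that are preceded by a dip and that fall in voiced speech. The sketch below keeps only the intensity part and skips the voicing check entirely. The function names, the 20 ms frame length, the 25 dB floor below the loudest frame, and the 2 dB minimum dip are all assumptions chosen for illustration, and `samples` is assumed to be a list of floats already decoded from the audio file.

```python
import math


def frame_intensity_db(samples, rate, frame_s=0.02):
    """RMS intensity in dB for consecutive, non-overlapping frames."""
    n = max(1, int(rate * frame_s))
    out = []
    for i in range(0, len(samples) - n + 1, n):
        frame = samples[i:i + n]
        rms = math.sqrt(sum(x * x for x in frame) / n)
        out.append(20 * math.log10(max(rms, 1e-10)))  # clamp to avoid log(0)
    return out


def count_syllable_nuclei(samples, rate, floor_db=25.0, min_dip_db=2.0):
    """Count intensity peaks that are loud enough and preceded by a dip.

    Simplified: no voicing check, so noisy recordings will be overcounted.
    """
    db = frame_intensity_db(samples, rate)
    if not db:
        return 0
    threshold = max(db) - floor_db  # ignore peaks far below the loudest frame
    count = 0
    valley = db[0]  # lowest intensity seen since the last accepted peak
    for i in range(1, len(db) - 1):
        valley = min(valley, db[i])
        is_peak = db[i] > db[i - 1] and db[i] >= db[i + 1]
        if is_peak and db[i] > threshold and db[i] - valley >= min_dip_db:
            count += 1
            valley = db[i]  # the next peak needs its own preceding dip
    return count
```

On real recordings the thresholds would need tuning, and the published Praat script is the better choice for anything beyond experimentation.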

You may also be interested in another segmentation algorithm, developed by Harma, that has been implemented in Matlab (Harma Syllable Segmentation).

Upvotes: 13

Aditya

Reputation: 2934

Your question really calls for a speech-to-text solution. I doubt that any free, open-source library that is easily available will serve this purpose.

I have used one, but for the reverse purpose (text to speech). It is not a free library, but if it helps, just Google "annosoft lipsync"...