Determine fundamental frequency of speaking voice

Question

I am trying to determine the perceived pitch of an audio sample (voice only, no background noise or music) so that I can classify the voice as bass, tenor, alto, mezzo-soprano, or soprano.

To do so, I use aubio, which returns a list of timecodes and the detected frequency at each timecode for a given audio file.

I am struggling to find the best way to use this data to determine the pitch. My initial idea is either simply not good or badly executed:

I take the list of frequencies returned by aubio and calculate the median like this:

exec('aubiopitch /pathtomp3file/audio.mp3',$output);

// iterate through the time/frequencies returned by aubio
// $output is a list of number pairs (one pair per line):
// The timecode followed by a whitespace followed by the frequency
// at that timecode in hertz.

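$freqs = [];

foreach ($output as $sample) {

    // split the line into timecode and frequency
    $parts = preg_split('/\s+/', trim($sample));
    if (count($parts) < 2) {
        continue; // skip empty or malformed lines
    }

    // add frequency (rounded down to whole hertz) to array
    $freqs[] = floor((float) $parts[1]);

}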

// to calculate median frequency: sort array with frequencies
// and fetch the element in the middle

sort($freqs);
$median = $freqs[intdiv(count($freqs), 2)];

I then take the median frequency found and map it to "bass", "baritone", "tenor", "alto" etc.

Unfortunately, the results are inconsistent. Too many times, the median frequency of a very deep voice, for example, comes out way too high.

I believe the way I try to determine the fundamental frequency is flawed, but I am struggling to come up with a better approach.

The following questions arise, for example:

Should I discard any frequencies above, for example, 400 Hz, as they probably come from sounds like "s"?
When humans perceive the pitch of a voice, what is it that we are actually listening for? The fundamental frequency? The energy of certain frequencies?
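
To make the first question concrete, here is a minimal sketch of the filtering I have in mind. It is an assumption, not something I know to be correct: `medianVoicedPitch` is a hypothetical helper of mine (not part of aubio), the 60 Hz lower bound is a guess, the 400 Hz upper bound is the cutoff from the question above, and I am assuming that frames without a detected pitch show up as 0 Hz. The sample lines are made up for illustration.

```php
<?php
// Hypothetical helper (not part of aubio): median of the frames that fall
// inside an assumed speaking-voice range. Both bounds are guesses.
function medianVoicedPitch(array $lines, float $minHz = 60.0, float $maxHz = 400.0): ?float
{
    $freqs = [];
    foreach ($lines as $line) {
        // each line is expected to look like "<timecode> <frequency>"
        $parts = preg_split('/\s+/', trim($line));
        if (count($parts) < 2) {
            continue; // skip empty or malformed lines
        }
        $freq = (float) $parts[1];
        // keep only frames inside the assumed range; this also drops 0 Hz frames
        if ($freq >= $minHz && $freq <= $maxHz) {
            $freqs[] = $freq;
        }
    }
    if ($freqs === []) {
        return null; // no frame survived the filter
    }
    sort($freqs);
    $n = count($freqs);
    $mid = intdiv($n, 2);
    // odd count: middle element; even count: mean of the two middle elements
    return $n % 2 === 1 ? $freqs[$mid] : ($freqs[$mid - 1] + $freqs[$mid]) / 2;
}

// Made-up lines in the "<timecode> <frequency>" format described above
$sample = [
    '0.000000 0.000000',
    '0.011610 98.500000',
    '0.023220 102.250000',
    '0.034830 3150.000000',
    '0.046440 110.000000',
];
echo medianVoicedPitch($sample), "\n"; // prints 102.25
```

With real data I would pass `$output` from `exec()` instead of `$sample`. Whether a hard cutoff like this is the right approach at all is exactly what I am unsure about.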

The overall question that sums it up would be:

"Using aubio's data, what is the correct programmatic approach to calculating the perceived pitch of a voice recording (talking, not singing)?"

EDIT – HOW I USE AUBIO

exec('aubiopitch /pathtomp3file/audio.mp3',$output);
