AtActionPark

Reputation: 127

JavaScript - text to speech with pitch and duration control

I've been looking at making my JavaScript program sing.

I first looked at the Web Speech API, but its pitch control seems very limited. I thought there might be a way to send the result to a Web Audio node and apply effects from there, but that doesn't seem possible.

I found the mespeak.js library: http://www.masswerk.at/mespeak/

It can return an audio buffer, which I use as the source for my audio nodes, allowing for more control.

My input is a sequence of notes, each with a frequency and a duration. Something like:

var seq = [[440, 1000], [880, 500], ...]; // [frequency in Hz, duration in ms]

From this sequence and a series of words, I managed to get my program to say those words in rhythm at different frequencies.
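As a minimal sketch of how such a sequence can be lined up with words, the snippet below turns each [frequency, duration] pair into an event with a start time. The helper name scheduleNotes and the event fields are hypothetical, not part of the original program, and the sketch assumes notes play back to back with no rests.

```javascript
// Hypothetical helper: maps [frequencyHz, durationMs] pairs to timed events.
// Words are reused cyclically if there are more notes than words.
function scheduleNotes(seq, words) {
  var t = 0; // running start time in milliseconds
  return seq.map(function (step, i) {
    var event = {
      word: words[i % words.length],
      freq: step[0],
      startMs: t,
      durationMs: step[1]
    };
    t += step[1]; // next note starts when this one ends
    return event;
  });
}

var events = scheduleNotes([[440, 1000], [880, 500]], ['hello', 'world']);
```

Each event could then be handed to a function like sing(word, freq, durationMs) at its startMs.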

But I'm having a few problems.

If any of you have experience with this sort of thing, I'd appreciate any input.

Thanks a lot

EDIT: add some code

function sing(text, note, duration) {
  // Ask meSpeak for the raw audio data instead of playing it directly
  var buffer = meSpeak.speak(text, {rawdata: 'default'});
  playSound(buffer, freqToCents(note), duration);
}

function freqToCents(freq) {
  var root = 440; // assumed reference; no idea what the base frequency of the speech generator is
  return 1200 * Math.log2(freq / root); // 1200 cents per octave
}

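function playSound(streamBuffer, cents, duration, callback) {
  var source = context.createBufferSource();
  source.connect(compressor);

  context.decodeAudioData(streamBuffer, function(audioData) {
    // Length of the decoded speech in seconds; the duration parameter is not used yet
    var bufferDuration = audioData.duration;
    var delay = (bufferDuration) ? Math.ceil(bufferDuration * 1000) : 1000;
    // sing() does not pass a callback, so only schedule one if it exists
    if (typeof callback === 'function') setTimeout(callback, delay);
    source.buffer = audioData;
    source.detune.value = cents;

    source.start(0);
  }, function(error) {
    console.error('decodeAudioData failed', error);
  });
}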

My sequencer is working and, at each step, calls the sing function if necessary, for example like this:

sing('test', 440, 1000)

As I was saying, I'd like the duration parameter to affect the result.
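A rough illustration of why duration is awkward to control at this stage: in the Web Audio API, an AudioBufferSourceNode combines playbackRate and detune into a single effective rate, so shifting pitch with detune also changes how long the buffer plays. The function names below are hypothetical and only do the arithmetic; they don't touch any audio node.

```javascript
// Effective rate used by an AudioBufferSourceNode:
//   computedRate = playbackRate * 2^(detune / 1200)
function effectiveRate(playbackRate, detuneCents) {
  return playbackRate * Math.pow(2, detuneCents / 1200);
}

// How long a buffer of the given length actually plays at that rate.
function playedDurationMs(bufferDurationMs, playbackRate, detuneCents) {
  return bufferDurationMs / effectiveRate(playbackRate, detuneCents);
}

// One octave up (1200 cents) halves the playing time of a 1000 ms buffer.
var octaveUpMs = playedDurationMs(1000, 1, 1200);
```

Because the two settings multiply together, stretching the sample with playbackRate to fit the requested duration would shift the pitch again, which is one reason to control pitch and speed in the synthesizer itself.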

Upvotes: 3

Views: 1311

Answers (1)

Nikolay Shmyrev

Reputation: 25220

Espeak supports an SSML mode; you should use it to modify these parameters instead of trying to post-process the results.

Experiment with espeak first, then try to reproduce the same results in the JavaScript port. SSML is not supported there yet, but this part of mespeak.js builds the espeak arguments:

  '-w', 'wav.wav',
      '-a', (typeof args.amplitude !== 'undefined')? String(args.amplitude) : (typeof args.a !== 'undefined')? String(args.a) : '1
      '-g', (typeof args.wordgap !== 'undefined')? String(args.wordgap) : (typeof args.g !== 'undefined')? String(args.g) : '0',
      '-p', (typeof args.pitch !== 'undefined')? String(args.pitch) : (typeof args.p !== 'undefined')? String(args.p) : '50',
      '-s', (typeof args.speed !== 'undefined')? String(args.speed) : (typeof args.s !== 'undefined')? String(args.s) : '175',

You need to add the -m option there to enable SSML.
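As a sketch of what the SSML input might look like once -m is enabled, the hypothetical helper below wraps a word in a prosody element. The names toSsml and escapeXml are made up for illustration, and which prosody attributes and units espeak actually honours (Hz pitch, percentage rate) is an assumption to check against espeak on the command line first.

```javascript
// Escape characters that would otherwise break the XML markup.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: builds an SSML string with pitch and rate hints.
function toSsml(text, pitchHz, ratePercent) {
  return '<speak><prosody pitch="' + pitchHz + 'Hz" rate="' + ratePercent + '%">' +
    escapeXml(text) +
    '</prosody></speak>';
}

var ssml = toSsml('test', 440, 100);
```

The resulting string would be passed as the text to speak in place of the plain word, with a patched mespeak.js that forwards -m to espeak.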

Upvotes: 1
