Android basic TTS engine from audio file

Question

I have an MP3 file, hello.mp3. I wrap it in a FileInputStream, convert the input stream to a byte array, and push the bytes to SynthesisCallback.audioAvailable(bytes, offset, length), but this produces only noise. The file plays fine if I load it in my Android music player.

Why does this not work when I push the bytes from the file to SynthesisCallback? My code is below.

This is where I generate the audio stream from the MP3 file:

class AudioStream {
    InputStream stream;
    int length;
}

private AudioStream getAudioStream(String text) throws IOException {
    // TODO parse text, and generate audio file.
    File hello = new File(Environment.getExternalStorageDirectory(), "hello.mp3");
    AudioStream astream = new AudioStream();
    // File.length() returns a long; the cast assumes the file is smaller than 2 GB.
    astream.length = (int) hello.length();
    astream.stream = new FileInputStream(hello);
    return astream;
}

This is my InputStream-to-byte[] method:

  public byte[] inputStreamToByteArray(AudioStream inStream) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] buffer = new byte[inStream.length];
    int bytesRead;
    while ((bytesRead = inStream.stream.read(buffer)) > 0) {
        baos.write(buffer, 0, bytesRead);
    }
    return baos.toByteArray();
}

This is the onSynthesizeText method in my TextToSpeechService class:

 @Override
protected synchronized void onSynthesizeText(SynthesisRequest request,
        SynthesisCallback callback) {
    // TODO load language and other checks.

    // At this point, we have loaded the language 

    callback.start(16000,
            AudioFormat.ENCODING_PCM_16BIT, 1 /* Number of channels. */);

    final String text = request.getText().toLowerCase();
    try {
        Log.i(TAG, "Getting audio stream for text "+text);
        AudioStream aStream = getAudioStream(text);

         byte[] bytes = inputStreamToByteArray(aStream);
         final int maxBufferSize = callback.getMaxBufferSize();
         int offset = 0;
         while (offset < aStream.length) {
             int bytesToWrite = Math.min(maxBufferSize, aStream.length - offset);
             callback.audioAvailable(bytes, offset, bytesToWrite);
             offset += bytesToWrite;
         }

    } catch (Exception e) {
        e.printStackTrace();
        callback.error();
    }



    // Alright, we're done with our synthesis - yay!
    callback.done();
}

This is how I am testing my synthesis engine in the making:

//initialize text speech
    textToSpeech = new TextToSpeech(this, new OnInitListener() {

        /**
         * a callback to be invoked indicating the completion of the TextToSpeech
         * engine initialization.
         */
        @Override
        public void onInit(int status) {
            if (status == TextToSpeech.SUCCESS) {
                int result = textToSpeech.setLanguage(Locale.US);
                if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED) {
                    Log.e("error", "Language is not supported");
                } else {
                    convertToSpeech("Hello");
                }
            } else {
                Log.e("error", "Failed to initialize!");
            }
        }


        /**
         * Speaks the string using the specified queuing strategy and speech parameters.
         */
        private void convertToSpeech(String text) {
            if (null == text || "".equals(text)) {
                return;
            }
            textToSpeech.speak(text, TextToSpeech.QUEUE_FLUSH, null);
        }
    });

sinisha · Accepted Answer

The method audioAvailable(byte[] buffer, int offset, int length) expects raw PCM samples as input, in the format declared in callback.start() (here 16 kHz, 16-bit, mono). MP3 is a compressed format, so bytes read straight from an .mp3 file are not PCM samples and play back as noise. Use a .wav file instead, or first convert the .mp3 file to .wav, and pass its audio data to audioAvailable.
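
Note that a .wav file also starts with a RIFF header, so only the contents of its "data" chunk are PCM samples. Below is a minimal sketch of how that could be handled in plain Java, assuming an uncompressed PCM .wav file that already matches the format passed to callback.start(). The class WavPcm, the interface PcmSink, and the methods findDataChunk and pushPcm are hypothetical names made up for this illustration; PcmSink only stands in for the two SynthesisCallback methods used in the question, so that the code runs without Android.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class WavPcm {
    /** Hypothetical stand-in for SynthesisCallback (same two method shapes). */
    public interface PcmSink {
        int getMaxBufferSize();
        int audioAvailable(byte[] buffer, int offset, int length);
    }

    private static String tag(byte[] b, int off) {
        return new String(b, off, 4, StandardCharsets.US_ASCII);
    }

    /** Returns {offset, length} of the PCM payload inside a RIFF/WAVE byte array. */
    public static int[] findDataChunk(byte[] wav) {
        if (wav.length < 12 || !tag(wav, 0).equals("RIFF") || !tag(wav, 8).equals("WAVE")) {
            throw new IllegalArgumentException("not a RIFF/WAVE file");
        }
        ByteBuffer buf = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 12; // first chunk starts after "RIFF", the size field, and "WAVE"
        while (pos + 8 <= wav.length) {
            String id = tag(wav, pos);
            int size = buf.getInt(pos + 4);
            if (size < 0) {
                throw new IllegalArgumentException("chunk too large");
            }
            if (id.equals("data")) {
                // Clamp in case the header claims more bytes than the file holds.
                return new int[] {pos + 8, Math.min(size, wav.length - (pos + 8))};
            }
            // Skip this chunk; chunks are padded to an even number of bytes.
            long next = (long) pos + 8 + size + (size & 1);
            if (next > wav.length) {
                break;
            }
            pos = (int) next;
        }
        throw new IllegalArgumentException("no data chunk found");
    }

    /** Pushes only the PCM payload to the sink, in slices of at most getMaxBufferSize(). */
    public static void pushPcm(byte[] wav, PcmSink sink) {
        int[] chunk = findDataChunk(wav);
        int offset = chunk[0];
        int end = chunk[0] + chunk[1];
        int max = sink.getMaxBufferSize();
        while (offset < end) {
            int n = Math.min(max, end - offset);
            sink.audioAvailable(wav, offset, n);
            offset += n;
        }
    }
}
```

In onSynthesizeText, the sink would simply delegate to callback.getMaxBufferSize() and callback.audioAvailable(buffer, offset, length). The sketch does not check the sample rate, bit depth, or channel count in the "fmt " chunk; if they differ from the values given to callback.start(), the audio will still sound wrong.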
