Reputation: 174
Hi, I have a front-end application in React. I am using ElevenLabs to convert text into audio. I use the ElevenLabs streaming API so that I don't have to wait for the whole audio file and can start playing it as soon as data arrives. I call the API with fetch and process the incoming audio data with the AudioContext class.
My code works, but three problems occur at random:
Sometimes it just doesn't work.
Sometimes it works, but the audio stops partway through and throws the following error: "Uncaught (in promise) DOMException: Failed to execute 'decodeAudioData' on 'BaseAudioContext': Unable to decode audio data"
Sometimes it works, but the audio is not smooth: there are cuts and pauses while streaming.
Here is my code:
import {useEffect} from "react";
function App() {
    function getStreamAudio() {
        let textResponse='Hey, have you been keeping up with the latest in the crypto world? Its been incredible to see how much its grown over the past few years. I get where you are coming from, but I truly believe this is the future of finance. The decentralized nature of cryptocurrencies means no more relying on traditional banks and intermediaries. Its all about financial empowerment for the masses. You have got a point there, but remember, every technology has its challenges in the beginning. The scams and volatility will decrease as the industry matures. Plus, the potential for blockchain technology beyond just currency is immense – supply chain management, healthcare, even voting systems could benefit.'
        const streamingURL = "https://api.elevenlabs.io/v1/text-to-speech/MY-AUDIO-ID/stream?optimize_streaming_latency=3";
        const req = new Request(streamingURL, {
            method: 'POST',
            body: JSON.stringify({text: textResponse}),
            headers: {
                'Content-Type': 'application/json',
                'xi-api-key': 'MY-API-KEY',
            },
        });
        const audioContext = new AudioContext()
        fetch(req).then((resp: any) => {
            const audioContext = new AudioContext()
            let startTime = 0
            const reader = resp.body.getReader()
            const read = async () => {
                await reader.read().then(({done, value}: { done: any, value: any }) => {
                    if (done) {
                        console.log("THE STREAM HAS ENDED")
                        return
                    }
                    audioContext.decodeAudioData(value.buffer, (audioBuffer) => {
                        const source = audioContext.createBufferSource();
                        source.buffer = audioBuffer;
                        source.connect(audioContext.destination);
                        // Wait for the current audio part to finish playing
                        source.onended = () => {
                            console.log("Ended playing: " + Date.now())
                            read()
                        };
                        if (startTime == 0) {
                            startTime = audioContext.currentTime + 0.1 // adding 100 ms of latency to work well across all systems
                        }
                        source.start(audioContext.currentTime)
                        startTime = startTime + source.buffer.duration
                    });
                })
            }
            read()
        })
    }
    const playAudio = async () => {
        getStreamAudio()
    };
    return (
        <div>
            <h1>Streaming Test</h1>
            <button onClick={playAudio}>Click here to Play</button>
        </div>
    );
}
export default App;
Any help on how I can improve my code would be appreciated. Thank you.
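Two details in the code above may be related to the second and third problems. decodeAudioData expects complete audio file data, so a network chunk that ends in the middle of a compressed audio frame can fail to decode. Also, startTime is computed but never passed to source.start(), and the next chunk is only read after the previous one has finished playing, which leaves a gap for every read and decode. The following is a minimal sketch under those assumptions, not a definitive fix. The names concatChunks and PlaybackScheduler are hypothetical helpers and are not part of the ElevenLabs or Web Audio APIs.

```typescript
/** Join several network chunks into one contiguous byte array before decoding. */
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

/** Tracks where the next decoded buffer should start so buffers play back to back. */
class PlaybackScheduler {
  private nextStart = 0;
  private leadTime: number;

  // leadTime: safety margin in seconds added when playback has fallen behind.
  constructor(leadTime: number = 0.1) {
    this.leadTime = leadTime;
  }

  /**
   * now: the current clock value, e.g. audioContext.currentTime (seconds).
   * duration: length of the decoded buffer in seconds.
   * Returns the time to pass to source.start().
   */
  schedule(now: number, duration: number): number {
    // Start right after the previous buffer, unless that moment has already passed.
    const startAt = Math.max(this.nextStart, now + this.leadTime);
    this.nextStart = startAt + duration;
    return startAt;
  }
}
```

In the component, this would roughly mean collecting a few chunks, decoding concatChunks(pending).buffer, calling source.start(scheduler.schedule(audioContext.currentTime, audioBuffer.duration)), and continuing to read without waiting for onended. Buffering only makes a split frame less likely; where the browser supports the stream's format, a MediaSource with a SourceBuffer may be a more robust way to play a compressed stream.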
Upvotes: 6
Views: 5099
Reputation: 21
You could try the elevenlabs-js npm package; it may be easier to work with.
You can use it like this:
const elevenLabs = require('elevenlabs-js');
// Set your API key
elevenLabs.setApiKey('YOUR_API_KEY');
elevenLabs.textToSpeech("YOUR_VOICE_ID", "Hello World!", "elevenlabs_multilingual_v2", {
    stability: 0.95,
    similarity_boost: 0.75,
    style: 0.06,
    use_speaker_boost: true
}).then(async (res) => {
    const pipe = await res.pipe;
    console.log("pipe", pipe);
    // use the pipe however you need
});
If you need more help, I'm happy to assist.
Upvotes: 2