Reputation: 151
I am trying to implement speech-to-text recognition in my React website, and I am using the react-speech-recognition package from npm. I am using the exact code specified in the package description here: npm
It works fine with everyday speech, anything I say, but when I introduce technical jargon, it goes way off!
Here's what I am trying to say to it, it's aviation jargon:
Cleared to enter the CTR, not above 1500 feet, join and report on a right downwind runway 19, QNH 1018, squawk 2732
This is what I get in response:
please to enter the city are not above 15 feet heart penetrate join and report on a ride on the wind blown away 9 theme
Upvotes: 2
Views: 1337
Reputation: 323
Here are some tricks that can help:
For Accurate and Continuous Results:
const {
  transcript,
  finalTranscript,
  listening,
  resetTranscript,
  browserSupportsSpeechRecognition
} = useSpeechRecognition({
  lang: "en-IN",        // Set the language to Indian English
  interimResults: true, // Get partial results
  continuous: true,     // Enable continuous recognition
  maxAlternatives: 5,   // Set the number of alternative transcriptions
});
SpeechRecognition.startListening({ continuous: true, language: 'en-IN' }); // Set the language to Indian English
The react-speech-recognition library allows you to specify the language and dialect for the speech recognition engine. It's essential to select the appropriate language and dialect that matches the user's speech patterns. For example, if your application is targeting users with Indian English accents, you should set the language to 'en-IN' (English - India) to optimize the recognition accuracy.
For Lengthy Conversations:
const {
  transcript,
  finalTranscript,
  listening,
  resetTranscript,
  browserSupportsSpeechRecognition
} = useSpeechRecognition({
  lang: "en-IN",                          // Set the language to Indian English
  interimResults: true,                   // Get partial results
  continuous: true,                       // Enable continuous recognition
  maxAlternatives: 5,                     // Set the number of alternative transcriptions
  abortController: new AbortController(), // Create a new AbortController instance
});

const [abortController, setAbortController] = useState(new AbortController());

const stopRecording = () => {
  setRecordingStatus("inactive");
  abortController.abort(); // Abort the speech recognition
  SpeechRecognition.stopListening();
  // rest of the logic
};

const handleStopRecording = () => {
  stopRecording();
  setAbortController(new AbortController()); // Create a new AbortController instance
};
For Lengthy and Speedy Conversations:
Memoize the startRecording and stopRecording functions with the useCallback hook, and move the creation of the AbortController instance into the handleStartRecording and handleStopRecording functions. This ensures that a new AbortController is created every time recording starts or stops (a fuller sketch follows the snippets below).
const stopRecording = useCallback(() => { ... }, [abortController]);
const startRecording = useCallback(() => { ... }, [abortController]);

const handleStartRecording = () => {
  startRecording();
  setAbortController(new AbortController()); // Create a new AbortController instance
};

const handleStopRecording = () => {
  stopRecording();
  setAbortController(new AbortController()); // Create a new AbortController instance
};
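Putting it together, here is one plausible shape for the whole pattern. It is only a sketch: the useCallback bodies and the surrounding component are my assumptions about the elided code, and setRecordingStatus simply tracks a local "recording"/"inactive" status.

import { useState, useCallback } from "react";
import SpeechRecognition from "react-speech-recognition";

function Recorder() {
  const [recordingStatus, setRecordingStatus] = useState("inactive");
  const [abortController, setAbortController] = useState(new AbortController());

  // Memoized so the functions are not recreated on every render
  const startRecording = useCallback(() => {
    setRecordingStatus("recording");
    SpeechRecognition.startListening({ continuous: true, language: "en-IN" });
  }, [abortController]);

  const stopRecording = useCallback(() => {
    setRecordingStatus("inactive");
    abortController.abort(); // Abort work tied to the current session
    SpeechRecognition.stopListening();
  }, [abortController]);

  const handleStartRecording = () => {
    startRecording();
    setAbortController(new AbortController()); // Fresh controller for the next session
  };

  const handleStopRecording = () => {
    stopRecording();
    setAbortController(new AbortController());
  };

  return (
    <div>
      <button onClick={handleStartRecording}>Start</button>
      <button onClick={handleStopRecording}>Stop</button>
      <p>{recordingStatus}</p>
    </div>
  );
}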
Upvotes: 0
Reputation: 156624
That package leverages the SpeechRecognition interface of your browser's Web Speech API. The React library's API allows you to get the underlying SpeechRecognition object via a call to the getRecognition() method.
The underlying SpeechRecognition object's API allows for the addition of Grammars using the JSpeech Grammar Format. Here's an example. So in theory, you could provide more information about the words you're expecting to hear in your app, and thereby improve performance.
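A rough sketch of what that could look like, assuming your browser exposes SpeechGrammarList at all (in Chrome it is prefixed as webkitSpeechGrammarList, and some engines ignore grammars entirely); the aviation terms are just illustrative:

import SpeechRecognition from "react-speech-recognition";

const SpeechGrammarList =
  window.SpeechGrammarList || window.webkitSpeechGrammarList;

if (SpeechGrammarList) {
  // JSGF grammar listing the jargon you expect to hear
  const grammar =
    "#JSGF V1.0; grammar aviation; public <term> = CTR | QNH | squawk | downwind | runway ;";
  const grammarList = new SpeechGrammarList();
  grammarList.addFromString(grammar, 1); // second argument is the grammar's weight

  // getRecognition() returns the underlying browser SpeechRecognition object
  const recognition = SpeechRecognition.getRecognition();
  if (recognition) {
    recognition.grammars = grammarList;
  }
}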
But there are caveats, including:
You may be able to get better accuracy from cloud-based speech services. Azure Cognitive Services, for example, allows you to create custom voice models, custom grammars, etc. Of course, they also charge you based on usage, and they charge more if you're using customizations.
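For instance, Azure's JavaScript Speech SDK lets you bias recognition toward expected jargon with a phrase list, without training a custom model. This is only a hedged sketch; the key, region, locale, and phrases are placeholders, so check the SDK docs for the current API:

import * as sdk from "microsoft-cognitiveservices-speech-sdk";

// Placeholder credentials -- substitute your own subscription key and region
const speechConfig = sdk.SpeechConfig.fromSubscription("YOUR_KEY", "YOUR_REGION");
speechConfig.speechRecognitionLanguage = "en-GB"; // placeholder locale

const audioConfig = sdk.AudioConfig.fromDefaultMicrophoneInput();
const recognizer = new sdk.SpeechRecognizer(speechConfig, audioConfig);

// Phrase lists nudge the recognizer toward domain vocabulary
const phraseList = sdk.PhraseListGrammar.fromRecognizer(recognizer);
["CTR", "QNH", "squawk", "downwind"].forEach((p) => phraseList.addPhrase(p));

recognizer.recognizeOnceAsync((result) => {
  console.log(result.text);
  recognizer.close();
});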
Upvotes: 3