docksdocks

Reputation: 118

Use @aws-sdk/client-transcribe-streaming with react-native and expo-av to transcribe audio

I am currently using Expo with React Native and I am running into difficulties implementing @aws-sdk/client-transcribe-streaming.

I saw an implementation using 'react-native-live-audio-stream', but since I am on 'expo' I thought 'expo-av' would be a better approach, because 'react-native-sound' causes some issues when working with 'expo'.

To start, I implemented the '@aws-sdk/client-transcribe-streaming' client in my 'client/transcribe-client.ts':

import { TranscribeStreamingClient } from '@aws-sdk/client-transcribe-streaming';
import { REGION, credentials } from '../handler';

export const transcribeStreamingClient = new TranscribeStreamingClient({
    credentials,
    region: REGION
});
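
For reference, '../handler' just exports the region and the credentials object; mine looks roughly like this (the values here are placeholders, the real ones come from my environment config):

// handler.ts (sketch, values redacted)
export const REGION = 'us-east-1';

export const credentials = {
    accessKeyId: process.env.EXPO_PUBLIC_AWS_ACCESS_KEY_ID ?? '',
    secretAccessKey: process.env.EXPO_PUBLIC_AWS_SECRET_ACCESS_KEY ?? ''
};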


Then, this is the 'react-native-live-audio-stream' example I based my 'transcribe.ts' on:

import {
    StartMedicalStreamTranscriptionCommand,
    TranscribeStreamingClient
} from '@aws-sdk/client-transcribe-streaming';
import { Buffer } from 'buffer';
import { NativeEventEmitter, PermissionsAndroid } from 'react-native';
import 'react-native-get-random-values';
import LiveAudioStream from 'react-native-live-audio-stream';
import 'react-native-url-polyfill/auto';
import { transcribeStreamingClient } from './client/transcribe-client';

class TranscribeController {
    private isStarted: boolean;

    private emitter: NativeEventEmitter;

    private audioPayloadStream: Buffer[];

    private audioStream: typeof LiveAudioStream;

    private transcribeStreamingClient: TranscribeStreamingClient;

    constructor() {
        this.isStarted = false;
        this.emitter = new NativeEventEmitter();
        this.audioPayloadStream = [];
        this.audioStream = LiveAudioStream;
        this.transcribeStreamingClient = transcribeStreamingClient;
    }

    async init() {
        await PermissionsAndroid.requestMultiple([
            PermissionsAndroid.PERMISSIONS.RECORD_AUDIO
        ]);
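        // 16-bit mono PCM at 44.1 kHz; audioSource 6 is Android's VOICE_RECOGNITION input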
        this.audioStream.init({
            sampleRate: 44100,
            channels: 1,
            audioSource: 6,
            bitsPerSample: 16,
            bufferSize: 2048,
            wavFile: 'test.wav'
        });
        this.audioStream.start();
        this.audioStream.on('data', (data) => {
            this.audioPayloadStream.push(Buffer.from(data, 'base64'));
            if (!this.isStarted && this.audioPayloadStream.length !== 0) {
                this.start();
            }
        });
    }

    async start() {
        try {
            this.isStarted = true;
            const command = new StartMedicalStreamTranscriptionCommand({
                LanguageCode: 'en-US',
                MediaEncoding: 'pcm',
                Specialty: 'PRIMARYCARE',
                Type: 'DICTATION',
                MediaSampleRateHertz: 44100,
                AudioStream: this.audioGenerator()
            });
            const data = await this.transcribeStreamingClient.send(command);

            for await (const event of data.TranscriptResultStream!) {
                const results = event.TranscriptEvent?.Transcript?.Results;
                if (results && results.length > 0) {
                    const [result] = results;
                    const final = !result.IsPartial;
                    const alternatives = result.Alternatives;
                    if (alternatives && alternatives.length > 0) {
                        const [alternative] = alternatives;
                        const text = alternative.Transcript;
                        this.emitter.emit('recognized', { text, final });
                    }
                }
            }
        } catch (e) {
            console.log(e);
            if (this.isStarted) this.emitter.emit('error', true);
            this.stop();
        }
    }

    stop() {
        this.audioPayloadStream = [];
        this.audioStream.stop();
        this.transcribeStreamingClient.destroy();
        this.isStarted = false;
    }

    async *audioGenerator() {
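        // Caveat: for await...of over a plain array completes as soon as the
        // iterator reaches the current end of the array, so chunks pushed
        // after that point are never yielded.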
        for await (const chunk of this.audioPayloadStream) {
            yield { AudioEvent: { AudioChunk: chunk } };
        }
    }
}

export const transcribeController = new TranscribeController();
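

Because of that caveat, I also drafted a small pull-based queue so the generator could wait for chunks that arrive later (just a sketch of my own, not wired in yet):

import { Buffer } from 'buffer';

// AudioQueue (sketch): lets the consumer await future chunks instead of
// completing when the buffered array runs out.
export class AudioQueue {
    private chunks: Buffer[] = [];
    private waiters: ((chunk: Buffer | null) => void)[] = [];
    private closed = false;

    push(chunk: Buffer) {
        const waiter = this.waiters.shift();
        if (waiter) waiter(chunk);
        else this.chunks.push(chunk);
    }

    close() {
        this.closed = true;
        this.waiters.forEach((waiter) => waiter(null));
        this.waiters = [];
    }

    // Yields AudioEvent objects until close() is called.
    async *[Symbol.asyncIterator]() {
        while (true) {
            const chunk =
                this.chunks.shift() ??
                (await new Promise<Buffer | null>((resolve) => {
                    if (this.closed) resolve(null);
                    else this.waiters.push(resolve);
                }));
            if (chunk === null) return;
            yield { AudioEvent: { AudioChunk: chunk } };
        }
    }
}

The idea was to push Buffer.from(data, 'base64') into the queue from the 'data' listener and pass the queue itself as AudioStream, since the command's AudioStream input is typed as an AsyncIterable.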


And I tried to use it in my component like this:

import { Button, View } from 'react-native';
import { styles } from './index';
import { Audio } from 'expo-av';
import { transcribeController } from '@/lib/aws/transcribe';
import React, { useEffect } from 'react';

export default function TranscribeComponent() {
    useEffect(() => {
        Audio.requestPermissionsAsync();
    }, []);

    const startTranscription = async () => {
        try {
            await transcribeController.init();
        } catch (error) {
            console.log('Error initializing transcription:', error);
        }
    };

    return (
        <View style={styles.container}>
            <Button title="Transcribe Sound" onPress={startTranscription} />
        </View>
    );
}
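

For completeness, this is roughly how I planned to consume the 'recognized' events in the UI (a sketch; it assumes the controller's emitter is exposed publicly, which it currently isn't):

import { useEffect, useState } from 'react';
import { transcribeController } from '@/lib/aws/transcribe';

export function useTranscript() {
    const [transcript, setTranscript] = useState('');

    useEffect(() => {
        // Assumes 'emitter' is made public on TranscribeController
        const subscription = transcribeController.emitter.addListener(
            'recognized',
            ({ text, final }: { text: string; final: boolean }) => {
                if (final) setTranscript((previous) => `${previous} ${text}`.trim());
            }
        );
        return () => subscription.remove();
    }, []);

    return transcript;
}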


So I just want to know how I could implement this so that it transcribes audio from a file or from the microphone; the current code is not working.

This is the error I get when pressing the button:

 LOG  [TypeError: Object is not async iterable]
 LOG  [TypeError: Object is not async iterable]
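
From what I have read, this error can appear on React Native when the JS engine does not define Symbol.asyncIterator, which breaks async generators and for await...of at runtime. I was going to try adding this polyfill at the very top of my entry file (just a guess on my side):

// Must run before anything uses async iteration
if (typeof Symbol.asyncIterator === 'undefined') {
    (Symbol as any).asyncIterator = Symbol.for('Symbol.asyncIterator');
}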

Dependencies for reference:

    yarn add @aws-sdk/client-transcribe-streaming aws-sdk react-native-get-random-values buffer react-native-url-polyfill react-native-live-audio-stream expo expo-av

Upvotes: 1

Views: 962

Answers (0)
