Victor Yan

Reputation: 199

Audio transcription with Expo + Google Speech-to-Text

I'm trying to record audio in Expo and transcribe it with Google's Speech-to-Text service.

It already works on iOS, but not on Android. I suspect the problem is in the Android recording options.

The Google server doesn't return an error response, only an empty object.

Here is the code:

const recordingOptions = {
  // android not currently in use, but parameters are required
  android: {
    extension: ".m4a",
    outputFormat: Audio.RECORDING_OPTION_ANDROID_OUTPUT_FORMAT_MPEG_4,
    audioEncoder: Audio.RECORDING_OPTION_ANDROID_AUDIO_ENCODER_AAC,
    sampleRate: 44100,
    numberOfChannels: 1,
    bitRate: 128000,
  },
  ios: {
    extension: ".wav",
    audioQuality: Audio.RECORDING_OPTION_IOS_AUDIO_QUALITY_HIGH,
    sampleRate: 44100,
    numberOfChannels: 1,
    bitRate: 128000,
    linearPCMBitDepth: 16,
    linearPCMIsBigEndian: false,
    linearPCMIsFloat: false,
  },
};

const [recording, setRecording] = useState<Audio.Recording | null>(null);

const startRecording = async () => {
    const { status } = await Permissions.askAsync(Permissions.AUDIO_RECORDING);
    if (status !== "granted") return;

    // some of these are not applicable, but are required
    await Audio.setAudioModeAsync({
      allowsRecordingIOS: true,
      interruptionModeIOS: Audio.INTERRUPTION_MODE_IOS_DO_NOT_MIX,
      playsInSilentModeIOS: true,
      shouldDuckAndroid: true,
      interruptionModeAndroid: Audio.INTERRUPTION_MODE_ANDROID_DO_NOT_MIX,
      playThroughEarpieceAndroid: true,
    });
    const newRecording = new Audio.Recording();
    try {
      await newRecording.prepareToRecordAsync(recordingOptions);
      await newRecording.startAsync();
    } catch (error) {
      console.log(error);
      stopRecording();
    }
    setRecording(newRecording);
  };

  const stopRecording = async () => {
    try {
      await recording!.stopAndUnloadAsync();
    } catch (error) {
      // Do nothing -- we are already unloaded.
    }
  };

const getAudioTranscription = async () => {
    try {
      const info = await FileSystem.getInfoAsync(recording!.getURI()!);
      console.log(`FILE INFO: ${JSON.stringify(info)}`);
      const uri = info.uri;

      await toDataUrl(uri, async function (base64content: string) {
        if (Platform.OS == "ios")
          base64content = base64content.replace("data:audio/vnd.wave;base64,", "");
        else
          base64content = base64content.replace("data:audio/aac;base64,", "");

        console.log(recording?._options?.android)
        
        const body = {
          audio: {
            content: base64content,
          },
          config: {
            enableAutomaticPunctuation: true,
            encoding: "LINEAR16",
            languageCode: "pt-BR",
            model: "default",
            sampleRateHertz: 44100,
          },
        };

        const transcriptResponse = await fetch(
          "https://speech.googleapis.com/v1p1beta1/speech:recognize?key=MY_KEY",
          { method: "POST", body: JSON.stringify(body) }
        );
        const data = await transcriptResponse.json();

        const userMessage = data.results && data.results[0].alternatives[0].transcript || "";
      });
    } catch (error) {
      console.log("There was an error", error);
    }
    stopRecording();
  };

Upvotes: 0

Views: 1303

Answers (2)

Sabaa Abdennabi

Reputation: 1

I faced the same issue and found a solution. I am using Expo with the Google Cloud Speech-to-Text service, and these are the Android recordingOptions I found to be compatible:

    // Android entry of recordingOptions, used when creating the recording
    android: {
      extension: ".webm",
      outputFormat: Audio.RECORDING_OPTION_ANDROID_OUTPUT_FORMAT_WEBM,
      audioEncoder: Audio.RECORDING_OPTION_ANDROID_AUDIO_ENCODER_DEFAULT,
      sampleRate: 16000,
      numberOfChannels: 1,
      bitRate: 64000,
    },
    // Matching config to send in the request body when calling the API
    config: {
      encoding: "WEBM_OPUS",
      sampleRateHertz: 16000,
      languageCode: "fr-FR",
    },

This worked fine for me. For iOS, the options you provided work well. Hope it works for you as well!
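
To show how the two fragments above could fit together, here is a minimal sketch that builds the request body per platform. The helper names `stripDataUrlPrefix` and `buildRecognizeBody` are hypothetical (they are not part of Expo or any Google library). The Android values are the ones from this answer, and the iOS values are the ones from the question. The point is only that `encoding` and `sampleRateHertz` in the request should describe the file that was actually recorded on that platform.

```typescript
// Hypothetical helpers for illustration; not part of Expo or Google's libraries.
type RecordingPlatform = "ios" | "android";

interface RecognitionConfig {
  encoding: string;
  sampleRateHertz: number;
  languageCode: string;
}

interface RecognizeBody {
  audio: { content: string };
  config: RecognitionConfig;
}

// Removes a leading "data:<mime>;base64," prefix regardless of the MIME type,
// so the result does not depend on which MIME type the platform reports.
function stripDataUrlPrefix(dataUrl: string): string {
  return dataUrl.replace(/^data:[^,]*;base64,/, "");
}

// Picks the request config that matches the recording options of the platform.
// Android: WebM recording at 16 kHz (values from this answer).
// iOS: 16-bit linear PCM .wav at 44.1 kHz (values from the question).
function buildRecognizeBody(
  platform: RecordingPlatform,
  dataUrl: string,
  languageCode: string
): RecognizeBody {
  const config: RecognitionConfig =
    platform === "android"
      ? { encoding: "WEBM_OPUS", sampleRateHertz: 16000, languageCode }
      : { encoding: "LINEAR16", sampleRateHertz: 44100, languageCode };

  return { audio: { content: stripDataUrlPrefix(dataUrl) }, config };
}
```

In the question's code this would replace the two platform-specific `replace` calls and the hard-coded `config`, for example `JSON.stringify(buildRecognizeBody(Platform.OS === "ios" ? "ios" : "android", base64content, "pt-BR"))`.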

Upvotes: 0

Fitter Man

Reputation: 682

This combination definitely works, although I had a lot of other issues getting Expo to behave prior to reaching this conclusion.

      extension: '.amr',
      outputFormat: Audio.RECORDING_OPTION_ANDROID_OUTPUT_FORMAT_AMR_WB,
      audioEncoder: Audio.RECORDING_OPTION_ANDROID_AUDIO_ENCODER_AMR_WB,
      sampleRate: 16000,
      numberOfChannels: 1,
      bitRate: 128000,
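
The recording options above only cover the Expo side. A sketch of a matching request config follows, assuming the request should declare the same codec and sample rate as the recording; this config was not shown in the answer itself. `amrWbRequestConfig` is a hypothetical helper name, and the language code is just a parameter. `AMR_WB` is one of the encodings accepted by the Speech-to-Text API, and it expects a 16000 Hz sample rate.

```typescript
// Hypothetical helper; an assumption about the request side, not shown in the answer.
interface AmrWbRequestConfig {
  encoding: "AMR_WB";
  sampleRateHertz: 16000;
  languageCode: string;
}

// Builds a recognition config that matches an AMR-WB recording made at 16 kHz.
function amrWbRequestConfig(languageCode: string): AmrWbRequestConfig {
  return {
    encoding: "AMR_WB",
    sampleRateHertz: 16000,
    languageCode,
  };
}
```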

Upvotes: 1
