Noam

Reputation: 5266

Combining WasapiLoopbackCapture with google Stream Recognition

I'm trying to write an app that will listen to my computer's audio and transcribe it using Google Speech Recognition.

I've been able to record the system sound using WasapiLoopbackCapture, and I've been able to use the Google streaming recognition API with test files, but I was not able to merge the two together.

When I stream the audio from WasapiLoopbackCapture to Google, it doesn't return any results.

I've based my code on the google code sample at: https://github.com/GoogleCloudPlatform/dotnet-docs-samples/blob/9588cee6d96bfe484c8e189e9ac2f6eaa3c3b002/speech/api/Recognize/InfiniteStreaming.cs#L225
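For reference, the streaming side is wired up roughly like in the linked sample. A minimal sketch (the Google.Cloud.Speech.V1 package and the LINEAR16 / 16 kHz / mono / en-US config values are assumptions taken from that sample, not something I've verified independently):

```csharp
// Sketch of the Google streaming setup, adapted from the linked
// InfiniteStreaming sample. LINEAR16 / 16000 Hz / mono are assumptions:
// the bytes pushed into _microphoneBuffer must actually be in this format,
// otherwise the service silently returns nothing.
var client = SpeechClient.Create();
var stream = client.StreamingRecognize();

// The first request carries only the configuration.
await stream.WriteAsync(new StreamingRecognizeRequest
{
    StreamingConfig = new StreamingRecognitionConfig
    {
        Config = new RecognitionConfig
        {
            Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
            SampleRateHertz = 16000,
            LanguageCode = "en-US"
        },
        InterimResults = true
    }
});

// Subsequent requests carry the audio chunks taken from _microphoneBuffer
// (a BlockingCollection<ByteString> in the sample).
await stream.WriteAsync(new StreamingRecognizeRequest
{
    AudioContent = _microphoneBuffer.Take()
});
```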

private WaveInEvent StartListening()
{
    var waveIn = new WaveInEvent
    {
        DeviceNumber = 0,
        WaveFormat = new WaveFormat(SampleRate, ChannelCount)
    };
    waveIn.DataAvailable += (sender, args) =>
        _microphoneBuffer.Add(ByteString.CopyFrom(args.Buffer, 0, args.BytesRecorded));
    waveIn.StartRecording();
    return waveIn;
}

And adjusted it to use the WasapiLoopbackCapture:

private IDisposable StartListening()
{
    var waveIn = new WasapiLoopbackCapture();

    // WASAPI loopback dictates its own format (typically 32-bit IEEE float),
    // so take the sample rate and channel count from the device.
    SampleRate = waveIn.WaveFormat.SampleRate;
    ChannelCount = waveIn.WaveFormat.Channels;
    BytesPerSecond = SampleRate * ChannelCount * BytesPerSample;

    Console.WriteLine(SampleRate);
    Console.WriteLine(BytesPerSecond);
    waveIn.DataAvailable += (sender, args) =>
        _microphoneBuffer.Add(ByteString.CopyFrom(args.Buffer, 0, args.BytesRecorded));
    waveIn.StartRecording();
    return waveIn;
}

But it doesn't return any transcribed text.

I've saved the input stream to a file, and it played back fine, so the sound is getting there. My guess is that the wave format coming out of WasapiLoopbackCapture is not compatible with what Google expects; I tried some conversions but couldn't get them to work.

I've reviewed the following topics on Stack Overflow, but still couldn't get it to work: "Resampling WasapiLoopbackCapture" and "NAudio - Convert 32 bit wav to 16 bit wav".
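One approach I haven't fully explored is letting NAudio do the whole conversion with a sample-provider chain instead of converting by hand. A sketch (untested; assumes the loopback format is stereo IEEE float, and that 16000 matches the sample rate declared to Google):

```csharp
// Sketch: convert the loopback audio (typically 32-bit IEEE float,
// 44.1/48 kHz, stereo) to 16-bit 16 kHz mono PCM via NAudio providers.
// StereoToMonoSampleProvider assumes a 2-channel source.
var capture = new WasapiLoopbackCapture();
var buffered = new BufferedWaveProvider(capture.WaveFormat)
{
    DiscardOnBufferOverflow = true
};

var mono = new StereoToMonoSampleProvider(buffered.ToSampleProvider());
var resampled = new WdlResamplingSampleProvider(mono, 16000);
var pcm16 = resampled.ToWaveProvider16();   // back to 16-bit PCM bytes

capture.DataAvailable += (sender, args) =>
{
    buffered.AddSamples(args.Buffer, 0, args.BytesRecorded);

    // Drain whatever has been converted so far and hand it to the stream.
    var chunk = new byte[pcm16.WaveFormat.AverageBytesPerSecond / 10]; // ~100 ms
    int read = pcm16.Read(chunk, 0, chunk.Length);
    if (read > 0)
        _microphoneBuffer.Add(ByteString.CopyFrom(chunk, 0, read));
};
capture.StartRecording();
```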

And tried combining them both:

private IDisposable StartListening()
{
    var waveIn = new WasapiLoopbackCapture();

    var target = new WaveFormat(SampleRate, 16, 1);
    var writer = new WaveFileWriter(@"c:\temp\xx.wav", waveIn.WaveFormat);

    Console.WriteLine(SampleRate);
    Console.WriteLine(BytesPerSecond);
    var stop = false;
    waveIn.DataAvailable += (sender, args) =>
    {
        var a = args;

        // Convert the 32-bit IEEE float samples to 16-bit PCM.
        byte[] newArray16Bit = new byte[args.BytesRecorded / 2];
        short two;
        float value;
        for (int i = 0, j = 0; i < args.BytesRecorded; i += 4, j += 2)
        {
            value = BitConverter.ToSingle(args.Buffer, i);
            two = (short)(value * short.MaxValue);

            newArray16Bit[j] = (byte)(two & 0xFF);
            newArray16Bit[j + 1] = (byte)((two >> 8) & 0xFF);
        }

        // Resample the 16-bit data to the target format.
        // (Creating an AcmStream per callback is wasteful, but kept simple here.)
        var resampleStream = new NAudio.Wave.Compression.AcmStream(
            new WaveFormat(waveIn.WaveFormat.SampleRate, 16, waveIn.WaveFormat.Channels), target);
        Buffer.BlockCopy(newArray16Bit, 0, resampleStream.SourceBuffer, 0, a.BytesRecorded / 2);
        int sourceBytesConverted = 0;
        var bytes = resampleStream.Convert(a.BytesRecorded / 2, out sourceBytesConverted);
        var converted = new byte[bytes];
        Buffer.BlockCopy(resampleStream.DestBuffer, 0, converted, 0, bytes);
        resampleStream.Dispose();
        a = new WaveInEventArgs(converted, bytes);

        _microphoneBuffer.Add(ByteString.CopyFrom(a.Buffer, 0, a.BytesRecorded));
        if (writer != null)
        {
            writer.Write(a.Buffer, 0, a.BytesRecorded);
            if (writer.Position > waveIn.WaveFormat.AverageBytesPerSecond * 5)
            {
                stop = true;
                writer.Dispose();
                writer = null;
                Console.WriteLine("Saved file");
            }
        }
    };
    waveIn.StartRecording();
    return waveIn;
}

But it doesn't work, and I'm not sure this is even the right path. I've tried converting the bit depth and sample rate, but couldn't get it to work either.

A code sample of a fix would be highly appreciated.

Upvotes: 0

Views: 178

Answers (0)
