NodeNewb

Reputation: 71

Sending NAudio / Opus-encoded audio from device as RTP

First, I will apologize. I used to tinker with VB5 a LONG time ago and have been out of the programmer's chair for years - I'm still re-learning the basics and only recently started learning C#/.NET. I'm new to this site as well, so I ask for your patience and guidance. Enough backstory on me.

Using this wrapper for Opus (I added the wrapper project to my own solution) together with NAudio, I believe I have it set up to actively grab the audio from my device (soundcard) and use the example encoder code to get encoded audio into the _playBuffer.

My next task is to get the encoded data out of that buffer and send it via RTP to a client app on another machine, where it will be decoded and played out of that machine's sound device.

Am I correct in understanding that the data in the _playBuffer is ready-to-go encoded data? Or does it need to be split up differently for RTP packets? (I see a uLaw example here, but I'm unsure whether I can adapt it to suit my needs. Since the downloaded source code is commented in what appears to be German - and I barely manage English, my first language - even those comments are not terribly helpful.)

(Am I even using the right terminology?) As it stands, the stock code plays the _playBuffer data back out through a WaveOut, as in the original author's example - I deliberately left that in here to illustrate my (probable lack of) understanding. (My thinking: if it's "playable," it's "sendable.")

Another issue: my intention is to stream point-to-point over the internet, and I had been planning to multicast - though I'm not sure multicast is actually what I want for that.

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    using System.Windows.Forms;
    using NAudio;
    using NAudio.CoreAudioApi;
    using NAudio.Wave;
    using FragLabs.Audio.Codecs;

    namespace VUmeterappStereo
    {
        public partial class Form1 : Form
        {
        private void Form1_Load(object sender, EventArgs e)
        {
            for (int i = 0; i < WaveIn.DeviceCount; i++)
            {
                comboBox1.Items.Add(WaveIn.GetCapabilities(i).ProductName);
            }
            if (WaveIn.DeviceCount > 0)
                comboBox1.SelectedIndex = 0;
            for (int i = 0; i < WaveOut.DeviceCount; i++)
            {
                comboBox2.Items.Add(WaveOut.GetCapabilities(i).ProductName);
            }
            if (WaveOut.DeviceCount > 0)
                comboBox2.SelectedIndex = 0;
        }

        private void button1_Click(object sender, EventArgs e)
        {
            button2.Enabled = true;
            button1.Enabled = false;
            StartEncoding();
        }

        private void button2_Click(object sender, EventArgs e)
        {
            button1.Enabled = true;
            button2.Enabled = false;
            StopEncoding();
        }

        WaveIn _waveIn;
        WaveOut _waveOut;
        BufferedWaveProvider _playBuffer;
        OpusEncoder _encoder;
        OpusDecoder _decoder;
        int _segmentFrames;
        int _bytesPerSegment;
        ulong _bytesSent;
        DateTime _startTime;
        Timer _timer = null;

        void StartEncoding()
        {
            _startTime = DateTime.Now;
            _bytesSent = 0;
            _segmentFrames = 960;
            _encoder = OpusEncoder.Create(48000, 1, FragLabs.Audio.Codecs.Opus.Application.Voip);
            _encoder.Bitrate = 8192;
            _decoder = OpusDecoder.Create(48000, 1);
            _bytesPerSegment = _encoder.FrameByteCount(_segmentFrames);

            _waveIn = new WaveIn(WaveCallbackInfo.FunctionCallback());
            _waveIn.BufferMilliseconds = 50;
            _waveIn.DeviceNumber = comboBox1.SelectedIndex;
            _waveIn.DataAvailable += _waveIn_DataAvailable;
            _waveIn.WaveFormat = new WaveFormat(48000, 16, 1);

            _playBuffer = new BufferedWaveProvider(new WaveFormat(48000, 16, 1));

            _waveOut = new WaveOut(WaveCallbackInfo.FunctionCallback());
            _waveOut.DeviceNumber = comboBox2.SelectedIndex;
            _waveOut.Init(_playBuffer);

            _waveOut.Play();
            _waveIn.StartRecording();

            if (_timer == null)
            {
                _timer = new Timer();
                _timer.Interval = 1000;
                _timer.Tick += _timer_Tick;
            }
            _timer.Start();
        }

        void _timer_Tick(object sender, EventArgs e)
        {
            var timeDiff = DateTime.Now - _startTime;
            var bytesPerSecond = _bytesSent / timeDiff.TotalSeconds;
            Console.WriteLine("{0} Bps", bytesPerSecond);
        }

        byte[] _notEncodedBuffer = new byte[0];
        void _waveIn_DataAvailable(object sender, WaveInEventArgs e)
        {
            byte[] soundBuffer = new byte[e.BytesRecorded + _notEncodedBuffer.Length];
            for (int i = 0; i < _notEncodedBuffer.Length; i++)
                soundBuffer[i] = _notEncodedBuffer[i];
            for (int i = 0; i < e.BytesRecorded; i++)
                soundBuffer[i + _notEncodedBuffer.Length] = e.Buffer[i];

            int byteCap = _bytesPerSegment;
            int segmentCount = (int)Math.Floor((decimal)soundBuffer.Length / byteCap);
            int segmentsEnd = segmentCount * byteCap;
            int notEncodedCount = soundBuffer.Length - segmentsEnd;
            _notEncodedBuffer = new byte[notEncodedCount];
            for (int i = 0; i < notEncodedCount; i++)
            {
                _notEncodedBuffer[i] = soundBuffer[segmentsEnd + i];
            }

            for (int i = 0; i < segmentCount; i++)
            {
                byte[] segment = new byte[byteCap];
                for (int j = 0; j < segment.Length; j++)
                    segment[j] = soundBuffer[(i * byteCap) + j];
                int len;
                byte[] buff = _encoder.Encode(segment, segment.Length, out len);
                _bytesSent += (ulong)len;
                buff = _decoder.Decode(buff, len, out len);
                _playBuffer.AddSamples(buff, 0, len);
            }
        }

        void StopEncoding()
        {
            _timer.Stop();
            _waveIn.StopRecording();
            _waveIn.Dispose();
            _waveIn = null;
            _waveOut.Stop();
            _waveOut.Dispose();
            _waveOut = null;
            _playBuffer = null;
            _encoder.Dispose();
            _encoder = null;
            _decoder.Dispose();
            _decoder = null;

        }



        private void timer1_Tick(object sender, EventArgs e)
        {
            MMDeviceEnumerator de = new MMDeviceEnumerator();
            MMDevice device = de.GetDefaultAudioEndpoint(DataFlow.Render, Role.Multimedia);
            //float volume = (float)device.AudioMeterInformation.MasterPeakValue * 100;
            float volLeft = (float)device.AudioMeterInformation.PeakValues[0] * 100;
            float volRight = (float)device.AudioMeterInformation.PeakValues[1] * 100;
            progressBar1.Value = (int)volLeft;
            progressBar2.Value = (int)volRight;
        }

        private void timer2_Tick(object sender, EventArgs e)
        {

        }
    }
}

Thanks for anything you can contribute to help me understand about how to accomplish getting the data out via RTP stream.

Oh, and yes, this first started out with my tinkering to recreate a VU meter from a tutorial example - thus the namespace name and extra code, which does function.

Upvotes: 2

Views: 2238

Answers (1)

Max Healey

Reputation: 84

The code example encodes, then immediately decodes, the audio. You will need to send the bytes contained in buff to the network instead.

This section of code from the example above receives audio from the sound card:

    byte[] _notEncodedBuffer = new byte[0];
    void _waveIn_DataAvailable(object sender, WaveInEventArgs e)
    {
        byte[] soundBuffer = new byte[e.BytesRecorded + _notEncodedBuffer.Length];
        for (int i = 0; i < _notEncodedBuffer.Length; i++)
            soundBuffer[i] = _notEncodedBuffer[i];
        for (int i = 0; i < e.BytesRecorded; i++)
            soundBuffer[i + _notEncodedBuffer.Length] = e.Buffer[i];

        int byteCap = _bytesPerSegment;
        int segmentCount = (int)Math.Floor((decimal)soundBuffer.Length / byteCap);
        int segmentsEnd = segmentCount * byteCap;
        int notEncodedCount = soundBuffer.Length - segmentsEnd;
        _notEncodedBuffer = new byte[notEncodedCount];
        for (int i = 0; i < notEncodedCount; i++)
        {
            _notEncodedBuffer[i] = soundBuffer[segmentsEnd + i];
        }

        for (int i = 0; i < segmentCount; i++)
        {
            byte[] segment = new byte[byteCap];
            for (int j = 0; j < segment.Length; j++)
                segment[j] = soundBuffer[(i * byteCap) + j];
            int len;
            byte[] buff = _encoder.Encode(segment, segment.Length, out len);
            _bytesSent += (ulong)len;
            buff = _decoder.Decode(buff, len, out len);
            _playBuffer.AddSamples(buff, 0, len);
        }
    }

At this line

    byte[] buff = _encoder.Encode(segment, segment.Length, out len);

is the point where you create your RTP packet

https://www.rfc-editor.org/rfc/rfc3550
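For reference, a minimal packetization sketch assuming the fixed 12-byte header from RFC 3550, with no CSRC list or header extension. The payload type (96, a dynamic PT) and the SSRC here are arbitrary illustration values that sender and receiver must agree on:

```csharp
using System;

// Minimal RTP packetizer sketch (RFC 3550 fixed 12-byte header only).
// Payload type and SSRC are arbitrary; pick your own and keep both ends in sync.
static class RtpPacketizer
{
    const byte Version = 2;
    const byte PayloadType = 96;            // dynamic PT; must match the receiver's mapping
    static ushort _sequence = 0;
    static uint _timestamp = 0;
    static readonly uint Ssrc = 0x12345678; // arbitrary stream identifier

    // Wraps one encoded Opus frame. samplesPerFrame advances the RTP timestamp
    // (960 samples per 20 ms frame at 48 kHz, matching the question's settings).
    public static byte[] Wrap(byte[] opusFrame, int frameLen, int samplesPerFrame)
    {
        var packet = new byte[12 + frameLen];
        packet[0] = (byte)(Version << 6);       // V=2, P=0, X=0, CC=0
        packet[1] = PayloadType;                // M=0, PT=96
        packet[2] = (byte)(_sequence >> 8);     // sequence number, big-endian
        packet[3] = (byte)_sequence;
        packet[4] = (byte)(_timestamp >> 24);   // timestamp, big-endian
        packet[5] = (byte)(_timestamp >> 16);
        packet[6] = (byte)(_timestamp >> 8);
        packet[7] = (byte)_timestamp;
        packet[8] = (byte)(Ssrc >> 24);         // SSRC, big-endian
        packet[9] = (byte)(Ssrc >> 16);
        packet[10] = (byte)(Ssrc >> 8);
        packet[11] = (byte)Ssrc;
        Buffer.BlockCopy(opusFrame, 0, packet, 12, frameLen);
        _sequence++;
        _timestamp += (uint)samplesPerFrame;
        return packet;
    }
}
```

The sequence number lets the receiver detect loss and reordering, and the timestamp lets it schedule playback; both are required by the RFC even if your first prototype ignores them.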

and then use C# to send it over the network, usually as UDP:

Sending UDP Packet in C#
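A minimal sketch of the send itself, using the standard .NET UdpClient. The host and port here are placeholders you would replace with the remote client's endpoint; buff and len stand in for the output of _encoder.Encode(...) above:

```csharp
using System;
using System.Net.Sockets;

static class UdpSender
{
    // Sends one (RTP-wrapped) frame to the given endpoint and returns the
    // number of bytes written to the socket.
    public static int SendFrame(byte[] buff, int len, string host, int port)
    {
        using (var udp = new UdpClient())
        {
            return udp.Send(buff, len, host, port);
        }
    }
}
```

In a real sender you would keep one UdpClient alive for the whole session rather than creating one per frame.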

The remaining code belongs in the receiving app, after extracting buff from the RTP packet:

    buff = _decoder.Decode(buff, len, out len);
    _playBuffer.AddSamples(buff, 0, len);
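On the receiving side, the payload first has to be separated from the RTP header. A minimal sketch, assuming the fixed 12-byte header with no CSRC list or header extension; the recovered payload is then what you pass to _decoder.Decode and on to _playBuffer.AddSamples as shown above:

```csharp
using System;

static class RtpDepacketizer
{
    // Strips the fixed 12-byte RTP header and returns the Opus payload.
    // Real code would also read the sequence number and timestamp from the
    // header to detect loss and schedule playback.
    public static byte[] Unwrap(byte[] packet)
    {
        if (packet.Length <= 12)
            throw new ArgumentException("packet too short to hold an RTP payload");
        var payload = new byte[packet.Length - 12];
        Buffer.BlockCopy(packet, 12, payload, 0, payload.Length);
        return payload;
    }
}
```

The receive loop itself is just UdpClient.Receive on the agreed port, feeding each datagram through Unwrap, the decoder, and the playback buffer.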

Upvotes: 1
