Reputation: 149
For a local VoIP app I am working on, I have to stream audio from one device and play it back as fast as possible on another device on the same network, using UDP. So I made a very basic demo Android app using AudioRecord and DatagramSocket, and a very simple C++ program that plays the received audio through a fairly small circular buffer.
An important point is that the minimum delay seems to be determined by AudioRecord.getMinBufferSize(), which in my configuration (48 kHz, 16-bit mono PCM on a real Android phone) returns 3840 bytes = 1920 samples = 40 ms of audio.
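The byte-to-time conversion behind these numbers can be sketched as follows. This is a small illustrative helper, not part of the Android API: the names PcmDuration and millis are made up for this example, and it assumes interleaved linear PCM.

```java
final class PcmDuration {
    /**
     * Duration in milliseconds of byteCount bytes of interleaved PCM.
     * One frame holds one sample per channel.
     */
    static double millis(int byteCount, int sampleRateHz, int channels, int bytesPerSample) {
        int frameSize = channels * bytesPerSample; // 2 bytes for 16-bit mono
        int frames = byteCount / frameSize;        // 3840 bytes -> 1920 frames
        return frames * 1000.0 / sampleRateHz;
    }

    public static void main(String[] args) {
        // Buffer sizes reported in the question, all at 48 kHz, 16-bit mono.
        System.out.println("phone:      " + millis(3840, 48000, 1, 2) + " ms");
        System.out.println("AS emu:     " + millis(640, 48000, 1, 2) + " ms");
        System.out.println("Genymotion: " + millis(4480, 48000, 1, 2) + " ms");
    }
}
```

With the phone's 3840 bytes this gives 40 ms, and the two emulator values come out at roughly 7 ms and 47 ms.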
while (isRecording) {
    // blocking read of up to minBufferSize bytes; returns the byte count or a negative error code
    int read = mAudioRecord.read(audioBuffer, 0, minBufferSize);
    if (read <= 0) {
        break; // nothing to send, or the read failed
    }
    // prepare the UDP packet
    packet = new DatagramPacket(audioBuffer, read, address, port);
    // send the packet
    mDatagramSocket.send(packet);
}
I managed to achieve near real-time audio (under 50 ms latency), but after a few tests I came to the conclusion that playing AudioRecord's buffers as soon as they arrive is not an optimal solution: it results in gaps in the audio (underflows) as well as clicks and cut-off audio (overflows, where unplayed audio gets overwritten). After some investigation, the culprit turned out to be simply the unstable ping between the two devices; even a small lag spike could result in terrible audio distortion, as shown below:
The only solution I found is to manually delay the playback of EVERY packet by 10 ms. That solves the problem in the preceding scenario but increases the overall latency to 40 + 10 = 50 ms. If I need to keep good audio with ping spikes of up to 20 ms, I have to increase the latency to 60 ms, and so on. The result is delayed audio that will be even more delayed once it is relayed over the internet, which is not really acceptable for VoIP (I am trying to keep it below the 150 ms bar).
So I thought the ideal solution would be to reduce the amount of audio in a single packet, so that I could add the wanted playback delay on top of it (e.g. if each audio buffer held only 20 ms of audio, I could add up to 30 ms of delay to each playback and still end up only 50 ms behind, which is pretty good for VoIP).
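To make the trade-off concrete, here is a rough sketch of that latency budget. It is a simplified model, not the player's actual code: each packet gets a fixed playout deadline after it was sent, and the sketch counts the packets that miss it. The names PlayoutBudget, countLate and totalLatencyMs are hypothetical, and the network delay values are invented for illustration, not measurements.

```java
final class PlayoutBudget {
    /** End-to-end delay: time to fill one packet plus the fixed playout delay. */
    static double totalLatencyMs(double packetMs, double playoutDelayMs) {
        return packetMs + playoutDelayMs;
    }

    /**
     * Counts packets that arrive after their scheduled playout time.
     * Packet i is complete at (i + 1) * packetMs, arrives networkDelayMs[i]
     * later, and is due for playback playoutDelayMs after it was complete.
     */
    static int countLate(double packetMs, double playoutDelayMs, double[] networkDelayMs) {
        int late = 0;
        for (int i = 0; i < networkDelayMs.length; i++) {
            double sentAt = (i + 1) * packetMs;
            double arrival = sentAt + networkDelayMs[i];
            double deadline = sentAt + playoutDelayMs;
            if (arrival > deadline) {
                late++; // would be heard as a gap (underflow)
            }
        }
        return late;
    }

    public static void main(String[] args) {
        double[] delays = {1, 2, 12, 3, 18}; // made-up one-way delays in ms

        // 40 ms packets + 10 ms delay versus 20 ms packets + 30 ms delay.
        System.out.println(totalLatencyMs(40, 10) + " ms total, late: " + countLate(40, 10, delays));
        System.out.println(totalLatencyMs(20, 30) + " ms total, late: " + countLate(20, 30, delays));
    }
}
```

In this model both configurations have the same 50 ms total, but only the one with the smaller packets absorbs the 12 ms and 18 ms spikes.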
But I'm not sure whether this is possible, and I wonder if there is a trick to achieve it. I noticed that AudioRecord.getMinBufferSize() on the Android Studio emulator (on Windows 10) returns 640 bytes (~7 ms) with the same PCM configuration, which is an amazing number, but on the Genymotion emulator (on Debian) the minimum buffer size is 4480 bytes (~47 ms).
Upvotes: 1
Views: 1336
Reputation: 21640
The solution is to read smaller amounts of data from the AudioRecord and send them to the server as soon as possible.
The AudioRecord.getMinBufferSize() JavaDoc states:
Returns the minimum buffer size required for the successful creation of an AudioRecord object, in byte units. Note that this size doesn't guarantee a smooth recording under load, and higher values should be chosen according to the expected frequency at which the AudioRecord instance will be polled for new data.
So this is the minimum size for the buffer that AudioRecord should allocate. Depending on how often you can fetch data from the AudioRecord instance, it might even be necessary to specify a bigger buffer.
The AudioRecord JavaDoc in turn states (emphasis added):
Upon creation, an AudioRecord object initializes its associated audio buffer that it will fill with the new audio data. The size of this buffer, specified during the construction, determines how long an AudioRecord can record before "over-running" data that has not been read yet. Data should be read from the audio hardware in chunks of sizes inferior to the total recording buffer size.
So the documentation explicitly tells you not to try to read the whole buffer at once!
You can, for example, set up the AudioRecord object as
mAudioRecord = new AudioRecord(
        MediaRecorder.AudioSource.MIC,
        48000, // sampleRate
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        9600 // internal buffer size in bytes: 100 ms (must be at least getMinBufferSize())
);
and the polling code can read 480 samples (960 bytes) every 10 ms and send them to the server:
final int BUFFER_SIZE = 960; // 480 samples * 2 bytes = 1 packet every 10 ms
byte[] audioBuffer = new byte[BUFFER_SIZE];
while (isRecording) {
    // blocking read of up to BUFFER_SIZE bytes; returns the byte count or a negative error code
    int read = mAudioRecord.read(audioBuffer, 0, BUFFER_SIZE);
    if (read <= 0) {
        break; // nothing to send, or the read failed
    }
    // prepare the UDP packet
    packet = new DatagramPacket(audioBuffer, read, address, port);
    // send the packet
    mDatagramSocket.send(packet);
}
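The two sizes used above (the internal buffer and the per-read chunk) can also be derived from a target duration instead of being hard-coded. A minimal sketch, assuming 16-bit mono PCM as in the question; BufferSizes, bytesForMillis and internalBufferBytes are hypothetical helper names, and minBufferBytes stands for whatever AudioRecord.getMinBufferSize() returns on the device.

```java
final class BufferSizes {
    /** Number of bytes needed to hold the given duration of interleaved PCM. */
    static int bytesForMillis(int millis, int sampleRateHz, int channels, int bytesPerSample) {
        int frames = sampleRateHz * millis / 1000; // 48000 Hz * 10 ms -> 480 frames
        return frames * channels * bytesPerSample;
    }

    /**
     * Size to request for the recorder's internal buffer: the desired size,
     * but never less than the platform minimum.
     */
    static int internalBufferBytes(int desiredBytes, int minBufferBytes) {
        return Math.max(desiredBytes, minBufferBytes);
    }

    public static void main(String[] args) {
        int chunk = bytesForMillis(10, 48000, 1, 2);     // bytes per read()/packet
        int desired = bytesForMillis(100, 48000, 1, 2);  // 100 ms of headroom
        int minBufferBytes = 3840;                       // value reported by the phone in the question
        int internal = internalBufferBytes(desired, minBufferBytes);
        System.out.println("read chunk: " + chunk + " bytes, internal buffer: " + internal + " bytes");
    }
}
```

On Android, the internal size would be passed as the last constructor argument and the chunk size would be used as the length of each read; keeping the chunk well below the internal size is what leaves room for scheduling hiccups in the polling loop.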
Upvotes: 2