Murph


Pyalsaaudio recording allowing for huge delay without overrunning the buffer

I want to record audio in real time on Ubuntu, and pyalsaaudio seems to work best for detecting my input devices correctly. I started off with the included recordtest.py script and wanted to experiment with latency to see when the buffer would fill up and give me an error (or at least return -EPIPE) - as per the pyalsaaudio documentation for PCM.read():

In case of an overrun, this function will return a negative size: -EPIPE. This indicates that data was lost, even if the operation itself succeeded. Try using a larger periodsize.
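As a minimal sketch of how that return value could be checked, assuming read() reports an overrun as the negative errno value described in the quoted documentation. The helper name describe_read is hypothetical and not part of pyalsaaudio:

```python
import errno


def describe_read(length):
    """Classify the length returned by PCM.read() (hypothetical helper)."""
    if length == -errno.EPIPE:
        # Overrun as described in the quoted documentation: data was lost
        return "overrun"
    if length < 0:
        # Any other negative value is treated as a generic error here
        return "error"
    if length == 0:
        # Non-blocking read with nothing available yet
        return "no data"
    return "ok"
```

In the recording loop this would be called as describe_read(l) right after l, data = inp.read().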

However, a tiny buffer size wasn't causing problems, so to investigate further I added huge time.sleep() calls in between calls to read() in recordtest.py:

inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK, 
    channels=1, rate=44100, format=alsaaudio.PCM_FORMAT_S16_LE, 
    periodsize=160, device=device)

loops_with_data = 3000 #3000*160/44100 = 10.9 seconds of audio
first_time = True
while loops_with_data > 0:
    # Read data from device
    l, data = inp.read()
    print("l:",l)

    if l:
        f.write(data)
        if first_time:
            #big delay after first data read
            time.sleep(100)
            first_time = False
        else:
            #smaller delay otherwise, still longer than one period length
            time.sleep(.01)
        loops_with_data-=1

I would've expected this to overrun the buffer. However, the value of l returned by read() is never negative, and almost always 160. When I play back the audio, I get a perfect recording of the first 10.9 seconds of what I said into the microphone. Somehow it seems that the buffer being used is huge, storing over 100 seconds' worth of audio, so that when read() is called 100 seconds later it can still access all the old periods of frames. The problem with this is that if my application runs a function in between calls to read() that takes too long, the audio will keep getting more and more delayed and I'll be none the wiser, since nothing indicates that this is happening.
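To put numbers on this, here is a short sketch of the arithmetic, assuming mono S16_LE audio (2 bytes per frame) at the rate and period size used above. The function names are made up for illustration:

```python
RATE = 44100          # frames per second, as in the script above
PERIOD = 160          # frames per period, as in the script above
BYTES_PER_FRAME = 2   # assumption: mono, 16-bit samples (S16_LE)


def seconds_of_audio(periods, period_size=PERIOD, rate=RATE):
    """Duration covered by a given number of periods."""
    return periods * period_size / rate


def bytes_buffered(seconds, rate=RATE, bytes_per_frame=BYTES_PER_FRAME):
    """Bytes needed to hold the given duration of audio."""
    return int(seconds * rate * bytes_per_frame)


print(round(seconds_of_audio(3000), 1))  # duration of the 3000 recorded periods
print(bytes_buffered(100))               # storage needed to survive a 100 s pause
```

So surviving the 100 second sleep without an overrun means something upstream of read() is holding roughly 8.8 MB of samples, far more than four periods of 160 frames.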

I've tried digging into alsaaudio.c and have discovered some weirdness: no matter what I do, the PCM object always seems to think it has a buffer size of a reasonable number of frames (assuming frames = audio samples), but buffer time and number of periods per buffer always show up as 0. I've tried printing this using inp.info() in Python, and printing in the C file itself. It's extra weird because the C file is clearly trying to set 4 periods per buffer using snd_pcm_hw_params_set_periods_near():

dir = 0;
unsigned int periods = 4;
snd_pcm_hw_params_set_periods_near(self->handle, hwparams, &periods, &dir);

But after the following line, periods gets set to 0:

/* Query current settings. These may differ from the requested values, 
which should therefore be synced with actual values */

snd_pcm_hw_params_current(self->handle, hwparams);

I've tried all sorts of other functions (like snd_pcm_hw_params_set_periods_min() and snd_pcm_hw_params_set_periods_max()) with no luck.


Answers (1)

Ronald van Elburg


The function snd_pcm_drop allows you to drop the contents of the buffer. This function is already available from pyalsaaudio as the drop method for a PCM device.

After:

            #big delay after first data read
            time.sleep(100)

you can simply add

            inp.drop()

All input that arrived before calling drop() will be ignored. (But there is still some sound from the start of the script in the script's own data variable.)

More subtle solutions seem possible, but would require adding snd_pcm_forward and perhaps snd_pcm_forwardable to the pyalsaaudio interface.
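Short of extending the interface, the lag can at least be detected in plain Python by comparing wall-clock time with the amount of audio consumed so far. The following is only a sketch under assumptions: LagMonitor is a hypothetical helper, not part of pyalsaaudio, and it assumes the capture clock and the system clock run at the same speed and that capture started around the first successful read.

```python
import time


class LagMonitor:
    """Estimate how far consumed audio lags behind wall-clock time."""

    def __init__(self, rate, clock=time.monotonic):
        self.rate = rate      # frames per second
        self.clock = clock    # injectable clock, handy for testing
        self.start = None     # estimated wall-clock time of the first frame
        self.frames = 0       # total frames consumed so far

    def update(self, frames_read):
        """Record a successful read and return the estimated lag in seconds."""
        now = self.clock()
        if self.start is None:
            # The frames in the first read were captured just before now
            self.start = now - frames_read / self.rate
        self.frames += frames_read
        return (now - self.start) - self.frames / self.rate

    def reset(self):
        """Forget the timeline, e.g. after the buffer has been dropped."""
        self.start = None
        self.frames = 0
```

In the recording loop this could be used as lag = monitor.update(l) after each read with l > 0, calling inp.drop() followed by monitor.reset() once the lag passes a threshold of your choosing.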

Here is the complete modified script I used for analysis and testing. (I shortened the big delay to 4 seconds.) I also used soundfile for wav-file creation, as Audacity wasn't happy with the other method of creating wav-files.

import time
import alsaaudio
import numpy as np
import struct
import soundfile as sf

conversion_dicts = {
        alsaaudio.PCM_FORMAT_S16_LE: {'dtype': np.int16, 'endianness': '<', 'formatchar': 'h', 'bytewidth': 2},
}

def get_conversion_string(audioformat, noofsamples):
    conversion_dict = conversion_dicts[audioformat]
    conversion_string = f"{conversion_dict['endianness']}{noofsamples}{conversion_dict['formatchar']}"
    return conversion_string

device = 'default'
fs = 44100

inp = alsaaudio.PCM(alsaaudio.PCM_CAPTURE, alsaaudio.PCM_NONBLOCK, 
    channels=1, rate=fs, format=alsaaudio.PCM_FORMAT_S16_LE, 
    periodsize=160, device=device)

print(inp.info())

f = sf.SoundFile("test.wav", 'wb', samplerate=fs, channels=1)

dtype = np.int16 

loops_with_data = 3000 #3000*160/44100 = 10.9 seconds of audio
first_time = True

while loops_with_data > 0:
    # Read data from device; in non-blocking mode l can be 0 (no data yet)
    # or negative (error), so only unpack when frames were actually returned
    l, rawdata = inp.read()

    if l > 0:
        # Convert the raw bytes into an int16 array of l samples
        conversion_string = get_conversion_string(alsaaudio.PCM_FORMAT_S16_LE, l)
        data = np.array(struct.unpack(conversion_string, rawdata), dtype=dtype)

        print(f"\r{loops_with_data:4}", end='')
        f.write(data)
        if first_time:
            #big delay after first data read
            time.sleep(4)
            # discard everything that accumulated during the delay
            inp.drop()
            first_time = False
        else:
            #smaller delay otherwise, still longer than one period length
            time.sleep(.01)
        loops_with_data-=1
    else:
        print(".", end='')

        
f.close()
