How to segment an audio file of keyboard keystrokes to isolate each keystroke sound

I'm working on a project to recognize keyboard keystrokes from audio recordings. My current approach is to split the audio into segments, one per keystroke sound. I've tried using an amplitude trigger to detect each keystroke, but it doesn't detect keystrokes reliably. I'm looking for suggestions on the best way to isolate each keystroke sound in an audio file so that I can label my data and use it to train my AI model for keystroke recognition.

Any help would be greatly appreciated. Thank you!

I tried segmenting with the split.py script from Charles Grassin's GitHub repository (https://github.com/CGrassin/keyboard_audio_hack.git), which uses the following function:

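# read() is assumed to be scipy.io.wavfile.read, which returns (sample_rate, data).
# normalize() is not shown here; it is assumed to be defined elsewhere in split.py.
import numpy as np
from scipy.io.wavfile import read

def split_file(file, time_before=0.025, time_after=0.2, trigger=1000, normalize_result=False):
    outputs = []

    # Open file and convert the samples to float
    sample_rate, a = read(file)
    a = np.array(a, dtype=float)

    # Convert the window durations (seconds) to sample counts
    length_after = int(time_after * sample_rate)
    length_before = int(time_before * sample_rate)

    # Display sound (debug)
    # plt.plot(a)
    # plt.show()

    i = 0
    while i < a.size:
        # End of usable recording
        if i + length_after > a.size:
            break
        # Trigger on the first sample above the fixed amplitude threshold
        if a[i] > trigger and i >= length_before:
            sub = a[i - length_before:i + length_after]
            if normalize_result:
                sub = normalize(sub)
            outputs.append(sub)
            # Skip past this keystroke so it is not detected twice
            i += length_after
        i += 1

    return outputs, sample_rate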
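
For comparison, here is a minimal sketch of an energy-based variant of the same trigger idea. It is not from the repository above: the function name `split_by_energy` and the parameters `frame_time` and `threshold_factor` are hypothetical, chosen only for illustration. Instead of comparing single samples against a fixed value, it computes the RMS energy of short frames and triggers when a frame exceeds a multiple of the median frame energy, on the assumption that most of the recording is background noise between keystrokes.

```python
import numpy as np

def split_by_energy(a, sample_rate, time_before=0.025, time_after=0.2,
                    frame_time=0.005, threshold_factor=4.0):
    """Cut one window per detected onset; return (segments, onsets).

    Hypothetical helper: frame_time and threshold_factor are illustrative
    defaults, not tuned values.
    """
    a = np.asarray(a, dtype=float)
    if a.ndim > 1:
        # Mix multi-channel audio down to mono
        a = a.mean(axis=1)

    # RMS energy of each non-overlapping frame
    frame = max(1, int(frame_time * sample_rate))
    n_frames = a.size // frame
    frames = a[:n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))

    # Adaptive threshold: a multiple of the median frame energy,
    # used here as a rough estimate of the noise floor
    threshold = threshold_factor * np.median(rms)

    length_before = int(time_before * sample_rate)
    length_after = int(time_after * sample_rate)

    segments, onsets = [], []
    next_allowed = 0  # first sample at which a new onset may be accepted
    for k in range(n_frames):
        start = k * frame
        if start < next_allowed or rms[k] <= threshold:
            continue
        # Skip onsets too close to either end of the recording
        if start < length_before or start + length_after > a.size:
            continue
        segments.append(a[start - length_before:start + length_after])
        onsets.append(start)
        # Do not trigger again inside the window just extracted
        next_allowed = start + length_after
    return segments, onsets
```

It could be called as `segments, onsets = split_by_energy(a, sample_rate)` with the values returned by `read(file)`. Because the threshold scales with the recording's own noise level, it may be less sensitive to recording volume than a fixed `trigger=1000`, though `threshold_factor` would still need tuning on real data.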
