I'm working on a project that tries to recognize keystrokes from audio recordings of a keyboard. My current approach is to split the audio into segments, one per keystroke sound. I've tried using an amplitude trigger to detect each keystroke, but it doesn't work reliably. I'm looking for suggestions on the best way to isolate each keystroke sound in an audio file, so that I can label my data and use it to train my AI model for keystroke recognition.
Any help would be greatly appreciated. Thank you!
I tried segmenting with the split.py script from Charles Grassin's GitHub repository, https://github.com/CGrassin/keyboard_audio_hack.git, which uses the following function:
import numpy as np
from scipy.io.wavfile import read

# Note: normalize() is a helper that is not shown in this excerpt.
def split_file(file, time_before=0.025, time_after=0.2, trigger=1000, normalize_result=False):
    outputs = []
    # Open file
    sample_rate, a = read(file)
    a = np.array(a, dtype=float)
    # Compute params (window lengths in samples)
    length_after = int(time_after * sample_rate)
    length_before = int(time_before * sample_rate)
    # Display sound (debug)
    #plt.plot(a)
    #plt.show()
    i = 0
    while i < a.size:
        # End of usable recording
        if i + length_after > a.size:
            break
        # A sample above the fixed trigger level marks the start of a keystroke
        if a[i] > trigger and i >= length_before:
            sub = a[i - length_before:i + length_after]
            if normalize_result:
                sub = normalize(sub)
            outputs.append(sub)
            # Skip past this keystroke so it is not detected twice
            i += length_after
        i += 1
    return outputs, sample_rate
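For comparison, here is a minimal sketch of one possible alternative to the fixed trigger. The function above compares each raw sample against an absolute level (`a[i] > trigger`), so the result depends on the recording gain, and only positive samples can fire the trigger. The sketch below instead thresholds the short-time RMS energy relative to the recording's own noise floor and peak. It is not taken from the repository: the names `find_keystroke_onsets` and `cut_segments` and the parameters `frame_s`, `threshold_ratio` and `min_gap_s` are hypothetical, the default values are guesses that would need tuning, and it assumes a mono signal.

```python
import numpy as np

def find_keystroke_onsets(signal, sample_rate, frame_s=0.005,
                          threshold_ratio=0.2, min_gap_s=0.2):
    """Return sample indices where the short-time RMS first exceeds a relative threshold."""
    signal = np.asarray(signal, dtype=float)  # assumed mono (1-D)
    frame_len = max(1, int(frame_s * sample_rate))
    n_frames = signal.size // frame_len
    if n_frames == 0:
        return []
    # Non-overlapping frames; any leftover samples at the end are ignored
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Threshold sits a fixed fraction of the way from the noise floor to the peak
    noise_floor = np.median(rms)
    threshold = noise_floor + threshold_ratio * (rms.max() - noise_floor)
    # Refractory period so one keystroke is not reported several times
    gap_frames = max(1, int(round(min_gap_s * sample_rate / frame_len)))
    onsets = []
    next_allowed = 0
    for idx in range(n_frames):
        if idx >= next_allowed and rms[idx] > threshold:
            onsets.append(idx * frame_len)
            next_allowed = idx + gap_frames
    return onsets

def cut_segments(signal, onsets, sample_rate, time_before=0.025, time_after=0.2):
    """Cut a fixed window around each onset, skipping windows that leave the recording."""
    signal = np.asarray(signal, dtype=float)
    before = int(time_before * sample_rate)
    after = int(time_after * sample_rate)
    return [signal[o - before:o + after] for o in onsets
            if o - before >= 0 and o + after <= signal.size]
```

With a recording loaded as in `split_file`, the two steps would be `onsets = find_keystroke_onsets(a, sample_rate)` followed by `segments = cut_segments(a, onsets, sample_rate)`. Because the threshold is relative, a quiet recording and a loud one are treated the same way, but very soft keystrokes next to very loud ones may still be missed.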