Reputation: 21
I want to process streaming audio (coming from a person speaking on the remote end of a WebRTC peer connection) to detect when the person is done talking. I have the audio track and access to individual frames. I see that each frame can be converted to an ndarray using AudioFrame.to_ndarray(). I can also see the values in the ndarray changing depending on what the person is saying, at what pitch, at what volume, etc. Now I want to detect silence on the stream. My question is: what is in the ndarray, and how can I make sense of the data?
while True:
    try:
        frame: AudioFrame = await track.recv()  # aiortc delivers an av.AudioFrame
        frame_nd_array = frame.to_ndarray()
    except MediaStreamError:  # raised by aiortc when the track ends
        break
Where can I learn what is in the frame_nd_array?
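For context: with aiortc, track.recv() returns an av.AudioFrame, and for the default packed 16-bit PCM ("s16") format to_ndarray() yields an int16 array of shape (1, samples * channels) with channel samples interleaved. Below is a minimal sketch of detecting end-of-speech by counting consecutive low-energy (RMS) frames; the threshold and frame-count constants are illustrative assumptions to tune, not values taken from aiortc.

import numpy as np

# Illustrative, tunable constants (assumptions, not library defaults).
SILENCE_RMS_THRESHOLD = 300      # int16 samples range from -32768 to 32767
SILENCE_FRAMES_TO_STOP = 50      # roughly 1 s of 20 ms frames

def frame_is_silent(frame) -> bool:
    """True when the frame's RMS energy falls below the silence threshold."""
    # With packed "s16" audio, to_ndarray() has shape (1, samples * channels).
    samples = frame.to_ndarray().astype(np.float32)
    rms = np.sqrt(np.mean(samples ** 2))
    return rms < SILENCE_RMS_THRESHOLD

async def wait_until_done_talking(track) -> None:
    """Consume frames until a long enough run of silent frames is seen."""
    silent_run = 0
    while silent_run < SILENCE_FRAMES_TO_STOP:
        frame = await track.recv()
        silent_run = silent_run + 1 if frame_is_silent(frame) else 0

For speech specifically, a dedicated voice-activity detector (e.g. py-webrtcvad) is usually more robust than a raw energy threshold, but the RMS check above is enough to see how the sample values relate to silence.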
Upvotes: 2
Views: 170