Reputation: 11
I am working with two tab-delimited files. One has labels with time stamps, and the other has time stamps with pitch tracking data. Samples of both are below.
Label file (table A)
12011.151 12133.975 statusAE1
12168.452 12239.561 statusAH0
14500.971 14616.253 statusAE1
14649.114 14702.446 statusAH0
16827.322 16943.682 statusAE1
16978.159 17028.797 statusAH0
19632.974 19688.999 purposeER1
19787.582 19847.916 purposeAH0
21957.925 22028.293 purposeER1
The first column above is start time, in milliseconds, the second is the end time, and the third is the label of the defined region.
Pitch data (table B)
479.002 41.565
503.039 60.425
521.905 0.000
2161.905 171.387
2167.710 0.000
2175.147 143.646
2182.132 143.494
2188.844 143.646
2195.828 144.714
2202.812 144.806
2209.705 144.287
2216.599 143.433
2223.583 143.768
2230.476 144.043
2237.551 144.836
The first column is time in milliseconds, and the second is fundamental frequency (f0) in hertz. I would like to write a script that compares these tables and creates a new table in which every row of table B whose time falls within an interval defined in table A is listed in the following format:
time f0 label
I'm hoping to do this in R, but I would also be willing to try Python or MATLAB solutions.
Upvotes: 0
Views: 47
Reputation: 4024
Here is one way to do it, using a cross join followed by a filter:
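library(dplyr)
# Assumes pitch has a column named time, and label has columns named start_time and end_time.
pitch %>%
  # The two tables share no column names, so merge() pairs every pitch row with every label row.
  merge(label) %>%
  # Keep only the pairs whose time falls inside the labelled interval.
  filter(start_time <= time & time <= end_time)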
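The snippet above relies on column names that the raw files do not carry, so here is a rough end-to-end sketch of the same cross-join-then-filter idea in base R. The column names (`start_time`, `end_time`, `label`, `time`, `f0`) are assumptions, and the pitch rows are invented for illustration because the sample rows in the question do not overlap any labelled interval; only the two label rows are copied from the question.

```r
# Label rows copied from the question; the column names are assumed.
label <- read.table(text = "
12011.151 12133.975 statusAE1
12168.452 12239.561 statusAH0
", col.names = c("start_time", "end_time", "label"), stringsAsFactors = FALSE)

# Pitch rows invented for illustration only.
pitch <- read.table(text = "
12000.000 140.0
12050.000 142.5
12200.000 138.0
12300.000 0.0
", col.names = c("time", "f0"))

# Cross join: by = NULL makes merge() return every pitch/label pair.
pairs <- merge(pitch, label, by = NULL)

# Keep the pairs whose time lies inside the labelled interval,
# and reduce to the requested "time f0 label" layout.
inside <- pairs$start_time <= pairs$time & pairs$time <= pairs$end_time
result <- pairs[inside, c("time", "f0", "label")]
result <- result[order(result$time), ]
rownames(result) <- NULL
print(result)
```

For the real data, something like `read.delim("labels.txt", header = FALSE, col.names = c("start_time", "end_time", "label"))` should load a tab-delimited file with the same names (the file name here is hypothetical). Note that a cross join builds one row per pitch/label pair before filtering, so it can get memory-hungry when both files are long.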
Upvotes: 1