Reputation: 113
One includes the TMC # for all the roadways of interest. The second includes travel times occurring on every single roadway in a particular state. I want to use the first table to filter so that only the records corresponding to those roadways of interest are remaining.
df
id link tmc
1 1 23402444 122P06466
2 2 23402487 122P06476
3 3 23402488 122N06476
4 4 23402493 122N06477
5 5 23402555 122P06454
6 6 23402557 122N06453
df2
id tmc epoch tt
1 1 108N04625 1182014 163
2 2 108N04625 1182014 103
3 3 108N04625 1182014 73
4 4 108N04625 1172014 254
5 5 108N04625 1172014 224
I was trying to use a filter
Data2Filter<-(Data2, TMC==Data2$TMC)
but I was getting either object not found for everything or that my dimensions are mismatched (there are about 8000 records in Data1 and 14000000 in Data 2 because there can be multiple travel times (TT) on a TMC but I'm only interested in those that occurred on TMCs corresponding to the list in Data1). I'm very familiar with MatLab but unfortunately at the moment I only have R Studio available and know nothing about this software. Also these table are loaded in as cvs files if that makes any difference.
Upvotes: 3
Views: 5708
Reputation: 5856
df2$tmc[1] <- df$tmc[1]
df2
id tmc epoch tt
1 1 122P06466 1182014 163
2 2 108N04625 1182014 103
3 3 108N04625 1182014 73
4 4 108N04625 1172014 254
5 5 108N04625 1172014 224
Many many options
subset(df2, tmc %in% df$tmc)
df2[df2$tmc %in% df$tmc, ]
library(dplyr)
fi <- filter(df2, tmc %in% df$tmc)
fi
id tmc epoch tt
1 1 122P06466 1182014 163
superfast with large datasets
library(data.table)
dt <- data.table(df)
dt2 <- data.table(df2)
subset(dt2, tmc %in% dt$tmc)
This may be a useful topic on subset performance
Upvotes: 3