Reputation: 1023
I have a dataset, "data" where every few adjacent rows has the same "id", a parameter in the dataset. I have another dataset, "otherData", with the same id values, but only one row per id value. The order of the id's are the same in both datasets.
I want to copy a column "newColumn" from otherData to data, mapping it based on the id value. This is what I currently have:
sapply(otherData$id, function(id)
data$newColumn[data$id == id, ] <- otherData[otherData$id == id, ]$newColumn
)
However, sapply is disgustingly slow, since it iterates over each unique id value and has to find every row with that value in the data.frame.
Is there a faster alternative? I thought that perhaps I may be able to take advantage of the fact that all the rows with the same "id" value are adjacent to each other.
Upvotes: 0
Views: 101
Reputation: 107767
Simply merge the two dataframes but filter columns in otherData
to join id key and newColumn:
newData <- merge(data, otherData[c("id", "newColumn")], by=c("id"))
Upvotes: 1