Copying over the values of a column, based on the values of another column

Question

I have a dataset, "data" where every few adjacent rows has the same "id", a parameter in the dataset. I have another dataset, "otherData", with the same id values, but only one row per id value. The order of the id's are the same in both datasets.

I want to copy a column "newColumn" from otherData to data, mapping it based on the id value. This is what I currently have:

sapply(otherData$id, function(id)
    data$newColumn[data$id == id, ] <- otherData[otherData$id == id, ]$newColumn
)

However, sapply is disgustingly slow, since it iterates over each unique id value and has to find every row with that value in the data.frame.

Is there a faster alternative? I thought that perhaps I may be able to take advantage of the fact that all the rows with the same "id" value are adjacent to each other.

Parfait · Accepted Answer

Simply merge the two dataframes but filter columns in otherData to join id key and newColumn:

newData <- merge(data, otherData[c("id", "newColumn")], by=c("id"))

Copying over the values of a column, based on the values of another column

Answers (1)

Related Questions