hitesh
hitesh

Reputation: 61

R-Lookup matching values in another dataframe and then merge them in original by creating new variables

race <- data.frame (id  = c('xxxxx','yyyyy','zzzzz','ppppp','qqqqq','rrrrr'),
                  height = c(180,195,165,172,170,181),
                  weight = c(75,80,60,75,75,80),
                  bodytype  = c('M','L','M','S','M','L'),
                  country = c('US','CA','MX','MX','AG','US'),
                  speed = c(100,120,110,95,100,120),
                  best_id = c('aaaaa','bbbbb','ccccc','ccccc','ddddd','aaaaa')
)

race_best <- data.frame (id  = c('aaaaa','bbbbb','ccccc','ddddd','eeeee','fffff'),
                         height = c(185,195,180,175,182,180),
                         weight = c(72,79,70,65,68,71),
                         bodytype  = c('M','M','M','S','L','L')

)


race_updated <-  data.frame (id  = c('xxxxx','yyyyy','zzzzz','ppppp','qqqqq','rrrrr'),
                             height = c(180,195,165,172,170,181),
                             weight = c(75,80,60,75,75,80),
                             bodytype  = c('M','L','M','S','M','L'),
                             country = c('US','CA','MX','MX','AG','US'),
                             speed = c(100,120,110,95,100,120),
                             best_id = c('aaaaa','bbbbb','ccccc','ccccc','ddddd','aaaaa'),
                             best_id_height = c(185,195,180,180,175,185),
                             best_id_weight = c(72,79,70,70,65,72),
                             best_id_bodytype  = c('M','M','M','M','S','M')

)

I have a dataframe named "race" in which I have few variables which describe the characteristics of the specific racer(height,weight etc.). Variable id is the unique identifier of the racer. There is also a variable called best_id which is the id of the racer with the previous best speed(in the category in which the current racer is). To explain better I have added another two datasets. dataset race is original dataset race_best is the dataset for best racers dataset race_updated is what I want to achieve. Original dataset(racer) and new variables which define the characteristics of the best racer. e.g. best_id_height is the height of the racer corresponding to the best_id and so on. It would be great if someone can help me with the problem.

Upvotes: 0

Views: 67

Answers (2)

chinsoon12
chinsoon12

Reputation: 25225

An option using lookup join in data.table so that your new columns are also named accordingly:

library(data.table)
setDT(race)[setDT(race_best), on=.(best_id=id), 
    paste0("best_", names(race_best)) := mget(paste0("i.", names(race_best)))]
race

Upvotes: 1

jandraor
jandraor

Reputation: 349

library(dplyr)

race_updated <- left_join(race, race_best, by = c("best_id" = "id") )

Upvotes: 2

Related Questions