Mitchell

Reputation: 45

Using R to match values in a common column for two dataframes and then writing across corresponding data

I have two dataframes.

The first (df1) has a column which records an old ID number for each row and a column with the corresponding new ID number. This is a larger dataset.

The second (df2) has only the old ID number for each row. I would like to create a new column in df2 that contains the corresponding new ID number from df1.

Here is a dummy example of the datasets:

df1

OldID     NewID Numofsh Loc
ID10000   4853  158     Bath
ID10001   5091  43      York
ID10002   5205  12      Cambridge
ID10003   4897  6       London
ID10004   6488  8       Edinburgh

df2

OldID    CPH
ID10004  77/567/4433
ID10001  66/123/4567

And here is a dummy example of the final df2 that I would like to have:

OldID    CPH          NewID
ID10004  77/567/4433  6488
ID10001  66/123/4567  5091

Upvotes: 1

Views: 171

Answers (2)

markus

Reputation: 26343

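Use match() to find, for each value of df2$OldID, the position of the matching row in df1, subset df1 with those positions, and extract the values of 'NewID' using $.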

df2$NewID <- df1[match(df2$OldID, df1$OldID), ]$NewID
df2
#    OldID         CPH NewID
#1 ID10004 77/567/4433  6488
#2 ID10001 66/123/4567  5091

data

df1 <- read.table(text = "OldID     NewID Numofsh Loc
ID10000   4853  158     Bath
ID10001   5091  43      York
ID10002   5205  12      Cambridge
ID10003   4897  6       London
ID10004   6488  8       Edinburgh", header = TRUE)

df2 <- read.table(text = "OldID    CPH
ID10004  77/567/4433
ID10001  66/123/4567", header = TRUE)
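
One thing worth checking with the match() approach is what happens when an old ID in df2 has no counterpart in df1. The sketch below uses a cut-down version of the data plus a made-up ID (ID99999, an assumption added purely for illustration) to show that match() returns NA for it, so the new column ends up NA in that row rather than raising an error.

```r
# Cut-down df1; stringsAsFactors = FALSE keeps IDs as character on older R versions
df1 <- data.frame(OldID = c("ID10000", "ID10001", "ID10004"),
                  NewID = c(4853, 5091, 6488),
                  stringsAsFactors = FALSE)

# ID99999 is a hypothetical ID that does not exist in df1
df2 <- data.frame(OldID = c("ID10004", "ID99999", "ID10001"),
                  stringsAsFactors = FALSE)

# Position of each df2$OldID within df1$OldID (NA where there is no match)
idx <- match(df2$OldID, df1$OldID)

# Indexing the NewID vector directly gives the same result as subsetting the rows first
df2$NewID <- df1$NewID[idx]

# Rows without a match can be listed afterwards
unmatched <- df2$OldID[is.na(idx)]
```

Note that match() only returns the first match, so if an old ID appears more than once in df1, only the first corresponding NewID is used.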

Upvotes: 1

Jan

Reputation: 43169

Using dplyr::left_join():

library(dplyr)
df3 <- df2 %>%
  left_join(df1, by = 'OldID') %>%
  select(-c(Numofsh, Loc))

Which yields:

    OldID         CPH NewID
1 ID10004 77/567/4433  6488
2 ID10001 66/123/4567  5091
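
If dplyr is not available, a similar left join can be sketched in base R with merge(). This is an alternative to the code above, not part of the original answer; it assumes the same df1 and df2 as in the question and keeps only the two columns of df1 that are needed before joining.

```r
df1 <- data.frame(OldID   = c("ID10000", "ID10001", "ID10002", "ID10003", "ID10004"),
                  NewID   = c(4853, 5091, 5205, 4897, 6488),
                  Numofsh = c(158, 43, 12, 6, 8),
                  Loc     = c("Bath", "York", "Cambridge", "London", "Edinburgh"),
                  stringsAsFactors = FALSE)

df2 <- data.frame(OldID = c("ID10004", "ID10001"),
                  CPH   = c("77/567/4433", "66/123/4567"),
                  stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of df2, even if its OldID is missing from df1
df3 <- merge(df2, df1[, c("OldID", "NewID")], by = "OldID", all.x = TRUE)
```

Unlike left_join(), merge() sorts the result by the key column by default, so the row order of df3 may differ from that of df2.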

Upvotes: 0
