Reputation: 783
I have two dataframes
Df1
Col1 Name Col3
ALL1 10 OP
ALL2 20 IO
ALL2 20 YU
ALL2 20 RT
ALL3 20 YU
ALL3 90 YU
ALL4 98 IU
DF2
Col1bis Col2
ALL1 OKI
ALL2 OP
ALL3 I
ALL4 LOP
The expected result is:
DF2
Col1bis Col2 Name
ALL1 OKI 10
ALL2 OP 20
ALL3 I 90
ALL4 LOP 98
and I would like to add the column Name
to the df2 to which correspond to the same pattern in Col1 and Col1bis.
Upvotes: 0
Views: 77
Reputation: 4169
What you need to do is merge or "join" your datasets, here are three of the most common approaches:l. First here's the data I test this with.
# Dummy data
df1 <- data.frame(
V1 = letters[1:10],
V2 = rnorm(10, 0, 1))
df2 <- data.frame(
V1 = letters[c(1,4,6)],
V3 = rnorm(3, 0, 1))
Approach one is base R using the merge()
function which is given the two data frames and a variable by which it should join the data.
# Base R
merge(df1, df2, by = "V1")
The next is using the tidyverse packages and a pipe, %>%
, select df1 first and then using the right_join()
function, one of many different join functions in those packages.
# Tidyverse
library(tidyverse)
df1 %>% right_join(df2, by = "V1")
The final option is the one I'd use, from the data.table package. The joining variable(s) here is given by on = ...
.
# Data.table
library(data.table)
# - convert to data.tables
setDT(df1)
setDT(df2)
# - join
df1[df2, on = "V1"]
Note that if you built/read your data in as a data.table (using data.table()
and fread()
respectively) you could skip the convert step.
Upvotes: 1
Reputation: 887951
We can use a join
library(data.table)
setDT(DF2)[DF1, Name := Name, on = .(Col1bis = Col1)]
DF2
# Col1bis Col2 Name
#1: ALL1 OKI 10
#2: ALL2 OP 20
#3: ALL3 I 90
#4: ALL4 LOP 98
Or in base R
with match
DF2$Name <- DF1$Name[match(DF2$Col1bis, DF1$Col1)]
DF1 <- structure(list(Col1 = c("ALL1", "ALL2", "ALL2", "ALL2", "ALL3",
"ALL3", "ALL4"), Name = c(10L, 20L, 20L, 20L, 20L, 90L, 98L),
Col3 = c("OP", "IO", "YU", "RT", "YU", "YU", "IU")), class = "data.frame", row.names = c(NA,
-7L))
DF2 <- structure(list(Col1bis = c("ALL1", "ALL2", "ALL3", "ALL4"), Col2 = c("OKI",
"OP", "I", "LOP")), class = "data.frame", row.names = c(NA, -4L
))
Upvotes: 1