Reputation: 135
I've 2 dataframes.
Dataframe A with millions of values like this:
ID_CLI | CHURN |
---|---|
34 | 1 |
16 | 1 |
19 | 1 |
23 | 1 |
45 | 1 |
Dataframe B with thousands of values like this:
ID_CLI | CHURN |
---|---|
23 | 0 |
45 | 0 |
19 | 0 |
and I want this type of output:
ID_CLI | CHURN |
---|---|
34 | 1 |
16 | 1 |
19 | 0 |
23 | 0 |
45 | 0 |
so I want to update dataframe A taking the value from dataframe B (in R).
Upvotes: 2
Views: 53
Reputation: 389275
You can do a full_join
and use coalesce
.
library(dplyr)
res <- full_join(df1, df2, by = 'ID_CLI') %>%
transmute(ID_CLI,
CHURN = coalesce(CHURN.y, CHURN.x))
# ID_CLI CHURN
#1 34 1
#2 16 1
#3 19 0
#4 23 0
#5 45 0
In base R -
res <- transform(merge(df1, df2, by = 'ID_CLI', all = TRUE),
CHURN = ifelse(is.na(CHURN.y), CHURN.x, CHURN.y))[names(df1)]
Upvotes: 2
Reputation: 21938
A merely tidyverse
solution:
library(dplyr)
df1 %>%
left_join(df2, by = "ID_CLI") %>%
mutate(across(ends_with(".x"), ~ ifelse(!is.na(get(sub(".x", ".y", cur_column()))),
get(sub(".x", ".y", cur_column())), .x))) %>%
select(!ends_with(".y")) %>%
rename_with(~ sub(".x", "", .), ends_with(".x"))
ID_CLI CHURN
1 34 1
2 16 1
3 19 0
4 23 0
5 45 0
6 19 0
Upvotes: 2
Reputation: 887901
If we have a big dataset, data.table
, join update would be fast i.e. join on
by the 'ID_CLI' column and assign (:=
) the CHURN from second data (i.CHURN
) to the first data column 'CHURN'
library(data.table)
setDT(df1)[df2, CHURN := i.CHURN, on = .(ID_CLI)]
-output
df1
ID_CLI CHURN
1: 34 1
2: 16 1
3: 19 0
4: 23 0
5: 45 0
df1 <- structure(list(ID_CLI = c(34L, 16L, 19L, 23L, 45L), CHURN = c(1L,
1L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA, -5L
))
df2 <- structure(list(ID_CLI = c(23L, 45L, 19L), CHURN = c(0L, 0L, 0L
)), class = "data.frame", row.names = c(NA, -3L))
Upvotes: 3