Reputation: 599
I have a dataframe df1 as below -
ID Count
11 345
22 456
33 786
44 765
55 NA
66 888
77 NA
Then i have a df2 as below -
ID Count
11 536
22 654
33 786
44 999
55 890
66 111
77 654
I want NA
values in df1
to be replaced by values from df2
for those specific id.
The resulting df3
should be:
ID Count
11 345
22 456
33 786
44 765
55 890
66 888
77 654
Any help will be great Thanks
Upvotes: 6
Views: 5952
Reputation: 11
To do that for multiple columns and if NAs happen in all these columns of specific rows, you can simply filter and join dataframes:
missing_id <- df1 %>% filter(is.na(Count)) %>% pull(ID) #Count = one of the columns with missing data
df1 <- df1 %>% filter(!is.na(Count)) %>%
full_join(df2 %>% filter(ID %in% missing_id)) %>%
arrange(ID)
Upvotes: 0
Reputation: 20095
Use of dplyr::coalesce
will make task simpler.
library(dplyr)
df1 %>% inner_join(df2, by= "ID") %>%
mutate(Count = coalesce(Count.x, Count.y)) %>%
select(ID, Count)
# ID Count
# 1 11 345
# 2 22 456
# 3 33 786
# 4 44 765
# 5 55 890
# 6 66 888
# 7 77 654
Data:
df1 <- read.table(text =
"ID Count
11 345
22 456
33 786
44 765
55 NA
66 888
77 NA",
header = TRUE)
df2 <- read.table(text =
"ID Count
11 536
22 654
33 786
44 999
55 890
66 111
77 654",
header = TRUE)
Upvotes: 5
Reputation: 66819
You can use an update join to edit those rows in the first table:
library(data.table)
setDT(DF1); setDT(DF2)
DF1[is.na(Count), Count := DF2[.SD, on=.(ID), x.Count]]
How it works
DF[i, j]
filters by i
then does j
j
, .SD
refers to the subset of data after filtering with i
j
with :=
adds or modifies columns in the tablex[i, on=, x.v]
is as described in this similar answerUpvotes: 2
Reputation: 43169
Using dplyr
:
library(dplyr)
df1 %>%
left_join(df2, by = c("ID")) %>%
mutate(Count = ifelse(is.na(Count.x), Count.y, Count.x)) %>%
select(-c(Count.x, Count.y))
This yields
ID Count
1 11 345
2 22 456
3 33 786
4 44 765
5 55 890
6 66 888
7 77 654
Upvotes: 0