Reputation: 181
I want to merge two tables and only keep similar Task
from two tables. Non-common Task
is removed. If two Task
is similar, I only keep smaller value,
Two tables like this
x<-data.frame("Task"=c("A","B","C","D","E"),"FC"=c(12,NA,15,14,NA),FH=c(13,15,NA,17,20))
Task FC FH
1 A 12 13
2 B NA 15
3 C 15 NA
4 D 14 17
5 E NA 20
y<-data.frame("Task"=c("B","C","F","G"),"FC"=c(NA,12,20,NA),FH=c(NA,17,18,NA))
Task FC FH
1 B NA NA
2 C 12 17
3 F 20 18
4 G NA NA
I want an output like this
Task FC FH
2 B NA 15
3 C 12 17
Upvotes: 1
Views: 321
Reputation: 6639
You can also do:
# Perform a join
merged <- merge(x = x, y = y, by = "Task")
# Get the minimum value out of two comparable columns
merged$FC <- with(merged, pmin(FC.x, FC.y, na.rm = TRUE))
merged$FH <- with(merged, pmin(FH.x, FH.y, na.rm = TRUE))
# Delete the unwanted columns appearing out of merge
merged <- merged[-c(2:5)]
Output:
Task FC FH
1 B NA 15
2 C 12 17
Upvotes: 0
Reputation: 39858
One dplyr
possibility could be:
x %>%
bind_rows(y) %>%
group_by(Task) %>%
filter(n() > 1) %>%
summarise_all(~ ifelse(all(is.na(.)), NA, min(., na.rm = TRUE)))
Task FC FH
<chr> <dbl> <dbl>
1 B NA 15
2 C 12 17
Or if there could be duplicate tasks per single df:
x %>%
bind_rows(y, .id = "ID") %>%
group_by(Task) %>%
filter(n() > 1 & n_distinct(ID) > 1) %>%
summarise_all(~ ifelse(all(is.na(.)), NA, min(., na.rm = TRUE))) %>%
select(-ID)
Upvotes: 5