Reputation: 23
I tried to merge two data.frames like the ones in the code. The main problem is that I have several rows with the same ID, but I would like to merge all the values (phase) with the ID in df1. I tried to search for similar questions but I couldn't manage to find any.
df1<-as.data.frame(c("a","a","a","a","a","c","c","c","b","b"))
colnames(df1)<-c("ID")
df2<-data.frame(c("a","a","a","a","a","b","b"),c(1,1,0,0,1,1,-1))
colnames(df2)<-c("ID","phase")
output<-cbind(c("a","a","a","a","a","c","c","c","b","b"),c(1,1,0,0,1,NA,NA,NA,1,-1))
I tried to use merge() but it resulted in a much bigger data.frame than the expected output. Moreover I also lost the NAs supposed to be merged with "c".
merge_out<-merge(df1,df2[,c("ID","phase")],by="ID")
ID phase
a 1
a 1
a 0
a 0
a 1
a 1
a 1
a 0
a 0
a 1
a 1
a 1
a 0
a 0
a 1
a 1
a 1
a 0
a 0
a 1
a 1
a 1
a 0
a 0
a 1
b 1
b -1
b 1
b -1
Any ideas? Thanks!
Upvotes: 2
Views: 896
Reputation: 39647
Maybe you are looking for pmatch
.
cbind(df1, phase=df2$phase[pmatch(df1$ID, df2$ID)])
#cbind(df1, df2[pmatch(df1$ID, df2$ID), "phase", drop = FALSE]) #Alternative
# ID phase
#1 a 1
#2 a 1
#3 a 0
#4 a 0
#5 a 1
#6 c NA
#7 c NA
#8 c NA
#9 b 1
#10 b -1
Upvotes: 1
Reputation: 11584
Does this work:
library(dplyr)
> df1 %>% group_by(ID) %>% mutate(uid = paste0(row_number(), ID)) %>% left_join(
+ df2%>% group_by(ID) %>% mutate(uid = paste0(row_number(), ID))
+ ) %>% select(-uid)
Joining, by = c("ID", "uid")
# A tibble: 10 x 2
# Groups: ID [3]
ID phase
<chr> <dbl>
1 a 1
2 a 1
3 a 0
4 a 0
5 a 1
6 c NA
7 c NA
8 c NA
9 b 1
10 b -1
Upvotes: 1