Reputation: 101
I am new here and have searched the forum for my problem, but did not find a solution. I have two data frames which I want to merge on a common key field.
merge(x
,y
,by.x="a"
,by.y="b"
,all.x=T
,sort=F
)
Since my x-Dataframe has more rows than my y-Dataframe, I want to keep all rows from x with NA for the column from y but with all values in the columns from x. This code is only giving me extra rows for the unmatched cases with NA in ALL columns (columns from x and y). I would be really grateful if someone could help me out? Where is my mistake?
Example:
a = data.frame(c(111,222,333,444),c(1,5,3,8))
b = data.frame(c(111,222),c(0.1,0.4))
colnames(a)=c("code","value")
colnames(b)=c("code","value")
c = merge(a
,b
,by="code"
,all.x=T)
In this example it is working properly. In my data I obtain NA in all columns in row 3&4.
I hope you can understand my lousy example?!
Thank you! Jessica ;)
Upvotes: 10
Views: 19558
Reputation: 1027
So curiously if you re-run the exact same code from @nograpes but you have the same column name in x
and y
, then you get NAs like you had in your example. See below
x <- data.frame(val1=c(2,8,6,3), a=c('h','k','b','e'))
y <- data.frame(val2=c(4,1), a=c('h','e'))
merge(x, y, by = 'a', all=TRUE)
Result from the console is
a val1 val2
b 6 NA
e 3 1
h 2 4
k 8 NA
Upvotes: 0
Reputation: 18323
Just set all=TRUE
.
# Create your data
x<-data.frame(val1=c(2,8,6,3),a=c('h','k','b','e'))
y<-data.frame(val2=c(4,1),b=c('h','e'))
# Outer join
merge(x,y,by.x='a',by.y='b',all=TRUE)
# a val1 val2
# 1 b 6 NA
# 2 e 3 1
# 3 h 2 4
# 4 k 8 NA
Upvotes: 2