Alyssa
Alyssa

Reputation: 1

Merge Function in R

I am using merge() function in R. I want to merge two dataframes. However, I am interested in the observations that don't merge. Is there a way to produce output that shows the observations that did not successfully merge?

Example: DataframeA=1138888 obs DataframeB=1010651 obs I want to see which obs were not included in DataframeB that were included in DataFrameA.

I have read through the merge() documentation and saw all.x, all.y commands. However, I really want a new dataframe that represents those observations that would NOT merge.

Thank you an advance for any help.

Upvotes: 0

Views: 762

Answers (1)

Rentrop
Rentrop

Reputation: 21497

Dont know how computationally effectiv the following solution is but it works with subset()

df.a<-data.frame(key=sample(1:100,50),"a")
df.b<-data.frame(key=sample(1:100,50),"b")
db.bNOTa<-subset(df.b,!(df.b$key %in% df.a$key))

Edit: Another solution using match would be:

db.bNOTa<-merge(df.b,df.a,all.x=TRUE)
db.bNOTa<-subset(db.bNOTa,!is.na(db.bNOTa[,3]))

In MySQL this would be a Left Join like Pointed out here: SQL JOIN To Find Records That Don't Have a Matching Record With a Specific Value

Upvotes: 1

Related Questions