Maheetha Bharadwaj
Maheetha Bharadwaj

Reputation: 53

Merge Function in R based on Two Columns?

I have two data frames like such.

Table1
Sample Name Position A B C 
1      J    23       x x x
1      K    24       y y y
1      L    25       z z z


Table2
Sample Name Position A B C 
2      M    33       x x x
2      N    34       y y y
2      O    35       z z z

I want the final table to look like this.

 FinalTable1
Sample Name Position A B C Sample Name Position A B C
1      J    23       x x x 2      J    23       Na Na Na
1      K    24       y y y 2      K    24       Na Na Na
1      L    25       z z z 2      L    25       Na Na Na 
Sample Name Position A B C Sample Name Position A B C
1      M    33       Na Na Na 2      M    33       0 0 0 
1      N    34       Na Na Na 2      N    34       0 0 0
1      O    35       Na Na Na 2      O    35       0 0 0

Basically, I want to merge two data frames, so the union of both data frames in terms of Names and Positions, is in the final table, and that the final data frame fills in nonexisting positions with Na, in samples that don't have those genes/positions, but matches samples that do (so if both table1 and table2 had a P and data for it, then they would be written side by side). Eventually, I want to merge 8 data tables like this. I've been trying to play with merge on R, but I can't seem to get it to work. Any guidance would be great.

Thanks!

Upvotes: 1

Views: 176

Answers (1)

akrun
akrun

Reputation: 887901

You could try

cbind(
   merge(tbl1, tbl2, by=c('Name', 'Position'), all.x=TRUE),
   merge(tbl1, tbl2, by=c('Name', 'Position'), all.y=TRUE))

If there are multiple datasets to be merged, you can put them in a list and then use Reduce along with merge. For example

lst <- list(tbl1, tbl2)
res1 <- Reduce(function(...) merge(..., by=c('Name', 'Position'), 
              all.x=TRUE), lst)

res2 <- Reduce(function(...) merge(..., by=c('Name', 'Position'), 
              all.y=TRUE), lst)
cbind(res1,res2)

Upvotes: 2

Related Questions