Programming Noob
Programming Noob

Reputation: 1332

Combine dataframes only by mutual rownames

I want to combine about 20 dataframes, with different lengths of rows and columns, only by the mutual rownames. Any rows that are not shared for ALL dataframes are deleted. So for example on two dataframes:

     Patient1   Patient64   Patient472
ABC     28          38          0
XYZ     92          11          998
WWE      1          10          282
ICQ      0          76          56
SQL     22          1002        778

combine with

     Pat_9   Pat_1   Pat_111
ABC    65      44       874
CBA    3       311      998
WWE    2       1110     282
vVv    2       760      56
GHG    12      1200     778

The result would be

   Patient1 Patient64 Patient472  Pat_9  Pat_1  Pat_111
ABC    28      38        0         65     44      874
WWE    1       10       282         2     1110    282

I know how to use rbind and cbind but not for the purpose of joining according to shared rownames.

Upvotes: 0

Views: 79

Answers (3)

Roman
Roman

Reputation: 17668

you can try

merge(d1, d2, by = "row.names")
  Row.names Patient1 Patient64 Patient472 Pat_9 Pat_1 Pat_111
1       ABC       28        38          0    65    44     874
2       WWE        1        10        282     2  1110     282

for more than two you can use a tidyverse

library(tidyverse)
lst(d1, d2, d2) %>%
  map(rownames_to_column) %>% 
  reduce(inner_join, by="rowname")

Upvotes: 1

Quinten
Quinten

Reputation: 41563

You can first turn your rownames_to_column and use a inner_join and at last convert column_to_rownames back like this:

df1 <- read.table(text="     Patient1   Patient64   Patient472
ABC     28          38          0
XYZ     92          11          998
WWE      1          10          282
ICQ      0          76          56
SQL     22          1002        778", header = TRUE)

df2 <- read.table(text = "     Pat_9   Pat_1   Pat_111
ABC    65      44       874
CBA    3       311      998
WWE    2       1110     282
vVv    2       760      56
GHG    12      1200     778", header = TRUE)

library(dplyr) 
library(tibble)
df1 %>%
  rownames_to_column() %>%
  inner_join(df2 %>% rownames_to_column(), by = "rowname") %>%
  column_to_rownames()
#>     Patient1 Patient64 Patient472 Pat_9 Pat_1 Pat_111
#> ABC       28        38          0    65    44     874
#> WWE        1        10        282     2  1110     282

Created on 2022-07-20 by the reprex package (v2.0.1)

Option with list of dataframes:

dfs_list <- list(df1, df2)
transform(Reduce(merge, lapply(dfs_list, function(x) data.frame(x, rn = row.names(x)))), row.names=rn, rn=NULL)
#>     Patient1 Patient64 Patient472 Pat_9 Pat_1 Pat_111
#> ABC       28        38          0    65    44     874
#> WWE        1        10        282     2  1110     282

Created on 2022-07-20 by the reprex package (v2.0.1)

Upvotes: 1

Mohamed Desouky
Mohamed Desouky

Reputation: 4425

Try this considering change list arguments to df1 , df2 , df3 , ... , df20 your data.frames

l <- lapply(list(df1 , df2 ) , \(x) {x[["id"]] <- rownames(x) ; x})

Reduce(\(x,y) merge(x,y , by = "id") , l)

Upvotes: 2

Related Questions