Haribo

Reputation: 2226

Merge a list of data frames in R

I have a list of data frames in R like this:

w = list(structure(list(var = structure(c(1L, 1L, 2L, 3L), .Label = c("A", 
"B", "C"), class = "factor"), val = 1:4), class = "data.frame", row.names = c(NA, 
-4L)), structure(list(var = structure(c(1L, 2L, 3L, 1L), .Label = c("A", 
"B", "C"), class = "factor"), val = 101:104), class = "data.frame", row.names = c(NA, 
-4L)))

I would like to merge these data frames by var. I tried:

Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = "var", all.x = TRUE), w)

  var val.x val.y
1   A     1   101
2   A     1   104
3   A     2   101
4   A     2   104
5   B     3   102
6   C     4   103

But this is not what I'm looking for. I would like the result to be:

 var val val.x
  A   1   101
  A   2   104
  B   3   102
  C   4   103

Upvotes: 1

Views: 296

Answers (2)

s_baldur

Reputation: 33498

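You are implicitly joining by the row position within each var group. merge() does not know about that position, so it pairs every "A" row in one data frame with every "A" row in the other. It is easier to make the position an explicit variable and join on it as well.

An easy way to create that variable is data.table::rowid():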

w <- lapply(w, function(x) {x$id <- data.table::rowid(x$var); x})
Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = c("var", "id"), all.x = TRUE), w)
  var id val.x val.y
1   A  1     1   101
2   A  2     2   104
3   B  1     3   102
4   C  1     4   103
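
To see what the helper column contains, here is a small sketch that computes it for both data frames. It assumes base R only and swaps in ave() for data.table::rowid(), so it is not the exact call used above; the name ids is just for illustration.

```r
# Rebuild the list from the question (same values, written more compactly)
w <- list(
  data.frame(var = factor(c("A", "A", "B", "C")), val = 1:4),
  data.frame(var = factor(c("A", "B", "C", "A")), val = 101:104)
)

# For each data frame, number the rows within each level of var.
# ave() applies seq_along() separately to the rows of every group.
ids <- lapply(w, function(x) ave(seq_along(x$var), x$var, FUN = seq_along))

ids[[1]]  # 1 2 1 1  (the second "A" row gets id 2)
ids[[2]]  # 1 1 1 2  (the "A" in the last row gets id 2)
```

Joining on both var and this id is what pairs the first "A" with the first "A" and the second with the second.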

Upvotes: 1

Ronak Shah

Reputation: 388817

You can create a separate id column in each data frame of the list and then merge them together.

Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = c("var", "id"), all.x = TRUE),
   lapply(w, function(x) transform(x, id = ave(val, var, FUN = seq_along))))


#  var id val.x val.y
#1   A  1     1   101
#2   A  2     2   104
#3   B  1     3   102
#4   C  1     4   103
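
If the id column is not wanted in the final result (the desired output in the question does not have one), the same steps could be wrapped in a small function that drops it after merging. This is only a sketch: merge_by_position and its key argument are hypothetical names, not part of any package, and it has only been thought through for the two-data-frame case shown here.

```r
# Hypothetical helper: merge a list of data frames on a key column plus the
# row position within each key value, then drop the helper column again.
merge_by_position <- function(dfs, key = "var") {
  with_id <- lapply(dfs, function(x) {
    # Number the rows within each level of the key: 1, 2, ... per group
    x$id <- ave(seq_len(nrow(x)), x[[key]], FUN = seq_along)
    x
  })
  merged <- Reduce(
    function(a, b) merge(a, b, by = c(key, "id"), all.x = TRUE),
    with_id
  )
  merged$id <- NULL  # the id was only needed for matching rows
  merged
}

# Same data as in the question
w <- list(
  data.frame(var = factor(c("A", "A", "B", "C")), val = 1:4),
  data.frame(var = factor(c("A", "B", "C", "A")), val = 101:104)
)

res <- merge_by_position(w)
res
#   var val.x val.y
# 1   A     1   101
# 2   A     2   104
# 3   B     3   102
# 4   C     4   103
```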

In the tidyverse, the same logic can be applied using:

library(dplyr)
library(purrr)

map(w, ~.x %>% group_by(var) %>% mutate(id = row_number())) %>%
    reduce(left_join, by = c("var", "id"))

Upvotes: 1
