Denis
Denis

Reputation: 12077

How do I append columns from a list of data frames?

Suppose I have the following list of dataframes:

lst <- list(
a=data.frame(key=c(1,2,3), val=c(10,20,30)), 
b=data.frame(key=c(1,2,3), val=c(100,200,300)), 
c=data.frame(key=c(1,2,3), val=c(1000,2000,3000)), 
d=data.frame(key=c(1,2), val=c(10000,20000)))

How do I create a data.frame as follows? (or similar):

  key val.a val.b val.c val.d
1   1    10   100  1000 10000
2   2    20   200  2000 20000
3   3    30   300  3000 NA

I tried to do it this way (but FAILED):

Browse[2]> Reduce(function(x,y) merge(x, y, by = 'key', all.x = T), lst)
  key val.x val.y val.x val.y
1   1    10   100  1000 10000
2   2    20   200  2000 20000
3   3    30   300  3000 NA
Warning message:
In merge.data.frame(x, y, by = "key", all.x = T) :
  column names ‘val.x’, ‘val.y’ are duplicated in the result

NOTE: I would prefer a base-R solution but am interested in other ways of doing this

Upvotes: 4

Views: 77

Answers (4)

Daniel O
Daniel O

Reputation: 4358

A more robust solution in Base R with edge cases shown. This solution searches for every unique key across the entire list and fills the tables with NA's where appropriate.

lst <- list(
  a=data.frame(key=c(1,2,3), val=c(10,20,30)), 
  b=data.frame(key=c(1,2,3), val=c(100,200,300)), 
  c=data.frame(key=c(1,2,3,4), val=c(1000,2000,3000,4000)), 
  d=data.frame(key=c(1,3), val=c(10000,30000)))

df <- data.frame(key = unique(unlist(sapply(1:length(lst), function(x) lst[[x]]$key))))
df[2:(length(lst)+1)] <- NA

for(i in 1:length(lst)){df[,(i+1)][sapply(lst[[i]]$key,match,df$key)] <- lst[[i]]$val}

colnames(df)[2:NCOL(df)] <- paste0("val.",names(lst))

Output

> df
  key val.a val.b val.c val.d
1   1    10   100  1000 10000
2   2    20   200  2000    NA
3   3    30   300  3000 30000
4   4    NA    NA  4000    NA

First attempt in Base R

  df <- data.frame(key = c(1,2,3))
  df <- cbind(df,sapply(1:length(lst), function(x) lst[[x]]$val))

and to auto name your columns

colnames(df)[2:NCOL(df)] <- paste0("val.",names(lst))

Upvotes: 1

akrun
akrun

Reputation: 886968

We can use inner_join with reduce after renaming the second column with the corresponding list name

library(purrr)
library(dplyr)
library(stringr)
imap(lst, ~ { nm <- .y
              .x %>% 
                rename_at(vars(val), ~ str_c(., ".", nm))}) %>% 
    reduce(full_join, by = 'key')
#    key val.a val.b val.c val.d
#1   1    10   100  1000 10000
#2   2    20   200  2000 20000
#3   3    30   300  3000 30000

Or in base R, we use Map to do the renaming of the column, then as in the OP's post, do the merge within Reduce

Reduce(function(...) merge(..., by = 'key', all = TRUE), 
       Map(function(x, y) setNames(x, c('key',
           paste0(names(x)[-1], ".", y))), lst, names(lst)))
#  key val.a val.b val.c val.d
#1   1    10   100  1000 10000
#2   2    20   200  2000 20000
#3   3    30   300  3000 30000

Upvotes: 4

massisenergy
massisenergy

Reputation: 1820

If lst is the starting list, then:

library(dplyr);

#binding columns and then dropping the extra columns
df <- bind_cols(lst$a, lst$b, lst$c, lst$d)  %>% 
      select(-c(3,5,7)); 

#Use whatever name you want for the columns:
colnames(df)[c(2:5)] <- c("valLis1","valLis2","valLis3","valLis4");

df;
  key valLis1 valLis2 valLis3 valLis4
1   1      10     100    1000   10000
2   2      20     200    2000   20000
3   3      30     300    3000   30000

Upvotes: 1

Luca Pontiggia
Luca Pontiggia

Reputation: 187

If you are not too fussed about the column names etc looking exactly they way you have it, you can use the following:

df_merged = do.call(cbind,lst) %>% 
  select(key = a.key, 2,4,6,8) 

with output:

   key a.val b.val c.val d.val
1     1    10   100  1000 10000
2     2    20   200  2000 20000
3     3    30   300  3000 30000

Upvotes: 2

Related Questions