Hakki
Hakki

Reputation: 1472

R list of data frames to data frame

I have real trouble understanding transformations between lists and data frame. I will lay out simple example what I'm trying to achieve. I have list of data frames, which are snapshots of points in time.

my_list <- list(data.frame(a = 1:5, b = c("a", "b", "c", "d", "f"), c= c("2015-01-01", "2015-01-01", "2015-01-01", "2015-01-01", "2015-01-01")), 
                data.frame(a = 1:5, b = c("a", "b", "c", "d", "f"), c= c("2015-02-01", "2015-02-01", "2015-02-01", "2015-02-01", "2015-02-01")))

I would like to transform this to following data frame.

           a b c d f
2015-01-01 1 2 3 4 5
2015-02-01 1 2 3 4 5

Would appreciate, if there is solution to reverse this as well, so go from data frame back to original list.

Upvotes: 1

Views: 240

Answers (1)

Uwe
Uwe

Reputation: 42544

You can try dcast().

But first, the list of data frames needs to be combined into one using rbindlist() from package data.table.

library(data.table)
dt_long <- rbindlist(my_list)
dt_long
#    a b          c
# 1: 1 a 2015-01-01
# 2: 2 b 2015-01-01
# 3: 3 c 2015-01-01
# 4: 4 d 2015-01-01
# 5: 5 f 2015-01-01
# 6: 1 a 2015-02-01
# 7: 2 b 2015-02-01
# 8: 3 c 2015-02-01
# 9: 4 d 2015-02-01
#10: 5 f 2015-02-01

dt_wide <- dcast(dt_long, c ~ b, value.var = "a")
dt_wide
#            c a b c d f
#1: 2015-01-01 1 2 3 4 5
#2: 2015-02-01 1 2 3 4 5

The reverse operation is melt():

melt(dt_wide, id.vars = "c", variable.name = "b", value.name = "a")
#             c b a
# 1: 2015-01-01 a 1
# 2: 2015-02-01 a 1
# 3: 2015-01-01 b 2
# 4: 2015-02-01 b 2
# 5: 2015-01-01 c 3
# 6: 2015-02-01 c 3
# 7: 2015-01-01 d 4
# 8: 2015-02-01 d 4
# 9: 2015-01-01 f 5
#10: 2015-02-01 f 5

Although still combined in one data.table, the column and row ordered can be adjusted as follows:

setcolorder(dt_long2, letters[1:3])
dt_long2[order(c, a)]
#    a b          c
# 1: 1 a 2015-01-01
# 2: 2 b 2015-01-01
# 3: 3 c 2015-01-01
# 4: 4 d 2015-01-01
# 5: 5 f 2015-01-01
# 6: 1 a 2015-02-01
# 7: 2 b 2015-02-01
# 8: 3 c 2015-02-01
# 9: 4 d 2015-02-01
#10: 5 f 2015-02-01

To complete the reverse operation, the large data.table can be split up in a list of smaller ones by:

lapply(unique(dt_long2$c), function(x) dt_long2[c == x])
#[[1]]
#   a b          c
#1: 1 a 2015-01-01
#2: 2 b 2015-01-01
#3: 3 c 2015-01-01
#4: 4 d 2015-01-01
#5: 5 f 2015-01-01
#
#[[2]]
#   a b          c
#1: 1 a 2015-02-01
#2: 2 b 2015-02-01
#3: 3 c 2015-02-01
#4: 4 d 2015-02-01
#5: 5 f 2015-02-01

Here, it is assumed that c is the discriminating variable.

Upvotes: 2

Related Questions