Vestlink

Reputation: 65

Convert list of numeric vectors into data frame and write number in sequence

How do I convert a list of numeric matrices into a single data frame, retaining the list names, and then add a column with the sequence number of each repeated name?

str(data) gives me this:

List of 230
 $ data_1  : num [1:19, 1:2] 0.0204 0.0516 0.0924 0.1424 0.2044 ...
 $ data_14 : num [1:19, 1:2] 0.006 0.0144 0.0272 0.0456 0.0712 ...
 $ data_2  : num [1:19, 1:2] 0.0292 0.0736 0.1316 0.202 0.286 ...
 $ data_27 : num [1:19, 1:2] 0.0056 0.0136 0.024 0.0384 0.0572 ...
 $ data_46 : num [1:19, 1:2] 0.0164 0.0408 0.0716 0.11 0.1588 ...
 $ data_510: num [1:19, 1:2] 0.0128 0.034 0.0652 0.1112 0.1756 ...
 $ data_13  : num [1:19, 1:2] 0.0064 0.0136 0.022 0.0332 0.046 ...
 $ data_19  : num [1:19, 1:2] 0.0036 0.0096 0.0224 0.0444 0.0776 ...
 $ data_080: num [1:19, 1:2] 0.0056 0.0132 0.0228 0.0356 0.052 ...
 $ data_15 : num [1:19, 1:2] 0.0028 0.0068 0.0116 0.0172 0.0244 ...
 $ data_18 : num [1:19, 1:2] 0.0008 0.0012 0.0024 0.0032 0.0044 0.0064 0.0096 0.014 0.02 0.0268 ...
 $ data_3  : num [1:19, 1:2] 0.0124 0.0308 0.0576 0.0932 0.1384 ...
 $ data_33 : num [1:19, 1:2] 0.0036 0.0084 0.016 0.0252 0.0372 ...
 $ data_500: num [1:19, 1:2] 0.004 0.0096 0.0196 0.0372 0.0648 ...
 $ data_015 : num [1:19, 1:2] 0.0072 0.0172 0.03 0.0456 0.064 ...
 $ data_02  : num [1:19, 1:2] 0.0132 0.0296 0.0484 0.0696 0.0936 ...
 $ data_04  : num [1:19, 1:2] 0.0072 0.0192 0.038 0.0692 0.1132 ...
 $ data_37  : num [1:19, 1:2] 0.0056 0.014 0.0252 0.0388 0.0552 ...
 $ data_4   : num [1:19, 1:2] 0.0072 0.0188 0.0352 0.056 0.0812 ...
 $ data_550 : num [1:19, 1:2] 0.004 0.0104 0.02 0.032 0.048 ...

... the list is repeated from 2 to 30 times

What I am looking for is something like this:

ID  Area    Size    Interval
data_1  0.0204  0.1     1
data_1  0.0516  0.15    1
data_1  0.0924  0.2     1
data_1  0.1424  0.25    1
data_14 0.006   0.1     1
data_14 0.0144  0.15    1
data_14 0.0272  0.2     1
data_14 0.0456  0.25    1
data_1  0.0204  0.1     1
data_1  0.0516  0.15    1
data_1  0.0924  0.2     1
data_1  0.1424  0.25    1
data_14 0.006   0.1     1
data_14 0.0144  0.15    1
data_14 0.0272  0.2     1
data_14 0.0456  0.25    1
data_1  0.0254  0.1     2
data_1  0.0566  0.15    2
data_1  0.0974  0.2     2
data_1  0.1474  0.25    2
data_14 0.011   0.1     2
data_14 0.0194  0.15    2
data_14 0.0322  0.2     2
data_14 0.0506  0.25    2
data_1  0.0254  0.1     2
data_1  0.0566  0.15    2
data_1  0.0974  0.2     2
data_1  0.1474  0.25    2
data_14 0.011   0.1     2
data_14 0.0194  0.15    2
data_14 0.0322  0.2     2
data_14 0.0506  0.25    2

I have tried lapply(data, data.frame) and do.call(rbind.data.frame, data), but neither works quite the way I want it to.

Upvotes: 1

Views: 977

Answers (1)

akrun

Reputation: 887088

We can use data.table. Loop over the list, convert each element to a data.frame, and use rbindlist to bind the data.frames vertically (the option idcol=TRUE ensures that a separate '.id' column is created from the names of the list). We can then use rle from base R along with ave to create a 'Seq' column that numbers the repeated '.id' values that are not adjacent.

library(data.table)
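# Bind the list into one table; idcol=TRUE stores the list names in '.id'.
# rle() collapses runs of '.id', ave() numbers each repeat of a name,
# and inverse.rle() expands those numbers back to one value per row.
rbindlist(lapply(data, as.data.frame), idcol=TRUE)[, Seq := inverse.rle(within.list(rle(.id),
                 values <- ave(values, values, FUN=seq_along)))][]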
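To see what the rle/ave/inverse.rle step does in isolation, here is a minimal base R sketch on a toy vector. The vector ids is a made-up stand-in for the '.id' column, and ave is applied to positions rather than to the values themselves so that the result is integer instead of character.

```r
# Hypothetical stand-in for the '.id' column: two rows per list element,
# with each name occurring twice in the list.
ids <- c("data_1", "data_1", "data_14", "data_14",
         "data_1", "data_1", "data_14", "data_14")

r <- rle(ids)                 # one entry per run of identical ids
# Number the runs within each id: first run of "data_1" is 1, second is 2.
r$values <- ave(seq_along(r$values), r$values, FUN = seq_along)
seq_col <- inverse.rle(r)     # expand back to one value per row

seq_col
```

Each row of the first "data_1" and "data_14" runs gets 1, and each row of their second runs gets 2.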

Or with dplyr, we do the vertical binding with bind_rows and create a grouping variable ('grp') that increments whenever the 'ID' value differs from the one in the previous row.

library(dplyr)
dM1 <- lapply(data, as.data.frame) %>% 
                bind_rows(., .id = "ID") %>%
                mutate(grp = cumsum(ID!= lag(ID, default="999")))

From the data above, we select 'ID' and 'grp', keep the unique rows, group by 'ID', create a sequence column with row_number(), and do a right_join back onto dM1.

dM1 %>%
   select(ID, grp) %>%
   unique() %>%
   group_by(ID) %>%
   mutate(Seq = row_number()) %>%            # n-th run of this ID
   right_join(., dM1, by = c("ID", "grp")) %>%
   select(-grp)

Update

A simpler approach is to compute the sequence grouped by the names of the list ('data'), paste that sequence onto the original names, convert the list of matrices to a list of data.frames with lapply, bind the rows with bind_rows while specifying .id, and then separate the 'ID' column into two.

library(dplyr)
library(tidyr)
names(data) <- paste(names(data), ave(names(data), names(data),
                     FUN= seq_along), sep=",")

lapply(data, as.data.frame) %>%
        bind_rows(., .id = "ID") %>%
        separate(ID, into = c("ID", "Seq"), sep=",")
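For comparison, the same idea can be sketched in base R only, on a small made-up list. The toy data below is a hypothetical stand-in for the asker's list, and the mapping of matrix column 1 to 'Area' and column 2 to 'Size' is an assumption based on the desired output in the question.

```r
# Toy stand-in for the asker's list: 2-column matrices with repeated names.
data <- list(
  data_1  = cbind(c(0.0204, 0.0516), c(0.10, 0.15)),
  data_14 = cbind(c(0.0060, 0.0144), c(0.10, 0.15)),
  data_1  = cbind(c(0.0254, 0.0566), c(0.10, 0.15)),
  data_14 = cbind(c(0.0110, 0.0194), c(0.10, 0.15))
)

# Occurrence number of each name, in list order
# (1 for the first "data_1", 2 for the second, and so on).
interval <- ave(seq_along(data), names(data), FUN = seq_along)

# Turn each matrix into a data frame carrying its name and occurrence number.
# Assumption: column 1 is Area and column 2 is Size.
pieces <- Map(function(m, id, k) {
  data.frame(ID = id, Area = m[, 1], Size = m[, 2], Interval = k,
             stringsAsFactors = FALSE)
}, data, names(data), interval)

result <- do.call(rbind, unname(pieces))
rownames(result) <- NULL
result
```

Because the occurrence number is computed from the list names before binding, no run detection on the bound rows is needed.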

Upvotes: 4
