Reputation: 5897
I am working with the R programming language. I have a list ("my_list") that looks something like this - each element in the list (e.g. [[i]]) has a different number of subelements (e.g. [[i]][j]) :
> my_list
[[1]]
[1] "subelement1" "subelement2" "subelement3"
[[2]]
[1] "subelement1" "subelement2" "subelement3" "subelement4" "subelement5"
[[3]]
[1] "subelement1" "subelement2" "subelement3" "subelement4" "subelement5"
[[4]]
[1] "subelement1" "subelement2" "subelement3" "subelement4" "subelement5"
> summary(my_list)
Length Class Mode
[1,] 3 -none- character
[2,] 5 -none- character
[3,] 5 -none- character
[4,] 5 -none- character
[5,] 5 -none- character
[6,] 5 -none- character
[7,] 5 -none- character
[8,] 5 -none- character
[9,] 5 -none- character
[10,] 5 -none- character
[11,] 5 -none- character
[12,] 6 -none- character
For each element in this list, I want to extract each of these subelements and combine them into a single data frame (the rows of this data frame will not necessarily have the same number of filled columns). Since I don't know the maximum number of subelements in advance, I tried to find it out - but some parsing is still involved (many entries in the "Length" column are not numbers for some reason?):
summary = summary(my_list)
> summary
Var1 Var2 Freq
1 A Length 3
2 B Length 5
3 C Length 5
4 D Length 5
5 E Length 5
6 F Length 5
7 G Length 5
8 H Length 5
####
96 R3 Length 5
97 S3 Length 5
98 T3 Length 5
99 U3 Length 5
100 V3 Length 5
####
101 A Class -none-
102 B Class -none-
103 C Class -none-
104 D Class -none-
######
296 R3 Mode character
297 S3 Mode character
298 T3 Mode character
299 U3 Mode character
300 V3 Mode character
Next:
summary = data.frame(summary)
freq = as.numeric(gsub("([0-9]+).*$", "\\1", summary$Freq))
freq = freq[!is.na(freq)]
> max(freq)
[1] 6
With this very "roundabout way" - I now know there are at most 6 subelements, and I can create 6 corresponding columns:
col1 = sapply(my_list,function(x) x[1])
col2 = sapply(my_list,function(x) x[2])
col3 = sapply(my_list,function(x) x[3])
col4 = sapply(my_list,function(x) x[4])
col5 = sapply(my_list,function(x) x[5])
col6 = sapply(my_list,function(x) x[6])
#final answer : desired output
final_data = data.frame(col1, col2, col3, col4, col5, col6)
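The column-by-column approach above works because indexing a vector past its end returns NA instead of raising an error, so shorter elements are padded automatically. A minimal sketch of that behaviour, using a small made-up list (the values here are illustrative, not the real data):

```r
# Toy stand-in for my_list (hypothetical values)
my_list <- list(c("a", "b", "c"), c("a", "b", "c", "d", "e"))

# Position 3 exists in both elements
third <- sapply(my_list, function(x) x[3])

# Position 4 is missing from the first element, so it comes back as NA
fourth <- sapply(my_list, function(x) x[4])

print(third)
print(fourth)
```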
My Question: Would there have been an easier way to find out the maximum number of subelements in this list and then create a data frame with the correct number of columns? I.e. Is there an "automatic" way to create a data frame with the same number of columns as subelements in the list and name these columns accordingly (e.g. col1, col2, col3, etc.)?
Thanks!
Upvotes: 0
Views: 545
Reputation: 4425
Try this
mx <- max(sapply(my_list , length))
df <- do.call(rbind , lapply(my_list , \(x) if(length(x) == mx) x
else c(x , rep(NA , mx - length(x)))))
df <- data.frame(df)
colnames(df) <- paste0("col" , 1:mx)
col1 col2 col3 col4 col5
1 subelement1 subelement2 subelement3 <NA> <NA>
2 subelement1 subelement2 subelement3 subelement4 subelement5
3 subelement1 subelement2 subelement3 subelement4 subelement5
4 subelement1 subelement2 subelement3 subelement4 subelement5
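A possible variant of the same idea, sketched with base R only: `lengths()` returns the length of every list element directly, and the replacement function `length<-` extends a vector to a given length, filling the tail with NA. The two-element list below is a shortened copy of the example from the question, used only for illustration.

```r
# Shortened example list, assumed to have the same shape as my_list
my_list <- list(
  c("subelement1", "subelement2", "subelement3"),
  c("subelement1", "subelement2", "subelement3", "subelement4", "subelement5")
)

# Longest element, without parsing the output of summary()
mx <- max(lengths(my_list))

# Extend every element to length mx; the added positions are NA
padded <- lapply(my_list, `length<-`, mx)

# Stack the padded vectors as rows and name the columns col1, col2, ...
df <- as.data.frame(do.call(rbind, padded))
colnames(df) <- paste0("col", seq_len(mx))
print(df)
```

This keeps the padding and the column naming tied to `mx`, so nothing has to be changed by hand if the longest element grows.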
Upvotes: 1
Reputation: 524
Your solution is functional, so obviously take this with a grain of salt, but it's possible to find the maximum length of a sublist with one loop.
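max_length <- 0
#a for loop is used so the assignment updates max_length itself
#(inside lapply the assignment would only change a local copy)
for (x in my_list) {
if (length(x) > max_length) max_length <- length(x)
}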
> max_length
[1] 6
To make a dataframe with the corresponding columns a similar approach can be used:
#create an empty dataframe to add rows to
df <- data.frame(matrix(ncol = max_length, nrow = 0))
colnames(df) <- sprintf("col%d", seq_len(max_length))
#add rows; indexing past the end of x pads the row with NA
for (x in my_list) {
df[nrow(df) + 1, ] <- x[seq_len(max_length)]
}
See this post regarding sprintf. Since you need to know the maximum row length going in, two loops are necessary: one to find the max length, and one to fill the data frame.
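If the list is long, growing a data frame one row at a time can get slow. One alternative, shown here only as a sketch under the assumption that every element is a character vector, is to preallocate an NA-filled matrix once the maximum length is known and then fill it row by row. The list contents are invented for the example.

```r
# Hypothetical list with elements of length 3, 5 and 6
my_list <- list(
  c("a", "b", "c"),
  c("a", "b", "c", "d", "e"),
  c("a", "b", "c", "d", "e", "f")
)

# First pass: find the longest element
max_length <- max(lengths(my_list))

# Preallocate a character matrix filled with NA, one row per list element
m <- matrix(NA_character_,
            nrow = length(my_list), ncol = max_length,
            dimnames = list(NULL, paste0("col", seq_len(max_length))))

# Second pass: copy each element into the leading columns of its row
for (i in seq_along(my_list)) {
  x <- my_list[[i]]
  m[i, seq_along(x)] <- x
}

df <- as.data.frame(m)
print(df)
```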
Upvotes: 1