Reputation: 1012
I have some data frames that are produced on successive iterations of my code, say 100 iterations. On each iteration I write the data frame to df
which I use to store the incoming frame.
The data frames are
first iteration
V1 V2 V3 V4
5.1 3.5 1.4 0.2
4.9 3.0 1.4 0.2
4.7 3.2 1.3 0.2
4.6 3.1 1.5 0.2
5.0 3.6 1.4 0.2
second iteration
V1 V2 V3 V4
5.1 3.5 1.4 0.2
4.9 3.0 1.4 0.2
4.7 3.2 1.3 0.2
4.6 3.1 1.5 0.2
5.0 3.6 1.4 0.2
third iteration
V1 V2 V3 V4
5.1 3.5 1.4 0.2
4.9 3.0 1.4 0.2
4.7 3.2 1.3 0.2
4.6 3.1 1.5 0.2
5.0 3.6 1.4 0.2
and so on
At the end, I want to have all the data frames in a list so that I can process the list in later operations. How do I do this?
Here is a sample code
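data = list.files(pattern = "\\.csv$")  # escape the dot so only names ending in .csv match
data1 = lapply(data, function(x) read.csv(x, header = TRUE))
files = length(data1)
for(i in seq_len(files))  # 1:length(files) would loop only once, since files is a single number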
{
......
code
......
}
df ## say some df is generated each time
Upvotes: 1
Views: 198
Reputation: 8205
From the comments, I understand you are trying to generate a list of data.frame objects over sequential iterations of some algorithm - each of which produces a new data.frame.
Suppose we have some function f()
which generates a new data.frame, from some source, and perhaps uploads the data.frame before returning it.
f <- function() {
# read a file, do some work, produce a dataframe, etc
df # return the new data.frame()
}
The problem with using append
or something similar to add the new data.frame to the list is that it has a habit of "unrolling" the frame and merging its columns in as separate list elements.
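To make the "unrolling" concrete, here is a small sketch. The frame df and the names unrolled and wrapped are made up for illustration. Because a data.frame is itself a list of columns, append() adds each column as its own element; wrapping the frame in list() first is one way to keep it whole.

```r
# A toy data.frame with two columns (illustrative data only)
df <- data.frame(x = 1:3, y = 4:6)

# append() concatenates with c(), so the columns are added one by one
unrolled <- append(list(), df)
length(unrolled)              # 2: one element per column, not one data.frame

# Wrapping the frame in list() adds it as a single element
wrapped <- append(list(), list(df))
length(wrapped)               # 1
is.data.frame(wrapped[[1]])   # TRUE
```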
Instead, your code needs a structure like this:
output_list <- list() # A list to hold the generated frames
while (more_work_to_do) {
df <- f() #One iteration
output_list[[length(output_list)+1]] <- df
}
# At this point, output_list is a list of the generated data frames
# with all their internal structure preserved.
It's important to use the [[]]
operator for the insert. With single brackets, R issues the "number of items to replace is not a multiple of replacement length" warning and stores only the first column of the frame. The length(output_list)+1
construct simply means "one past the current end of the list" and in effect does an append for you without needing to maintain a separate counter.
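The difference between the two bracket operators can be checked with a short sketch. The names df, with_single and with_double are hypothetical and used only for this comparison.

```r
# A toy data.frame with two columns (illustrative data only)
df <- data.frame(x = 1:3, y = 4:6)

# Single brackets: R tries to fit the frame's two columns into one slot,
# warns about the length mismatch, and keeps only the first column
with_single <- list()
with_single[1] <- df
is.data.frame(with_single[[1]])   # FALSE: it is just the x column

# Double brackets: the whole frame is stored as one list element
with_double <- list()
with_double[[length(with_double) + 1]] <- df
is.data.frame(with_double[[1]])   # TRUE
```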
Here's an example
> f<-function() { data.frame(x=rnorm(5), y=rnorm(5)) }
> output_list <- list()
> for (i in 1:5) output_list[[length(output_list)+1]] <- f()
> length(output_list)
[1] 5
> str(output_list)
List of 5
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] -0.347 0.194 -0.406 -0.384 2.24
..$ y: num [1:5] -0.756 0.3417 -0.7542 0.1612 -0.0494
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] 0.667 -0.186 0.602 -0.239 1.516
..$ y: num [1:5] 0.263 -1.322 0.604 -0.135 -0.339
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] 1.064 -0.365 -1.584 0.163 0.142
..$ y: num [1:5] -0.0782 1.3314 0.0797 -0.4096 0.4819
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] -2.0448 -0.4228 -0.5305 -0.0611 0.4114
..$ y: num [1:5] -0.608 -0.74 -0.196 -0.957 0.653
$ :'data.frame': 5 obs. of 2 variables:
..$ x: num [1:5] 0.582 -1.029 -1.222 1.755 0.259
..$ y: num [1:5] 1.733 0.319 -0.597 -1.814 0.446
> output_list
[[1]]
x y
1 -0.3474823 -0.75595301
2 0.1941049 0.34170577
3 -0.4055180 -0.75424689
4 -0.3838479 0.16122522
5 2.2397387 -0.04936943
[[2]]
x y
1 0.6674517 0.2625242
2 -0.1859460 -1.3219586
3 0.6020241 0.6042548
4 -0.2387514 -0.1345904
5 1.5158875 -0.3392787
[[3]]
x y
1 1.0639814 -0.07823834
2 -0.3645768 1.33144410
3 -1.5839606 0.07973743
4 0.1630311 -0.40957609
5 0.1420562 0.48187377
[[4]]
x y
1 -2.04475082 -0.6083283
2 -0.42280601 -0.7396052
3 -0.53048188 -0.1961052
4 -0.06107144 -0.9571272
5 0.41136718 0.6526753
[[5]]
x y
1 0.5821866 1.7325293
2 -1.0289847 0.3186825
3 -1.2218606 -0.5971967
4 1.7548963 -1.8136810
5 0.2592219 0.4463977
>
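When the number of iterations is known up front, as with the 100 iterations mentioned in the question, lapply() may be a simpler way to get the same list. The sketch below assumes each iteration can be written as a function of its index; make_frame, n_iter, row_counts and combined are hypothetical names, and the random data merely stands in for the real work.

```r
# Hypothetical stand-in for one iteration's work; returns a data.frame
make_frame <- function(i) {
  data.frame(iteration = i, x = rnorm(5), y = rnorm(5))
}

n_iter <- 5

# lapply() calls make_frame() once per index and collects the results in a list
output_list <- lapply(seq_len(n_iter), make_frame)

# The list can then be processed element by element ...
row_counts <- vapply(output_list, nrow, integer(1))

# ... or stacked into a single data.frame when the columns match
combined <- do.call(rbind, output_list)
dim(combined)   # 25 rows, 3 columns
```

This avoids growing the list inside a loop, but the while loop above is still the better fit when the stopping condition is only known as the work proceeds.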
Upvotes: 1