Reputation: 23
I have many large data frames. Using one of the smaller ones as an example:
dim(ch29)
476 4283
I need to split it into smaller pieces (i.e. subset into 241 columns at the most). My problems come afterwards when I want to analyze these smaller subsets.
I do not know how to subset the large data frame into smaller data frames rather than simply a list.
I also want to do all of this in a loop and give the newly created smaller data frames unique names in the loop.
chunk=241
df<-ch29
n<-ceiling(ncol(df)/chunk)
for (i in 1:n) {
xname <- paste("ch29", i, sep="_")
cat("_", xname)
assign(xname, split(df, rep(1:n, each=chunk, length.out=ncol(df))))
}
Upvotes: 1
Views: 603
Reputation: 93761
I'm not exactly sure what you're trying to do or how you want to choose the columns that go in each data frame, but here's an example of one option:
# Fake data
set.seed(100)
ch29 = as.data.frame(replicate(4283, rnorm(476)))
# Number of columns we want in each split data frame
ncols = floor(ncol(ch29)/20)
# Start column for each split data frame
start = seq(1,ncol(ch29),ncols)
# Split ch29 into a bunch of separate data frames
df.list = lapply(setNames(start, paste0("ch29_", start, "_", pmin(start+ncols-1, ncol(ch29)))),
                 function(i) ch29[ , i:min(i+ncols-1,ncol(ch29)), drop=FALSE])
You now have a list, df.list, where each list element is a data frame with ncols columns from ch29, except for the last element of the list, which will have between 1 and ncols columns. Also, the name of each list element is the name of the parent data frame (ch29) and the column range from which the subset data frame is drawn.
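As a rough sketch of how such a list might be used afterwards, the example below rebuilds the same kind of list on a small stand-in data frame (10 rows, 23 columns, 5 columns per piece; these sizes and the names end, piece.dims and first.piece are assumptions made for illustration, not part of the original answer). It shows analysing all pieces with sapply, pulling one piece out by name, and, if separate objects are really wanted, copying the list elements into the global environment with list2env.

```r
# Small stand-in for ch29 (assumption: 10 rows, 23 columns, illustration only)
set.seed(100)
ch29 <- as.data.frame(replicate(23, rnorm(10)))

# Columns per piece, and first/last column index of each piece
ncols <- 5
start <- seq(1, ncol(ch29), by = ncols)
end   <- pmin(start + ncols - 1, ncol(ch29))

# Named list of data frames; drop = FALSE keeps one-column pieces as data frames
df.list <- lapply(
  setNames(seq_along(start), paste0("ch29_", start, "_", end)),
  function(k) ch29[, start[k]:end[k], drop = FALSE]
)

# Analyse every piece without creating separate variables:
# one row per piece, columns are (number of rows, number of columns)
piece.dims <- t(sapply(df.list, dim))

# Pull out a single piece by name
first.piece <- df.list[["ch29_1_5"]]

# If separate objects are really needed, copy each list element
# into the global environment under its list name
list2env(df.list, envir = .GlobalEnv)
```

Keeping the pieces in a list is usually easier to loop over than many separately named objects, which is why the list2env step is shown last and is optional.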
Upvotes: 3
Reputation: 5068
Try
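for (i in seq_len(n)) {  # n = ceiling(ncol(df)/chunk), as defined in the question
  xname = paste("ch29", i, sep = "_")
  # First and last column of the i-th chunk
  col.min = (i - 1) * chunk + 1
  col.max = min(i * chunk, ncol(df))
  # drop = FALSE keeps a one-column chunk as a data frame
  assign(xname, df[, col.min:col.max, drop = FALSE])
}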
In other words, use the notation df[,a:b], where a < b, to get the subset of the data frame df consisting only of columns a to b.
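For comparison, here is a hedged sketch of how the grouping idea from the question could be made to work. Calling split() on a data frame dispatches to split.data.frame, which splits rows rather than columns, and the question's loop assigns the whole result on every iteration. Calling split.default directly treats the data frame as a list of columns instead. The stand-in data (10 rows, 23 columns, chunk of 5) and the names grp and pieces are assumptions for illustration.

```r
# Stand-in data (assumption: 10 rows, 23 columns, illustration only)
set.seed(1)
df <- as.data.frame(replicate(23, rnorm(10)))
chunk <- 5
n <- ceiling(ncol(df) / chunk)

# Group label for each column: 1,1,1,1,1,2,2,2,2,2,...
grp <- rep(seq_len(n), each = chunk, length.out = ncol(df))

# split.default treats the data frame as a list of columns,
# so each piece is a data frame holding the columns of one group
pieces <- split.default(df, grp)
names(pieces) <- paste("ch29", seq_len(n), sep = "_")

# Create one object per piece, named ch29_1, ch29_2, ...
for (nm in names(pieces)) assign(nm, pieces[[nm]])
```

This gives the same column ranges as the df[,a:b] loop, with the last piece holding whatever columns remain.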
Upvotes: 1