Reputation: 113
I have a df with 10,000 columns (SNP frequencies). I need to carry out a simulation (factor analysis) with non-repeating vectors. To do this, I need to run factor analysis on subsets of columns, divided into groups of 10: for example, cols 1:10, 11:20, 21:30. Since specifying these manually would take ages, I need a simple script that does it. I wrote the code below, but it does not seem to work. I cannot figure out how to tell R where to start and stop in each iteration.
ind = seq(1, (ncol(df) - 10), by = 10)
for (i in ind) {
  start = i; end = i + 9
  rez = factanal(df, factors = 1, start:end)
}
Upvotes: 0
Views: 84
Reputation: 11514
Just a small pointer:
groups <- seq(from=1, to=10000, by=10)
This may be useful for splitting up your columns into groups of 10. Then, to each element of groups, you can add 0:9 to get the column indices of that group. See
> 1 + 0:9
[1] 1 2 3 4 5 6 7 8 9 10
This can be used in subsetting your dataframe.
For instance,
for (i in groups) {
  your_function(dat[, i + 0:9])
}
will execute your function on the corresponding columns. Make sure to store the output of the function appropriately. It may be useful to wrap it in an lapply call, as in
lapply(groups, function(x) your_function(dat[, x + 0:9]))
to save the output in a list.
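To make this concrete for the factor analysis in the question, here is a minimal sketch. It assumes simulated data in place of the real SNP table (200 rows, 30 columns, built so that each block of 10 columns shares one latent factor); the names dat, fits and the cols_... labels are only illustrative.

```r
# Simulated stand-in for the real data (an assumption, not the asker's df):
# 3 blocks of 10 columns, each block driven by its own latent factor.
set.seed(1)
n <- 200
dat <- as.data.frame(do.call(cbind, lapply(1:3, function(g) {
  f <- rnorm(n)                                   # latent factor for this block
  sapply(1:10, function(j) 0.7 * f + rnorm(n))    # 10 noisy indicators
})))

# Starting column of each complete group of 10 (here: 1, 11, 21).
groups <- seq(from = 1, to = ncol(dat) - 9, by = 10)

# Fit a one-factor model per block and keep every result in a list.
fits <- lapply(groups, function(x) factanal(dat[, x + 0:9], factors = 1))
names(fits) <- paste0("cols_", groups, "_", groups + 9)

# Loadings of the first block (columns 1 to 10).
fits[[1]]$loadings
```

Using ncol(dat) - 9 as the upper bound keeps the last full group and drops any incomplete trailing group, so the subsetting never runs past the last column.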
While this answers your question, let me add what I would do, since I think it may help you more in the long run: instead of looping over columns, I would melt the dataframe into long format, create a new variable indexing the groups of 10, and then use that variable as the grouping variable with dplyr's group_by() for grouped analysis.
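A rough sketch of the long-format idea, assuming the dplyr and tidyr packages are installed and using tidyr's pivot_longer() in place of melt. The data and the names snp, freq, col_index and group are made up for illustration. Note that factanal() itself expects wide input (one column per variable), so this only shows a grouped summary.

```r
library(dplyr)
library(tidyr)

# Simulated stand-in data (assumption): 5 rows, 30 columns named V1..V30.
set.seed(1)
dat <- as.data.frame(matrix(runif(5 * 30), nrow = 5))

# Reshape to long format and tag each value with its group of 10 columns.
long <- dat %>%
  pivot_longer(cols = everything(), names_to = "snp", values_to = "freq") %>%
  mutate(col_index = match(snp, names(dat)),      # original column position
         group = (col_index - 1) %/% 10 + 1)      # 1 for cols 1:10, 2 for 11:20, ...

# Grouped analysis: here simply the mean frequency per group of columns.
group_summary <- long %>%
  group_by(group) %>%
  summarise(mean_freq = mean(freq), n_values = n())

group_summary
```

Any per-group computation that works on long data can replace the summarise() step.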
Upvotes: 1