blep

Reputation: 726

Iterating over lists stored in data.frame in R

I think this is a beginner question, but I don't appear to have the right vocabulary for an effective Google search.

I have a data.frame, final, which contains a list of clusters, each of which is a list of strings.

I would like to iterate over the list of strings in each cluster: a for loop within a for loop.

for (j in final$clusters){
    for (i in final$clusters$`j`){
        print final$clusters$`j`[i]
    }
}

j corresponds to the lists in clusters, and i corresponds to the items in clusters[j]

I was trying to do this by using the length of each cluster, which I thought would be something like length(final$clusters[1]), but that gives 1, not the length of the list.

Also, final$clusters[1] gives $'1', and on the next line, all the strings in cluster 1.

Thanks.

EDIT: output of dput(str(final)), as requested:

List of 2
 $ clusters     :List of 1629
  ..$ 1   :
  ..$ 2   : 
  ..$ 3   : 
  ..$ 4   : 
  ..$ 5   : 
  ..$ 6   : 
  ..$ 7   : 
  ..$ 8   : 
  ..$ 9   : 
  ..$ 10  : 
  .. [list output truncated]
 $ cluster_stats: num [1:1629, 1:6] 0.7 0.7 0.7 0.7 0.7 0.7 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:1629] "1" "2" "3" "4" ...
  .. ..$ : chr [1:6] "min" "qu1" "median" "mean" ...
NULL

Upvotes: 5

Views: 17115

Answers (2)

agstudy

Reputation: 121568

I think you are confusing a list with a data.frame. I guess that your final object is a list.

To iterate over the list, you can use rapply, which is a recursive version of lapply.

For example:

## Create a reproducible example

cluster1 <- list(a='a',b='b')
cluster2 <- list(c='aaa',d='bbb')
clusters <- list(cluster1,cluster2)
final <- list(clusters)

So using rapply

rapply(final,f=print)
[1] "a"
[1] "b"
[1] "aaa"
[1] "bbb"
    a     b     c     d 
  "a"   "b" "aaa" "bbb" 

Update after edit by OP

Using lapply, I loop over the names of the list. For each name, I extract the element with [[ (you can use [ instead if you want to keep the names as headers in the files), then write it to a file with write.table. Here I use the name of the element in the list to build the file name; in your case the file names will be numbers (1.txt, ...).

    lapply(names(final$clusters),
           function(x)
               write.table(x = final$clusters[[x]],
                           file = paste(x, '.txt', sep = '')))
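
As a rough sketch of the difference between [ and [[ mentioned above (toy data only; the cluster names "1" and "2" and their contents are made up for illustration, not taken from the OP's object), single brackets return a one-element sub-list, while double brackets return the cluster itself. That would explain why length(final$clusters[1]) gives 1:

```r
# Toy stand-in for final$clusters; names and contents are assumptions
final <- list(clusters = list(`1` = list("a", "b", "c"),
                              `2` = list("aaa", "bbb")))

sub_list <- final$clusters[1]    # still a list, wrapping cluster 1
cluster  <- final$clusters[[1]]  # cluster 1 itself

length(sub_list)  # number of clusters selected, not the cluster size
length(cluster)   # number of strings in cluster 1

# lengths() reports the size of every cluster at once
cluster_sizes <- lengths(final$clusters)
```

Here cluster_sizes is a named integer vector, one entry per cluster, which is probably what the OP was after when trying to loop by length.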

Upvotes: 4

newmathwhodis

Reputation: 3289

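I think the primary problem is the way you iterate: for (j in final$clusters) already hands you each cluster itself, not its index, so there is no need to index back into final$clusters. Also, print is a function and needs parentheses.

Something like this should work better:

# Outer loop: each element of final$clusters is one cluster (a list of strings)
for (cluster in final$clusters) {
    # Inner loop: each element of the cluster is one string
    for (item in cluster) {
        print(item)
    }
}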
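
If the positions are needed as well, a minimal sketch with seq_along and [[ might look like the following. The data here is a toy stand-in (names and contents are assumptions), and the seen vector exists only so the result can be inspected afterwards:

```r
# Toy stand-in for final$clusters; names and contents are assumptions
final <- list(clusters = list(`1` = list("a", "b"),
                              `2` = list("aaa", "bbb", "ccc")))

seen <- character(0)  # collects visited strings, for inspection only
for (j in seq_along(final$clusters)) {
    cluster <- final$clusters[[j]]   # [[ extracts the cluster itself
    for (i in seq_along(cluster)) {
        item <- cluster[[i]]         # [[ extracts the string itself
        cat("cluster", j, "item", i, ":", item, "\n")
        seen <- c(seen, item)
    }
}
```

seq_along is used rather than 1:length(x) so that an empty cluster is skipped instead of being looped over with indices 1 and 0.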

Here is the documentation for loops: http://manuals.bioinformatics.ucr.edu/home/programming-in-r#TOC-For-Loop

And for subsetting: http://www.statmethods.net/management/subset.html

Good luck.

Upvotes: 4
