Nick

Reputation: 45

R doubling list length when subsetting

I am trying to build a list of subsets from a data frame column in R, one list element per group. My current attempt looks like this:

list.level <- unique(buckets$group)
bucket.group <- vector("list",length(list.level))

for(i in list.level){
  bucket.group[[i]] <- subset(buckets$group,buckets$group == i)
}

However, instead of filling the list, it seems to append a second set of elements of the same length, returning:

[[1]]
NULL

[[2]]
NULL

...

NULL

[[22]]
NULL

[[23]]
NULL

$A
[1] "A"

$C
[1] "C" "C" "C"

$D
[1] "D" "D" "D"

...

$AJ
[1] "AJ" "AJ" "AJ" "AJ" "AJ"

$AK
[1] "AK" "AK"

A should fill element 1, C element 2, and so on. How do I fill the original elements rather than creating extra elements at the end of the list?

Upvotes: 0

Views: 68

Answers (3)

John Paul

Reputation: 12664

I think the issue is with your for statement.

Your code is like this:

list.level <- letters[1:10]
for(i in list.level) print(i)
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[1] "f"
[1] "g"
[1] "h"
[1] "i"
[1] "j"

The loop assigns each element of list.level to i, so i is a letter. When you run
bucket.group[[i]] <- subset(buckets$group,buckets$group == i) in the first iteration, R looks for a list element named "a" (bucket.group[["a"]]), does not find it, and so creates it and stores the data there. If you use seq_along instead:

for(i in seq_along(list.level)) print(i)
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10

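Now i will always be a number, so bucket.group[[i]] fills the existing positions. Because i is no longer the group value, the comparison inside the loop must use list.level[i] instead of i.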

So use seq_along instead.
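A minimal sketch of the loop rewritten with seq_along. The buckets data frame here is a hypothetical stand-in for the asker's data, which is not shown in full:

```r
# Hypothetical stand-in for the asker's data frame
buckets <- data.frame(group = c("A", "C", "C", "C", "D", "D"),
                      stringsAsFactors = FALSE)

list.level <- unique(buckets$group)
bucket.group <- vector("list", length(list.level))

# i is a position (1, 2, 3), so each assignment fills a preallocated slot
for (i in seq_along(list.level)) {
  bucket.group[[i]] <- subset(buckets$group, buckets$group == list.level[i])
}

length(bucket.group)  # same as length(list.level); nothing is appended
```

Note that the comparison uses list.level[i], since i itself is now an index rather than a group value.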

Upvotes: 1

Manish Goel

Reputation: 893

This should work:

list.level <- unique(buckets$group)
bucket.group <- vector("list",length(list.level))

for(i in seq_along(list.level)){
  bucket.group[[i]] <- subset(buckets$group,buckets$group == list.level[i])
}

Upvotes: 0

user31264

Reputation: 6727

Here is what is going on. Suppose your buckets$group is c("a","a","b","b").

list.level <- unique(buckets$group)

Now list.level is c("a","b")

bucket.group <- vector("list",length(list.level))

Since length(list.level) is 2, your bucket.group is now a list of 2 NULL elements. They have no names and are addressed by position, 1 and 2.

for(i in list.level){

Given the value of list.level, this is the same as for(i in c("a","b")).

     bucket.group[[i]] <- subset(buckets$group,buckets$group == i)

Since i loops over "a" and "b", you fill bucket.group[["a"]] and bucket.group[["b"]], which are new elements appended to the list, while bucket.group[[1]] and bucket.group[[2]] remain NULL.
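The appending behaviour can be checked directly. This is a small sketch using the toy vector from this answer; the variable group stands in for buckets$group:

```r
# Toy data from the example above; `group` stands in for buckets$group
group <- c("a", "a", "b", "b")
list.level <- unique(group)
bucket.group <- vector("list", length(list.level))

length(bucket.group)  # 2 unnamed NULL slots
names(bucket.group)   # NULL, because the list has no names yet

# A character index that matches no existing name appends a new element
bucket.group[["a"]] <- group[group == "a"]

length(bucket.group)  # now 3
names(bucket.group)   # "" "" "a"
```

The two original slots are untouched, which is why the printed list shows the NULL entries first and the named entries after them.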

To fix this, write instead:

list.level <- unique(buckets$group) # ok, this was correct
bucket.group <- list() # start from an empty list
for(i in seq_along(list.level)){ # loop over positions, not values
  bucket.group[[i]] <- buckets$group[buckets$group == list.level[[i]] ]
}
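As an alternative to the loop, not part of the fix above, base R's split() builds the same kind of list in one call. A minimal sketch on the toy vector, where group again stands in for buckets$group:

```r
# Toy data; `group` stands in for buckets$group
group <- c("a", "a", "b", "b")

# split() returns a list with one element per distinct value, named by that value
bucket.group <- split(group, group)

# By default the elements are ordered by sorted factor levels.
# To keep order of first appearance, pass a factor with explicit levels.
bucket.ordered <- split(group, factor(group, levels = unique(group)))

# Drop the names if a purely positional list is wanted
bucket.positional <- unname(bucket.group)
```

With split() the elements can be reached either by name, such as bucket.group[["a"]], or by position.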

Upvotes: 1
