Dolven

Reputation: 39

Building a list of subsets of a data frame according to cell values in R

I have a data frame that I want to cut, and keep each subset as an element of a new list. The cuts are given according to the values of the cells. For instance, if I have:

> df

     X1   X2
1     red  1
2    blue  3
3   green  2
4  pierre 10
5    pink  4
6    blue  3
7   green  2
8    eric 25
9  purple  8
10    red  1
11   anna 30
12   blue  3
13  green  2
14  black  5
15 yellow  6
16  marie 40
17 violet  7 

> df2

      X1    X2  X3
1 pierre  eric  77
2   anna marie 100

I would like to cut df into subsets whose limits are the rows where the value of X1 in df matches the values given in df2: df2's X1 marks the first row of each subset and df2's X2 marks the last. To make it clearer, I want my list to look like this:

> list
[[1]] 
     X1   X2    
4  pierre 10
5    pink  4
6    blue  3
7   green  2
8    eric 25
[[2]]
     X1   X2
11   anna 30
12   blue  3
13  green  2
14  black  5
15 yellow  6
16  marie 40

I tried to do it using a for loop:

> for (i in 1:nrow(df2)){
   list[i]<-list(df[which(df[,"X1"]==df2[i,"X1"]):which(df[,"X1"]==df2[i,"X2"]),])
  }     

But I get the following error message:

Error in list[i] <- list(df[which(df[, "X1"] == df2[i, "X1"]):which(df[,  : 
  object of type 'builtin' is not subsettable

Do you know what is wrong, and/or is there a different way to get the expected result?

Upvotes: 0

Views: 37

Answers (2)

LAP

Reputation: 6685

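Using a for loop works, as long as you create the result list first. Your error comes from the fact that no object called list was ever assigned, so list[i] <- ... tries to subset the built-in function list(), hence "object of type 'builtin' is not subsettable". Preallocate the list under a different name; @zx8754's mapply() approach is more concise and should be more efficient.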

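# Preallocate one list element per row of df2
test <- vector("list", nrow(df2))
for(i in seq_len(nrow(df2))){
  # Row of df where the subset starts (df2's X1) and ends (df2's X2)
  x <- which(df[, "X1"] == df2[i, "X1"])
  y <- which(df[, "X1"] == df2[i, "X2"])
  test[[i]] <- df[x:y,]
}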

> test
[[1]]
      X1 X2
4 pierre 10
5   pink  4
6   blue  3
7  green  2
8   eric 25

[[2]]
       X1 X2
11   anna 30
12   blue  3
13  green  2
14  black  5
15 yellow  6
16  marie 40
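
To run this end to end, here is a minimal sketch that rebuilds the example data from the question and wraps the loop in a function. It assumes X1 is stored as character rather than factor, and that each boundary value occurs exactly once in df$X1. The name cut_between is made up for illustration.

```r
# Example data from the question; stringsAsFactors = FALSE is an assumption
df <- data.frame(
  X1 = c("red", "blue", "green", "pierre", "pink", "blue", "green", "eric",
         "purple", "red", "anna", "blue", "green", "black", "yellow",
         "marie", "violet"),
  X2 = c(1, 3, 2, 10, 4, 3, 2, 25, 8, 1, 30, 3, 2, 5, 6, 40, 7),
  stringsAsFactors = FALSE
)
df2 <- data.frame(X1 = c("pierre", "anna"), X2 = c("eric", "marie"),
                  X3 = c(77, 100), stringsAsFactors = FALSE)

# cut_between() is a hypothetical helper name, not part of base R
cut_between <- function(df, df2) {
  out <- vector("list", nrow(df2))
  for (i in seq_len(nrow(df2))) {
    from <- which(df$X1 == df2$X1[i])
    to   <- which(df$X1 == df2$X2[i])
    # Each boundary must match exactly one row, otherwise from:to is ill-defined
    stopifnot(length(from) == 1, length(to) == 1)
    out[[i]] <- df[from:to, ]
  }
  out
}

test <- cut_between(df, df2)
```

The stopifnot() guard is there because which() returns every matching row: a boundary such as "blue", which appears three times in df, would otherwise produce a warning and a subset you probably did not intend.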

Upvotes: 1

zx8754

Reputation: 56169

Using mapply:

mapply(function(x, y){
  df[ which(df$X1 == x):which(df$X1 == y), ]
  }, x = df2$X1, y = df2$X2, SIMPLIFY = FALSE)
# $pierre
#       X1 X2
# 4 pierre 10
# 5   pink  4
# 6   blue  3
# 7  green  2
# 8   eric 25
# 
# $anna
#        X1 X2
# 11   anna 30
# 12   blue  3
# 13  green  2
# 14  black  5
# 15 yellow  6
# 16  marie 40
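
A possible variant, sketched here on a shortened version of the question's data: look up both boundaries up front with match() and then slice with Map(). Note that match() returns only the first matching row, so this assumes the boundary values are unique in df$X1 (or that the first occurrence is the one you want).

```r
# Shortened copy of the question's data, for illustration only
df <- data.frame(
  X1 = c("red", "pierre", "pink", "eric", "purple",
         "anna", "blue", "marie", "violet"),
  X2 = c(1, 10, 4, 25, 8, 30, 3, 40, 7),
  stringsAsFactors = FALSE
)
df2 <- data.frame(X1 = c("pierre", "anna"), X2 = c("eric", "marie"),
                  stringsAsFactors = FALSE)

# Row positions of the first and last row of each subset
from <- match(df2$X1, df$X1)
to   <- match(df2$X2, df$X1)

# Fail early if a boundary value is missing from df$X1
stopifnot(!anyNA(from), !anyNA(to))

# One data frame per (from, to) pair, named after the starting value
res <- Map(function(a, b) df[a:b, ], from, to)
names(res) <- df2$X1
```

Map() always returns a list, so there is no need for SIMPLIFY = FALSE.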

Upvotes: 1
