LoveMeow

Reputation: 1181

Subsetting by factors in R loop

I am having difficulty subsetting my data by factors in a for loop. Here is an illustrative example:

x<-rnorm(n=40, m=0, sd=1)
y<-rep(1:5, 8)
df<-as.data.frame(cbind(x,y))
df_split<-split(df, df$y)
mean_vect<-rep(-99, 5)

for (i in c(1:5)) {
current_df<-df_split$i
mean_vect[i]<-mean(current_df)
}


This approach is not working because, I think, R is looking for a split called "i" when I really want it to pull out the ith split. I have also tried the subset function with little joy. I always run into these problems when I am trying to split on a non-numeric factor, so any help would be appreciated.

Upvotes: 0

Views: 951

Answers (2)

John

Reputation: 23758

FYI, this is typically done with tapply:

tapply( df$x, df$y, mean )

The first argument is the vector of values you want to average within each group. The second is the INDEX, i.e. the variable that defines your groups, and the last is the function you want to run on each group, in this case mean.
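As a minimal sketch of the same call on a non-numeric grouping variable, which is the case the question says causes trouble: the letter labels and the column name grp are invented for illustration, and set.seed is only there to make the sketch reproducible.

```r
# Hypothetical data: same shape as the question, but the groups are letters
set.seed(1)
x   <- rnorm(n = 40, mean = 0, sd = 1)
grp <- factor(rep(c("a", "b", "c", "d", "e"), times = 8))
df  <- data.frame(x = x, grp = grp)

# One mean of x per level of grp; the result is named by the factor levels
group_means <- tapply(df$x, df$grp, mean)
group_means

# Look up a single group by its label rather than by position
group_means["c"]
```

Because the result carries the level names, no numeric index is needed to get at a particular group.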

Upvotes: 3

Max

Reputation: 4932

To get the ith split, run

df_split[[i]]
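A sketch of the question's loop rewritten with [[ ]], assuming the groups may carry non-numeric labels (the labels and the names grp and g are invented here). The $ operator takes its argument literally, so df_split$i looks for an element named "i" and returns NULL, whereas [[ evaluates its argument first.

```r
# Hypothetical data with letter-labelled groups
set.seed(42)
df <- data.frame(x = rnorm(40), grp = rep(c("a", "b", "c", "d", "e"), 8))
df_split <- split(df, df$grp)

# Pre-allocate a named result, one slot per group
mean_vect <- setNames(rep(NA_real_, length(df_split)), names(df_split))

for (g in names(df_split)) {
  current_df <- df_split[[g]]          # [[ ]] evaluates g; $g would look for "g"
  mean_vect[g] <- mean(current_df$x)   # mean of the column, not of the data frame
}
mean_vect
```

Looping over names(df_split) instead of 1:5 means the same code works whether the factor levels are numbers or strings.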

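BTW, since your final aim is mean_vect, you would be better off using

# take the mean of column x; mean() on a whole data frame returns NA
mean_vect <- sapply(df_split, function(d) mean(d$x))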

or:

mean_vect <- tapply(df$x, df$y, mean)
mean_vect
        1          2          3          4          5 
0.2566810 -0.1528079 -0.2097333 -0.1540343  0.3609312 

Upvotes: 1
