Drashti

Reputation: 105

Loop through one column with unique values, and group by another to calculate the variance

Can someone please help me with this? I have a data frame like the one below:

 Scores    ID     Value
 M1        A       5.67
 M1        A       9.99
 M2        A       10.96
 M1        A       7.89
 M1        B       9.36
 M3        A       4.56
 M2        A       5.55
 M1        B       8.97

In this data set, the scores are cell types and the IDs repeat within each cell type (e.g., M1 has three rows with ID A). I want to loop through each score (cell type) and calculate the variance of the values for each ID within that score. So essentially I am measuring the variability within each ID for each score, not between IDs.

Below is the code I came up with, but every .csv file it writes contains the same output.

for (i in df1$scores) {
  T1 <- aggregate(value ~ ID, df1, function(x) c(Var=var(x), Count=length(x)))
  T1
  write.csv(T1,file=paste0(i,"_withinID.csv"))
} 

Upvotes: 1

Views: 553

Answers (1)

akrun

Reputation: 887213

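Here we need to loop over the unique values of 'scores', and the aggregate should be computed on the subset of the data for the current score. In the original loop, aggregate is called on the whole of df1 in every iteration, so every file ends up with the same result.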

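for (i in unique(df1$scores)) {
  # Aggregate only the rows that belong to the current score
  T1 <- aggregate(value ~ ID, subset(df1, scores == i),
                  function(x) c(Var = var(x), Count = length(x)))
  # One file per score, e.g. M1_withinID.csv
  write.csv(T1, file = paste0(i, "_withinID.csv"))
}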
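
If separate files are not strictly required, the same numbers can probably be obtained without a loop by grouping on both columns at once. The following is only a sketch: it rebuilds df1 from the rows shown in the question and assumes the lowercase column names scores and value used in the code above (the printed table shows Scores and Value, so adjust the names if needed). The name res is just an illustrative choice.

```r
# Example data copied from the question; lowercase column names are an assumption
df1 <- data.frame(
  scores = c("M1", "M1", "M2", "M1", "M1", "M3", "M2", "M1"),
  ID     = c("A", "A", "A", "A", "B", "A", "A", "B"),
  value  = c(5.67, 9.99, 10.96, 7.89, 9.36, 4.56, 5.55, 8.97)
)

# Variance and row count for every (scores, ID) combination in one call
res <- aggregate(value ~ scores + ID, data = df1,
                 FUN = function(x) c(Var = var(x), Count = length(x)))

# aggregate() stores the two statistics as a matrix column;
# this flattens it into ordinary columns value.Var and value.Count
res <- do.call(data.frame, res)

print(res)
```

Note that a group with a single row (M3 / A in the example data) gets NA for the variance, because var() needs at least two observations. If one file per score is still wanted, res could be split with split(res, res$scores) and each piece written with write.csv.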

Upvotes: 1
