Phani

Reputation: 3315

How can I vectorize this task in R?

For a specific task, I have written the following R script:

pred <- c(0.1, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3)
grp <- as.factor(c(1, 1, 2, 2, 1, 1, 1))

cut <- unique(pred)
cut_n <- length(cut)
n <- length(pred)
class_1 <- numeric(cut_n)
class_2 <- numeric(cut_n)
curr_cut <- cut[1]
class_1_c <- 0
class_2_c <- 0
j <- 1
for (i in 1:n) {
    if (curr_cut != pred[i]) {
        j <- j + 1
        curr_cut <- pred[i]
    }
    if (grp[i] == levels(grp)[1])
        class_1_c <- class_1_c + 1
    else
        class_2_c <- class_2_c + 1
    class_1[j] <- class_1_c
    class_2[j] <- class_2_c
}
cat("index:", cut, "\n")
cat("class1:", class_1, "\n")
cat("class2:", class_2, "\n")

My goal is to compute, for each unique value in pred, the cumulative number of times each level of grp has appeared up to and including that value. For example, the script above produces the following output:

index: 0.1 0.2 0.3 
class1: 2 3 5 
class2: 1 2 2 

I am a beginner in R and I have a few questions about this:

  1. How can I make this code faster and simpler?
  2. Is it possible to vectorize this and avoid the for loop?
  3. Is there a different "R-esque" way of doing this?

Any help would be greatly appreciated. Thanks!

Upvotes: 0

Views: 67

Answers (1)

MrFlick

Reputation: 206253

You can start by getting the counts for each grp/pred combination using a table:

table(grp, pred)

#    pred
# grp 0.1 0.2 0.3
#   1   2   1   2
#   2   1   1   0

Of course, this isn't exactly what you wanted. You want cumulative totals, so we can adjust this result by applying a cumulative sum along each row. Because apply() returns each row's result as a column, the result is transposed with t() to better match your data layout:

t(apply(table(grp, pred), 1, cumsum))

# grp 0.1 0.2 0.3
#   1   2   3   5
#   2   1   2   2
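
To get back the same three lines the original loop printed, the rows of that matrix can be pulled out as plain vectors. The following is only a sketch: the name cum_counts is introduced here for illustration, and it assumes pred is already sorted, because table() orders its columns by value whereas unique() keeps the order of first appearance.

```r
pred <- c(0.1, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3)
grp <- as.factor(c(1, 1, 2, 2, 1, 1, 1))

# Count each grp/pred combination, then accumulate along pred within each group.
# cum_counts is a name chosen for this sketch, not part of the original script.
cum_counts <- t(apply(table(grp, pred), 1, cumsum))

# Recover the pieces the loop printed: the cut points and one vector per level.
# Assumes pred is sorted, so the table's column order matches unique(pred).
cut <- as.numeric(colnames(cum_counts))
class_1 <- unname(cum_counts[1, ])
class_2 <- unname(cum_counts[2, ])

cat("index:", cut, "\n")
cat("class1:", class_1, "\n")
cat("class2:", class_2, "\n")
```

For the sample data this reproduces the cumulative counts from the question (2 3 5 for the first level and 1 2 2 for the second) without an explicit loop.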

Upvotes: 2
