Thibault Schvartz

Reputation: 1

How do I make an iterative calculation without using a loop?

I have a "big" data frame on which I need to do a calculation like the one below:

data <- data.frame( "name"=c( "Tom", "Peter", "Peter", "Peter", "Tom", "Peter" ), "goal"=c(1,-2,2,3,-1,0), "total"=0 )
for( i in 1:nrow(data) ) {
  count <- 0
  for ( j in 1:i) {
    if (data$name[j] == data$name[i]) {
      count <- count + data$goal[j]
    }
  }
  data$total[i] <- count
}

> data
   name goal total
1   Tom    1     1
2 Peter   -2    -2
3  John    2     2
4 Peter    3     1
5   Tom   -1     0
6 Peter    0     1

I need to compute the "total" column as the running sum of the goals each player has scored up to and including the current row.

My data frame currently has 83,000 rows and the calculation takes a very long time. I would like to do it without a "for" loop. Do you have any ideas?

I saw the following post but I don't know how to adapt it.

Thanks in advance

Upvotes: 0

Views: 47

Answers (1)

Valkyr

Reputation: 431

To avoid for loops, look for vectorized functions that do what you want, or functions that operate on whole data frames or other multidimensional objects. For your example, you can group the data frame by name with group_by from dplyr and then apply the vectorized function cumsum (cumulative sum) within each group:

library(dplyr)
data <- data %>% group_by(name) %>% mutate(total = cumsum(goal))

Output

> data
# A tibble: 6 x 3
# Groups:   name [2]
  name   goal total
  <chr> <dbl> <dbl>
1 Tom       1     1
2 Peter    -2    -2
3 Peter     2     0
4 Peter     3     3
5 Tom      -1     0
6 Peter     0     3

I used the data frame initialization from your post, which is why my output differs from yours: row 3 is "Peter" in your code but "John" in your printed output.
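
If you would rather not depend on dplyr, the same grouped running sum can be sketched in base R with ave, which applies a function within each group and returns the results in the original row order. This is only a minimal sketch using the sample data from the question, not a benchmarked solution for 83,000 rows:

```r
# Sample data from the question (row 3 is "Peter", as in the posted code)
data <- data.frame(
  name = c("Tom", "Peter", "Peter", "Peter", "Tom", "Peter"),
  goal = c(1, -2, 2, 3, -1, 0)
)

# ave() splits goal by name, applies cumsum to each piece,
# and puts the results back in the original row order
data$total <- ave(data$goal, data$name, FUN = cumsum)

print(data)
```

The result stays a plain data.frame, so there is no grouping to remove afterwards.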

If you want to drop the grouping after your manipulation, use ungroup.
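
As a sketch of that last step, ungroup can be added to the end of the same pipeline. The name result is just an illustrative variable, and the data is the sample from the question:

```r
library(dplyr)

# Sample data from the question
data <- data.frame(
  name = c("Tom", "Peter", "Peter", "Peter", "Tom", "Peter"),
  goal = c(1, -2, 2, 3, -1, 0)
)

# Running sum of goals per name, then drop the grouping
result <- data %>%
  group_by(name) %>%
  mutate(total = cumsum(goal)) %>%
  ungroup()

print(result)
```

Without ungroup, later verbs such as mutate or summarise would keep operating per name, which may not be what you want.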

Upvotes: 1
