giovi_stick9

Reputation: 13

Sum a column based on a condition in other columns in R

I have to sum the values of a column based on an if statement. Here is my code:

a <- c(1,2,3)
b <- c(2,2,3)
f <- c(1,2,3)
df <- data.frame(a,b,f)
df
for (i in 1:nrow(df)){
  if (df$a[i] == df$b[i]){
    w <- sum(df$f)
  }
}

My result is 6, but it should be 5: the sum of f[2] = 2 and f[3] = 3, the rows where a equals b.

Thank you for your help.

Upvotes: 1

Views: 1727

Answers (2)

Eric

Reputation: 2849

data.table approach:

a <- c(1,2,3)
b <- c(2,2,3)
f <- c(1,2,3)
df <- data.frame(a,b,f)

library(data.table)

setDT(df)

df[,.(f_sum = sum(f[a==b]))][]

# Returns a data.table object:

#>    f_sum
#> 1:     5

# OR 

df[,(f = sum(f[a==b]))][]

# Returns a vector:

#> [1] 5

Created on 2021-03-16 by the reprex package (v0.3.0)

Upvotes: 1

akrun

Reputation: 887741

We don't need a loop here:

with(df, sum(f[a == b]))
#[1] 5
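
To see why this works, here is a minimal sketch of the intermediate steps in base R. It rebuilds the df from the question; the names `keep` and `total` are only for illustration and are not part of the original code.

```r
# Rebuild the example data from the question
df <- data.frame(a = c(1, 2, 3), b = c(2, 2, 3), f = c(1, 2, 3))

# Element-wise comparison gives one logical value per row
keep <- df$a == df$b
keep
# FALSE TRUE TRUE

# Logical indexing keeps only the values of f where the condition holds
df$f[keep]
# 2 3

# Summing the kept values gives the conditional sum
total <- sum(df$f[keep])
total
# 5
```

The comparison is vectorised, so the condition is evaluated for every row at once and no explicit loop is needed.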

Or, for a faster subset and sum, we can use the collapse package:

library(collapse)
fsum(fsubset(df, a == b)$f)
#[1] 5

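The loop returns 6 because sum(df$f) adds up the whole column every time the condition is met. It can be changed to accumulate only the current row's value: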

w <- 0
for (i in seq_len(nrow(df))) {
  if (df$a[i] == df$b[i]) {
    w <- w + df$f[i]
  }
}

w
#[1] 5
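
As a rough side-by-side check, the sketch below runs the loop from the question and the accumulating version on the same data. The names `w_original` and `w_fixed` are assumptions introduced here so that both results can be compared; they do not appear in the original code.

```r
# Rebuild the example data from the question
df <- data.frame(a = c(1, 2, 3), b = c(2, 2, 3), f = c(1, 2, 3))

# Loop from the question: each matching row overwrites the result
# with the sum of the whole column, regardless of i
w_original <- NA
for (i in seq_len(nrow(df))) {
  if (df$a[i] == df$b[i]) {
    w_original <- sum(df$f)
  }
}

# Accumulating loop: start at 0 and add only the current row's value
w_fixed <- 0
for (i in seq_len(nrow(df))) {
  if (df$a[i] == df$b[i]) {
    w_fixed <- w_fixed + df$f[i]
  }
}

c(original = w_original, fixed = w_fixed)
# original 6, fixed 5
```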

Upvotes: 1
