Reputation: 13
I have to sum the values of a column based on an if statement. Here is my code:
a <- c(1,2,3)
b <- c(2,2,3)
f <- c(1,2,3)
df <- data.frame(a,b,f)
df
for (i in 1:nrow(df)){
if (df$a[i] == df$b[i]){
w <- sum(df$f)
}
}
My result is 6, but it should be 5: the sum of f[2] = 2 and f[3] = 3.
Thank you for your help.
Upvotes: 1
Views: 1727
Reputation: 2849
A data.table approach:
a <- c(1,2,3)
b <- c(2,2,3)
f <- c(1,2,3)
df <- data.frame(a,b,f)
library(data.table)
setDT(df)
df[,.(f_sum = sum(f[a==b]))][]
# Returns a data.table object:
#> f_sum
#> 1: 5
# OR
df[, sum(f[a == b])]
# Returns a vector:
#> [1] 5
Created on 2021-03-16 by the reprex package (v0.3.0)
Upvotes: 1
Reputation: 887741
We don't need a loop; logical indexing selects the matching rows directly:
with(df, sum(f[a == b]))
#[1] 5
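To see why the one-liner works, it can help to look at the intermediate values. This is only a minimal sketch on the same data; the names `mask` and `total` are introduced here for illustration and are not part of the original answer.

```r
a <- c(1, 2, 3)
b <- c(2, 2, 3)
f <- c(1, 2, 3)
df <- data.frame(a, b, f)

# Element-wise comparison gives one TRUE/FALSE per row
mask <- df$a == df$b
mask
#[1] FALSE  TRUE  TRUE

# Logical indexing keeps only the elements of f where mask is TRUE
df$f[mask]
#[1] 2 3

# Summing the kept elements gives the expected result
total <- sum(df$f[mask])
total
#[1] 5
```

If `a` or `b` could contain `NA`, the comparison would also return `NA` for those rows, so that case would need separate handling.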
Or, for a faster subset and sum, we can use the collapse package:
library(collapse)
fsum(fsubset(df, a == b)$f)
#[1] 5
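The original loop returns 6 because sum(df$f) adds up the whole column every time the condition is true. To keep the loop, start from 0 and add only the matching element: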
w <- 0
for (i in seq_len(nrow(df))) {
  if (df$a[i] == df$b[i]) {
    w <- w + df$f[i]
  }
}
w
#[1] 5
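As a rough illustration of where the 6 in the question comes from, the loop from the question can be traced by printing `w` on each match. The `cat()` call and the name `w_original` are additions for this sketch only.

```r
df <- data.frame(a = c(1, 2, 3), b = c(2, 2, 3), f = c(1, 2, 3))

# Loop from the question: on every match, w is overwritten
# with the sum of the entire column f, not just row i
w_original <- NA
for (i in seq_len(nrow(df))) {
  if (df$a[i] == df$b[i]) {
    w_original <- sum(df$f)
    cat("i =", i, "w =", w_original, "\n")
  }
}
#i = 2 w = 6
#i = 3 w = 6

w_original
#[1] 6
```

Rows 2 and 3 both match, and each time `w_original` is set to 1 + 2 + 3, which is why the condition appears to have no effect on the result.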
Upvotes: 1