Reputation: 475
I have a data frame which looks like this:
sub = c("X001","X001", "X001","X002","X002","X001","X002","X001","X002","X002","X002","X002")
revenue = c(20, 15, -10,-25,20,-20, 17,9,14,12, -9, 11)
df = data.frame(sub, revenue)
I want to aggregate it in such a way that the second column should show the sum of all revenue for the sub, the third column should show the sum on absolute value, the fourth column should show the sum of all positive values and the fifth column should show the sum of all negative values.
The result should look like this:
Sub All Sum Absolute Sum Positive Sum Negative Sum
X001 14 74 44 -30
X002 40 108 74 -34
I've written code that calculates the All sum:
y<-aggregate(df$revenue, by=list(Feature=x$Sub), FUN=sum)
I would really appreciate it if someone more knowledgable in R would help me in calculating the other three columns.
Upvotes: 2
Views: 110
Reputation: 887701
We can also use data.table
library(data.table)
setDT(df)[, .(All_Sum = sum(revenue), Absolute_Sum = sum(abs(revenue)),
Positive_Sum = sum(revenue[revenue>0]), Negative_Sum = sum(revenue[revenue<0])), by = sub]
# sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
#1: X001 14 74 44 -30
#2: X002 40 108 74 -34
Upvotes: 0
Reputation: 12935
In base R using aggregate
:
aggregate(.~sub, df, function(a) c(sum(a), sum(abs(a)), sum(a[a>0]), sum(a[a<0])))
# sub revenue.1 revenue.2 revenue.3 revenue.4
#1 X001 14 74 44 -30
#2 X002 40 108 74 -34
Upvotes: 1
Reputation: 16277
Here's how to do that with dplyr:
library(dplyr)
df%>%
group_by(sub)%>%
summarise(All_Sum=sum(revenue),Absolute_Sum=sum(abs(revenue)),
Positive_Sum=(sum(revenue[revenue>0])),Negative_Sum=(sum(revenue[revenue<0])))
sub All_Sum Absolute_Sum Positive_Sum Negative_Sum
<fctr> <dbl> <dbl> <dbl> <dbl>
1 X001 14 74 44 -30
2 X002 40 108 74 -34
Upvotes: 3