Reputation: 3591
I want to do the equivalent of this SQL code:
SELECT Cat_1, Cat_2, Cat_3, Sum(cost), Sum(quantity) FROM tbl_of_data
GROUP BY Cat_1, Cat_2, Cat_3;
How do I do that using the dplyr package in R?
I tried:
library(dplyr)
tbl_of_data %>%
group_by(Cat_1, Cat_2, Cat_3) %>%
mutate(cost, quantity)
but I'm ending up with the same number of rows.
Upvotes: 1
Views: 655
Reputation: 886938
We need either summarise
or summarise_each
to get a single row of sum
for each combination of groups which would be equivalent to the sql
output.
tbl_of_data %>%
group_by(Cat_1, Cat_2, Cat_3) %>%
summarise_each(funs(sum), cost, quantity)
Or
tbl_of_data %>%
group_by(Cat_1, Cat_2, Cat_3) %>%
summarise(cost=sum(cost), quantity=sum(quantity))
mutate
creates a new column. In addition, in the OP's mutate
code, it didn't wrap with the sum
. If the intention is to create new sum
columns, either mutate
or mutate_each
can be used i.e. ...%>% mutate(SumCost= sum(cost), SumQuantity = sum(quantity))
or ...%>% mutate_each(funs(sum), cost, quantity)
Upvotes: 4