Reputation: 13
I would like to create a dataframe using the group_by function and then sum a column based on the group_by. So far, I've only been able to sum the entire column rather than within the group.
I have a dataframe:
old_df <- data_frame(category1 = c("a", "a", "b", "b"),
                     category2 = c("2", "1", "3", "4"))
From here, I would like to group_by category1 ("a" and "b") and sum category2 for "a" and "b" individually. It would look like this:
new_df <- data_frame(category1 = c("a", "b"),
                     Sum_category2 = c("3", "7"))
I've tried a few things, and I thought this one below should work.
new_df <- old_df %>%
  group_by(category1) %>%
  summarize(Sum_category2 = sum(category2))
Everything I've tried so far just sums up the entire category2 column, which in this case would equal 10. How can I make it sum only within the grouping?
Upvotes: 0
Views: 80
Reputation: 11696
I'm not sure why you're using strings for category2; sum() needs numeric values, not character strings. With numbers in that column, the following works just fine.
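library(dplyr)
# category2 holds numbers, not strings, so sum() can add them
old_df <- data.frame(category1 = c("a", "a", "b", "b"),
                     category2 = c(2, 1, 3, 4))
# store the grouped sums, then print the result (not old_df)
new_df <- old_df %>% group_by(category1) %>% summarize(sum_category = sum(category2))
new_df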
# A tibble: 2 x 2
  category1 sum_category
  <fct>            <dbl>
1 a                    3
2 b                    7
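If numeric input still gives a single total of 10, one possible cause is that plyr was attached after dplyr. This is an assumption, since the question doesn't show which packages are loaded. In that case a bare summarize() resolves to plyr::summarize(), which ignores the grouping. A minimal sketch that sidesteps the masking by naming the dplyr functions explicitly:

```r
library(dplyr)

# Same example data as in the question, with category2 as numbers
old_df <- data.frame(category1 = c("a", "a", "b", "b"),
                     category2 = c(2, 1, 3, 4))

# The dplyr:: prefix makes sure the dplyr versions are called,
# even if another attached package (e.g. plyr) masks these names
new_df <- old_df %>%
  dplyr::group_by(category1) %>%
  dplyr::summarise(Sum_category2 = sum(category2))

# One row per group: a -> 3, b -> 7
new_df
```

Running conflicts() or search() in the session should show whether summarize is being masked.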
Upvotes: 1