Reputation: 1
I have a problem similar to a previous question by another user, "How to sum a variable by group?", but I have more than two variables in my data frame. It looks a little like this:
A B C D E
1 m 1990 1989 200
1 m 1990 1990 100
1 m 1991 1989 10
2 m 1991 1990 20
2 m 1991 1991 100
3 m 1992 1989 30
3 m 1992 1990 20
3 m 1992 1991 10
4 m 1992 1992 10
4 m 1993 1989 50
I want to drop the variable D and sum E over every unique combination of A, B and C, keeping those three columns. I tried the advice given in the link above (aggregate, by, etc.) but I ended up with only two variables. I want something like this:
A B C E
1 m 1990 300
1 m 1991 10
2 m 1991 120
3 m 1992 60
4 m 1992 10
4 m 1993 50
Thank you in advance!
(This is my first question, so please let me know if it's inappropriate / missing something.)
Upvotes: 0
Views: 488
Reputation: 1213
Check out the dplyr package. The solution would be something like:
library(dplyr)
data <- your_data
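data_summed <- data %>% group_by(A, B, C) %>% summarise(E = sum(E))
summarise() collapses each group to one row and keeps only the grouping columns A, B and C plus the summed E, so D is dropped automatically. If you ever need to pick columns explicitly, use dplyr's select(); filter() selects rows, not columns.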
For variations, check out this cheatsheet; it's great.
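For anyone who wants to try this end to end, here is a minimal sketch. It assumes the question's sample data is typed into a data frame called df (that name is made up for this example) and uses summarise() so that each group collapses to a single row:

```r
library(dplyr)

# Sample data from the question, entered by hand
df <- data.frame(
  A = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4),
  B = rep("m", 10),
  C = c(1990, 1990, 1991, 1991, 1991, 1992, 1992, 1992, 1992, 1993),
  D = c(1989, 1990, 1989, 1990, 1991, 1989, 1990, 1991, 1992, 1989),
  E = c(200, 100, 10, 20, 100, 30, 20, 10, 10, 50)
)

# One row per unique (A, B, C) combination.
# D is neither a grouping column nor a summary, so it does not appear in the result.
data_summed <- df %>%
  group_by(A, B, C) %>%
  summarise(E = sum(E)) %>%
  ungroup()

print(data_summed)
```

With this data the result has six rows, and the (3, m, 1992) group sums to 60.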
Upvotes: 0
Reputation: 1095
I think aggregate(E ~ A + B + C, data=df, FUN=sum) should do the trick. This splits the data on columns A, B and C and computes the sum of E within each group.
> aggregate(E ~ A + B + C, data=df, FUN=sum)
A B C E
1 1 m 1990 300
2 1 m 1991 10
3 2 m 1991 120
4 3 m 1992 60
5 4 m 1992 10
6 4 m 1993 50
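To reproduce this in base R without any extra packages, here is a minimal sketch. It assumes the question's sample data is stored in a data frame named df with the question's upper-case column names (the names df and result are choices made for this example):

```r
# Sample data from the question, entered by hand
df <- data.frame(
  A = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4),
  B = rep("m", 10),
  C = c(1990, 1990, 1991, 1991, 1991, 1992, 1992, 1992, 1992, 1993),
  D = c(1989, 1990, 1989, 1990, 1991, 1989, 1990, 1991, 1992, 1989),
  E = c(200, 100, 10, 20, 100, 30, 20, 10, 10, 50)
)

# Formula interface: the left-hand side (E) is summed,
# the right-hand side (A, B, C) defines the groups.
result <- aggregate(E ~ A + B + C, data = df, FUN = sum)

print(result)
```

Because D appears on neither side of the formula, it is left out of the result, which is what the question asks for.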
Upvotes: 0