Reputation: 317
I am trying to make a conditional sum of values in a column provided that they share the same Country, Year, and Age and divide the whole sum by a value given by Num. My dataframe looks something like this:
Country Num Year Age Variable
AU 20 1989 12 0.009
AU 20 1989 12 0.0015
AU 20 1989 13 0.008
AU 19 1990 19 0.005
USA 1 2011 55 0.0001
I have tried so far by using dplyr but I am confused on how to condition effectively and how to divide by Num
dta %>% mutate(Variable) %>% group_by(Country,Year,Age) %>%
summarise(ME=sum(Variable)[,"ME"], MAE=sum(Variable)[,"MAE"])
Any help is greatly appreciated!
Upvotes: 2
Views: 128
Reputation: 4472
Using data.table
you could do this
library(data.table)
setDT(dat)[, ME := sum(Variable)/sum(Num), by = .(Country, Year, Age)]
#> dat
# Country Num Year Age Variable ME
#1: AU 20 1989 12 0.0090 0.0002625000
#2: AU 20 1989 12 0.0015 0.0002625000
#3: AU 20 1989 13 0.0080 0.0004000000
#4: AU 19 1990 19 0.0050 0.0002631579
#5: USA 1 2011 55 0.0001 0.0001000000
Upvotes: 3
Reputation: 7190
Here is a dplyr
solution for what (I guessed) you are trying to achieve.
dat %>% group_by(Country, Year, Age) %>% mutate(ME = sum(Variable)/sum(Num))
Source: local data frame [5 x 6]
Groups: Country, Year, Age
Country Num Year Age Variable ME
1 AU 20 1989 12 0.0090 0.0002625000
2 AU 20 1989 12 0.0015 0.0002625000
3 AU 20 1989 13 0.0080 0.0004000000
4 AU 19 1990 19 0.0050 0.0002631579
5 USA 1 2011 55 0.0001 0.0001000000
Upvotes: 3
Reputation: 4387
I'm not sure I completely understand what you mean by divide the whole sum by a value given by Num, but here is a simple solution:
ddply(dta, c('Country','Year','Age'), function(d) {
data.frame(wsum = sum(d$Variable)/sum(d$Num))
})
Upvotes: 2