Reputation: 13
Got a df:
ID Val1 Val2 Val3
A 1 1 1
A 1 1 1
A 1 1 1
B 0 0 1
I want to take the sum of all the columns, based on a unique ID value. Like this:
ID Val1 Val2 Val3
A 3 3 3
B 0 0 1
I tried:
df %>% group_by(ID) %>% summarise_all(funs(sum()))
Does anyone have advice about what I'm doing wrong? I'd prefer a dplyr approach (if possible).
Upvotes: 1
Views: 105
Reputation: 33
Edited*:
I don't know a solution using dplyr, but I do know another solution using plyr, if you're interested.
You have:
df=data.frame(id=c("A","A","A","B"), Val1=c(1,1,1,0), Val2=c(1,1,1,0),Val3=c(1,1,1,1))
> df
id Val1 Val2 Val3
1 A 1 1 1
2 A 1 1 1
3 A 1 1 1
4 B 0 0 1
Using the plyr library:
> library(plyr)
> ddply(df,.(id),summarize,Val1=sum(Val1),Val2=sum(Val2),Val3=sum(Val3))
Output:
id Val1 Val2 Val3
1 A 3 3 3
2 B 0 0 1
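If listing every column by hand gets tedious, here is a rough sketch of a shorter form. It assumes plyr's numcolwise() helper is available in your version, and it rebuilds the same df as above so that it runs on its own; totals is just a name I picked for the result.

```r
library(plyr)

# Same data as above
df <- data.frame(
  id   = c("A", "A", "A", "B"),
  Val1 = c(1, 1, 1, 0),
  Val2 = c(1, 1, 1, 0),
  Val3 = c(1, 1, 1, 1)
)

# numcolwise(sum) wraps sum so that it is applied to every numeric column;
# ddply() runs it once per id group and puts the id column back in front
totals <- ddply(df, .(id), numcolwise(sum))
```

This should give the same table as the summarize call, without naming Val1, Val2 and Val3 individually.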
Upvotes: 0
Reputation: 11957
You need to remove the parentheses after sum, i.e., your code should read:
df %>% group_by(ID) %>% summarise_all(funs(sum))
Typing sum() in this case calls the function, whereas passing just the name of the function sends it to be used by summarise_all. It's the difference between saying "use this function here and now" and "pass the function, as a parameter, to some other function". Similarly, typing ?sum will bring you the documentation for the function, but ?sum() is invalid.
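A side note for anyone on a newer dplyr: funs() has since been deprecated, so here is a minimal sketch of the same idea using across(), assuming dplyr 1.0.0 or later. The data frame is rebuilt from the question so the snippet runs on its own; result and col_totals are just names I picked.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  ID   = c("A", "A", "A", "B"),
  Val1 = c(1, 1, 1, 0),
  Val2 = c(1, 1, 1, 0),
  Val3 = c(1, 1, 1, 1)
)

# sum (no parentheses) is the function object; across() applies it to
# every non-grouping column within each ID group
result <- df %>%
  group_by(ID) %>%
  summarise(across(everything(), sum))

# The same name-vs-call distinction in base R: sapply() receives the
# function itself and calls it once per column (ID is dropped first)
col_totals <- sapply(df[-1], sum)
```

In both calls sum is handed over by name and it is the receiving function that decides when to call it.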
Upvotes: 3