elia

Reputation: 13

R: sum all columns based on one value

I have a data frame:

ID   Val1    Val2    Val3  
A    1        1       1
A    1        1       1
A    1        1       1
B    0        0       1

I want to sum every column within each unique ID value, like this:

ID   Val1    Val2    Val3
A    3        3       3
B    0        0       1

I tried:

df %>% group_by(ID) %>% summarise_all(funs(sum()))

Does anyone have advice about what I'm doing wrong? I'd prefer a dplyr approach, if possible.

Upvotes: 1

Views: 105

Answers (2)

J Louro

Reputation: 33

Edited*:

I don't know a solution using dplyr, but I do know one using plyr, if you're interested.

You have:

> df <- data.frame(id = c("A", "A", "A", "B"), Val1 = c(1, 1, 1, 0), Val2 = c(1, 1, 1, 0), Val3 = c(1, 1, 1, 1))

> df
  id Val1 Val2 Val3
1  A    1    1    1
2  A    1    1    1
3  A    1    1    1
4  B    0    0    1

Using the plyr library:

> library(plyr)

> ddply(df,.(id),summarize,Val1=sum(Val1),Val2=sum(Val2),Val3=sum(Val3))

Output:

  id Val1 Val2 Val3
1  A    3    3    3
2  B    0    0    1
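
If loading plyr is not an option, the same per-group sum can probably be had from base R alone. This is only a sketch, assuming the data frame looks like the one above; the name `res` is just illustrative and not part of the original answer.

```r
# Same toy data as above.
df <- data.frame(
  id   = c("A", "A", "A", "B"),
  Val1 = c(1, 1, 1, 0),
  Val2 = c(1, 1, 1, 0),
  Val3 = c(1, 1, 1, 1)
)

# The formula `. ~ id` means "every other column, grouped by id".
# FUN = sum passes the function itself; it is applied once per column per group.
res <- aggregate(. ~ id, data = df, FUN = sum)
print(res)
```

Unlike the ddply call, this does not need each column to be listed by name, which helps when there are many value columns.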

Upvotes: 0

jdobres

Reputation: 11957

You need to remove the parentheses after sum, i.e., your code should read:

df %>% group_by(ID) %>% summarise_all(funs(sum))

Typing sum() in this case calls the function, whereas passing just the name of the function sends it to be used by summarise_all. It's the difference between saying "use this function here and now" and "pass the function, as a parameter, to some other function". Similarly, typing ?sum will bring up the documentation for the function, but ?sum() is invalid.
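
To make the distinction concrete, here is a small sketch. It assumes dplyr 1.0.0 or later, where funs() is deprecated and across() is the usual replacement for summarise_all(); the data frame is rebuilt from the question, and the name `result` is only illustrative.

```r
library(dplyr)

# Toy data from the question.
df <- data.frame(
  ID   = c("A", "A", "A", "B"),
  Val1 = c(1, 1, 1, 0),
  Val2 = c(1, 1, 1, 0),
  Val3 = c(1, 1, 1, 1)
)

# With parentheses, sum() is a call: it runs right away with no arguments.
sum()             # evaluates to 0, because nothing was summed

# Without parentheses, sum is the function object, which can be handed on.
is.function(sum)  # TRUE

# across() applies the passed function to every non-grouping column.
result <- df %>%
  group_by(ID) %>%
  summarise(across(everything(), sum))

print(result)
```

On older dplyr versions the summarise_all(funs(sum)) form above should still work; recent versions also accept summarise_all(sum) directly.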

Upvotes: 3
