blue-dino

Reputation: 133

R: how to sum columns grouped by a factor?

If I have a table like this:

user,v1,v2,v3
a,1,0,0
a,1,0,1
b,1,0,0
b,2,0,3
c,1,1,1

How do I turn it into this?

user,v1,v2,v3
a,2,0,1
b,3,0,3
c,1,1,1

Upvotes: 2

Views: 6311

Answers (2)

BChan

Reputation: 93

In base R, you can use aggregate():

D <- matrix(c(1, 0, 0,
              1, 0, 1,
              1, 0, 0,
              2, 0, 3,
              1, 1, 1),
            ncol=3, byrow=TRUE, dimnames=list(1:5, c("v1", "v2", "v3")))
D <- data.frame(user=c("a", "a", "b", "b", "c"), D)
aggregate(. ~ user, D, sum)

Returns

> aggregate(. ~ user, D, sum)
  user v1 v2 v3
1    a  2  0  1
2    b  3  0  3
3    c  1  1  1
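
If the data frame has other columns that should not be summed, the columns can be named on the left-hand side instead of using `.`. A minimal sketch, assuming the same example data (rebuilt here so it runs on its own); the choice of v1 and v3 is only for illustration:

```r
# Rebuild the example data so this block is self-contained
D <- data.frame(user = c("a", "a", "b", "b", "c"),
                v1   = c(1, 1, 1, 2, 1),
                v2   = c(0, 0, 0, 0, 1),
                v3   = c(0, 1, 0, 3, 1))

# Sum only v1 and v3 within each user; v2 is left out of the result
res <- aggregate(cbind(v1, v3) ~ user, data = D, FUN = sum)
print(res)
```

Note that the formula interface of aggregate() uses na.action = na.omit by default, so rows with an NA in any of the listed columns are dropped before summing.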

Upvotes: 6

potterzot

Reputation: 648

You can use dplyr for this:

library(dplyr)
df = data.frame(
  user = c("a", "a", "b", "b", "c"),
  v1   = c(1, 1, 1, 2, 1),
  v2   = c(0, 0, 0, 0, 1),
  v3   = c(0, 1, 0, 3, 1))

group_by(df, user) %>%
  summarize(v1_sum = sum(v1),
            v2_sum = sum(v2),
            v3_sum = sum(v3))

If you're not familiar with the %>% notation, it works much like a pipe in bash: it takes the output of group_by() and passes it to summarize() as the first argument. The same thing can be written without the pipe:

by_user = group_by(df, user)
df_summarized = summarize(by_user, 
                          v1_sum = sum(v1),
                          v2_sum = sum(v2),
                          v3_sum = sum(v3))  
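
To keep the original column names (v1, v2, v3) and avoid writing one sum() per column, across() can be used inside summarise(). This is a sketch that assumes dplyr 1.0.0 or later, where across() is available; the name `res` is just a placeholder:

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  user = c("a", "a", "b", "b", "c"),
  v1   = c(1, 1, 1, 2, 1),
  v2   = c(0, 0, 0, 0, 1),
  v3   = c(0, 1, 0, 3, 1))

# Sum every non-grouping column within each user (assumes dplyr >= 1.0.0)
res <- df %>%
  group_by(user) %>%
  summarise(across(everything(), sum))
print(res)
```

Inside a grouped summarise(), everything() does not select the grouping column, so only v1, v2 and v3 are summed.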

Upvotes: 2
