Reputation: 585
I want to collapse the rows based on users while placing the '1' on their corresponding columns.
Each row for each user can only have one '1' so there need not be any adding to the rows following.
My df:
User +1 +2 +3 +4 +5
A 1 0 0 0 0
A 0 1 0 0 0
A 0 0 0 0 1
B 0 0 1 0 0
B 0 0 0 1 0
Expected result:
User +1 +2 +3 +4 +5
A 1 1 0 0 1
B 0 0 1 1 0
Any help would be appreciated.
Upvotes: 12
Views: 21297
Reputation: 70256
Looks like you can use summarise_each
:
df %>% group_by(User) %>% summarise_all(funs(sum))
Edit note: replaced summarise_each
which is now deprecated with summarise_all
Upvotes: 23
Reputation: 7248
The way that I would approach this would be to convert your data to long form first, then do the aggregation, and convert back out to wide form if necessary for display purposes.
So, using tidyr
,
df %>%
pivot_longer(cols=-User, names_to="rating", values_to="count") %>%
group_by(User, rating) %>%
summarise(count = max(count)) %>%
spread(rating, count)
The first gather converts to long form (using p
instead of +
):
> df <- read.table(header=TRUE, text='User p1 p2 p3 p4 p5
A 1 0 0 0 0
A 0 1 0 0 0
A 0 0 0 0 1
B 0 0 1 0 0
B 0 0 0 1 0
')
> df %>% pivot_longer(
cols=-User, names_to="rating", values_to="count")
# A tibble: 25 × 3
User rating count
<chr> <chr> <int>
1 A p1 1
2 A p2 0
3 A p3 0
4 A p4 0
5 A p5 0
6 A p1 0
7 A p2 1
8 A p3 0
9 A p4 0
10 A p5 0
# … with 15 more rows
And the remaining steps perform the aggregation, then transform back to wide format.
Upvotes: 3
Reputation: 92282
Here's alternatve dplyr
solution
df %>% group_by(User) %>% do(as.list(colSums(.)))
Or a data.table
possible implementation
library(data.table)
setDT(df)[, lapply(.SD, sum), User]
Or
setDT(df)[, as.list(colSums(.SD)), User]
Or with base R, even simpler
aggregate(. ~ User, df, sum)
Upvotes: 7