Reputation: 8464
I have the dataframe below
name<-c("Jack","Bob","Jack","Bill","Jack","Bob")
items<-c("car","house","ball","desk","bike","chair")
d<-data.frame(name,item)
name items
1 Jack car
2 Bob house
3 Jack ball
4 Bill desk
5 Jack bike
6 Bob chair
and I want to convert it to a dataframe in which the unique items
will be summarized based on the name
and a new column will be added with their count, so it will be like:
name items count
1 Jack car,ball,bike 3
2 Bob house,chair 2
3 Bill chair 1
Upvotes: 0
Views: 56
Reputation: 4534
d %>% group_by(name) %>%
mutate(foo = paste0(items, collapse = ",")) %>%
mutate(count_w = length(foo)) %>%
dplyr::select(-items) %>%
distinct()
# A tibble: 3 x 3
# Groups: name [3]
name foo count_w
<chr> <chr> <int>
1 Jack car,ball,bike 3
2 Bob house,chair 2
3 Bill desk 1
Upvotes: 1
Reputation: 5284
A base R solution without extra packages
d <-data.frame(name = c("Jack","Bob","Jack","Bill","Jack","Bob"),
items = c("car","house","ball","desk","bike","chair"))
Get the frequencies for names
and add another column for the concatenated items
.
result <- margin.table(table(d), 1)
sdf <- data.frame(items = paste(d$items[d$name == names(result)], collapse = ", "), result)
reorder columns
sdf <- sdf[, c(2, 1, 3)]
sdf
#> name items Freq
#> 1 Bill house, ball, desk 1
#> 2 Bob house, ball, desk 2
#> 3 Jack house, ball, desk 3
Created on 2021-01-10 by the reprex package (v0.3.0)
Upvotes: 1
Reputation: 1587
Although I'm not convinced that a comma-separated sequence is the best way to approach further data processing, here is code that does what you want:
library(dplyr)
d %>%
group_by(name) %>%
summarize(count = n(),
items = toString(items)) %>%
ungroup()
# A tibble: 3 x 3
name count items
<chr> <int> <chr>
1 Bill 1 desk
2 Bob 2 house, chair
3 Jack 3 car, ball, bike
Upvotes: 2