Reputation: 1169
I have a data set as I've shown below:
df_A <- tribble(
~shop_name, ~id,
"A", "1, 4, 3",
"B", "4, 6",
"C", "3, 5",
"D", "6, 2, 3",
)
Now, I want to group the data by each unique id which in the vector, and then count them accordingly. Do you think it is doable? How can I manipulate the data to get my desired data?
desired <- tribble(
~id, ~counted_id,
"1", 1,
"2", 1,
"3", 3,
"4", 2,
"5", 1,
"6", 2,
)
Upvotes: 1
Views: 122
Reputation: 3876
Minimalist tidyverse
solution:
library(tidyverse)
df_A %>%
separate_rows(id) %>%
count(id, name = "counted_id")
# A tibble: 6 x 2
id counted_id
<chr> <int>
1 1 1
2 2 1
3 3 3
4 4 2
5 5 1
6 6 2
Upvotes: 4
Reputation: 33488
In base R
id_vector <- unlist(strsplit(df_A$id, ", "))
data.frame(table(id_vector))
# id_vector Freq
# 1 1 1
# 2 2 1
# 3 3 3
# 4 4 2
# 5 5 1
# 6 6 2
Using stringr::str_count()
:
ids <- as.character(1:7)
data.frame(
id = ids,
counted_id = sapply(ids, function(x) sum(str_count(df_A$id, x)))
)
# id counted_id
# 1 1 1
# 2 2 1
# 3 3 3
# 4 4 2
# 5 5 1
# 6 6 2
# 7 7 0
Upvotes: 3
Reputation: 46908
You can do with by splitting your string on ", " and table:
as.data.frame(table(unlist(strsplit(df_A$id,", "))))
Var1 Freq
1 1 1
2 2 1
3 3 3
4 4 2
5 5 1
6 6 2
Upvotes: 5
Reputation: 24790
Here's an approach with separate_rows
from tidyr
:
library(tidyr)
library(dplyr)
df_A %>%
separate_rows(id, sep=", ") %>%
group_by(id) %>%
dplyr::summarize(counted_id = n())
## A tibble: 6 x 2
# id counted_id
# <chr> <int>
#1 1 1
#2 2 1
#3 3 3
#4 4 2
#5 5 1
#6 6 2
Upvotes: 4