datazang
datazang

Reputation: 1169

How to count the strings in a vector?

I have a data set as I've shown below:

df_A <- tribble(
      ~shop_name,      ~id,      
        "A",         "1, 4, 3", 
        "B",         "4, 6",   
        "C",         "3, 5",   
        "D",         "6, 2, 3", 
)

Now, I want to group the data by each unique id which in the vector, and then count them accordingly. Do you think it is doable? How can I manipulate the data to get my desired data?

desired <- tribble(
  ~id,   ~counted_id,      
  "1",      1, 
  "2",      1,   
  "3",      3,   
  "4",      2, 
  "5",      1,   
  "6",      2, 
)

Upvotes: 1

Views: 122

Answers (4)

Ahorn
Ahorn

Reputation: 3876

Minimalist tidyverse solution:

library(tidyverse)
df_A %>% 
  separate_rows(id) %>% 
  count(id, name = "counted_id")


# A tibble: 6 x 2
  id    counted_id
  <chr>      <int>
1 1              1
2 2              1
3 3              3
4 4              2
5 5              1
6 6              2

Upvotes: 4

s_baldur
s_baldur

Reputation: 33488

In base R

id_vector <- unlist(strsplit(df_A$id, ", "))
data.frame(table(id_vector))

#   id_vector Freq
# 1         1    1
# 2         2    1
# 3         3    3
# 4         4    2
# 5         5    1
# 6         6    2

Using stringr::str_count():

ids <- as.character(1:7)
data.frame(
  id = ids,
  counted_id = sapply(ids, function(x) sum(str_count(df_A$id, x)))
)

#   id counted_id
# 1  1          1
# 2  2          1
# 3  3          3
# 4  4          2
# 5  5          1
# 6  6          2
# 7  7          0

Upvotes: 3

StupidWolf
StupidWolf

Reputation: 46908

You can do with by splitting your string on ", " and table:

as.data.frame(table(unlist(strsplit(df_A$id,", "))))

  Var1 Freq
1    1    1
2    2    1
3    3    3
4    4    2
5    5    1
6    6    2

Upvotes: 5

Ian Campbell
Ian Campbell

Reputation: 24790

Here's an approach with separate_rows from tidyr:

library(tidyr)
library(dplyr)
df_A %>%
  separate_rows(id, sep=", ") %>%
  group_by(id) %>%
  dplyr::summarize(counted_id = n())
## A tibble: 6 x 2
#  id    counted_id
#  <chr>      <int>
#1 1              1
#2 2              1
#3 3              3
#4 4              2
#5 5              1
#6 6              2

Upvotes: 4

Related Questions