PhD Student

Reputation: 83

How to Count String in a Column

I am a novice in R and have a data frame with two fields. I need to count the number of times each element of the first field appears in the second field. The second field can contain more than one element, which is why the code below doesn't give the right answer. Please tell me how to modify it or which function I can use here. The count for A1 should be 3, but it comes out as 1 because the presence of A1 in A1;A2 and A3;A1 is not recognized by this code. Thanks.

df0 <- data.frame (ID  = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
                  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2","A2;C1")
)

n1 <- nrow(df0)

df1 = data.frame(matrix(
  vector(), 0, 2, dimnames=list(c(), c("ID","Count"))),
  stringsAsFactors=F)

for (i in 1:n1){
  
  id <- df0$ID[i]
  df2 <- filter(df0, Refer == id) # This assumes only a single ID can be there in Refer
  n2 <- nrow(df2) 
  df1[i,1] <- id
  df1[i,2] <- n2

}

Upvotes: 0

Views: 77

Answers (3)

TarJae

Reputation: 78927

Here is a tidyverse solution:

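library(dplyr)
library(tidyr)
library(stringr)

# `pattern` is not defined in the answer as posted; it is assumed here
# to be a regex alternation of all known IDs
pattern <- paste(df0$ID, collapse = "|")

df0 %>% 
  separate_rows(Refer) %>%                    # one row per referenced ID
  mutate(x = str_detect(Refer, pattern)) %>%  # flag entries that are known IDs
  filter(x) %>%                               # drop the blank entries
  count(Refer)
#   Refer     n
#   <chr> <int>
# 1 A1        3
# 2 A2        3
# 3 A3        1
# 4 C1        1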

Upvotes: 1

jay.sf

Reputation: 72813

You can strsplit "Refer" at ";" and unlist the result. Then create a factor from it using "ID" as the levels and simply table the result. IDs that are never referenced get a count of 0.

table(factor(unlist(strsplit(df0$Refer, ';')), levels=df0$ID))
# A1 A2 A3 A4 B1 C1 D1 
#  3  3  1  0  0  1  0 
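
If the counts are needed as a two-column data frame like the df1 in the question, the table can be reshaped. The following is a minimal sketch, assuming the same df0 as in the question; the intermediate names refs and counts are only illustrative, and trimws() is added on the assumption that stray spaces around IDs should be ignored.

```r
# Same data as in the question
df0 <- data.frame(ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
                  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"),
                  stringsAsFactors = FALSE)

# Split every Refer entry at ";" and flatten into one character vector
refs <- trimws(unlist(strsplit(df0$Refer, ";", fixed = TRUE)))

# Tabulate against the known IDs so unreferenced IDs get a count of 0;
# blank entries are not valid levels, become NA, and are dropped by table()
counts <- table(factor(refs, levels = df0$ID))

# Reshape into the ID/Count layout used by df1 in the question
df1 <- data.frame(ID = names(counts), Count = as.vector(counts))
df1
```

Because the IDs are passed as factor levels, the rows of df1 come out in the same order as df0$ID.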

Upvotes: 0

Aleksandr

Reputation: 1914

You are almost there, but you should use grepl() instead of the exact comparison Refer == id, so that an ID is also found inside multi-ID entries such as "A1;A2".

library(dplyr)
df0 <- data.frame (ID  = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
                   Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2","A2;C1")
)

# For each ID, count the rows whose Refer contains that ID
result <- lapply(df0$ID, function(x){
  n = df0 %>% filter(grepl(x, Refer)) %>% nrow
  data.frame(ID = x, count = n)
}) %>% 
  bind_rows   # stack the one-row data frames into a single result

result
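
One caveat: grepl() matches substrings, so an ID such as A1 would also be counted inside a longer ID such as A10 (not present in this data, but possible in a larger set). A minimal base R sketch of exact matching follows, assuming ";" is the only separator; the names tokens and result_exact are only illustrative.

```r
df0 <- data.frame(ID = c("A1", "A2", "A3", "A4", "B1", "C1", "D1"),
                  Refer = c(" ", " ", "A1", "A1;A2", "A3;A1", "A2", "A2;C1"),
                  stringsAsFactors = FALSE)

# Split each Refer cell into its individual IDs (one character vector per row)
tokens <- strsplit(df0$Refer, ";", fixed = TRUE)

# For each ID, count the rows whose token list contains exactly that ID
count <- vapply(df0$ID, function(id) {
  sum(vapply(tokens, function(tk) id %in% trimws(tk), logical(1)))
}, integer(1))

result_exact <- data.frame(ID = df0$ID, count = count, row.names = NULL)
result_exact
```

Like the grepl() version, this counts rows rather than occurrences, so an ID repeated within a single Refer cell is counted once for that row.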

Upvotes: 1
