onhalu

Reputation: 745

Regex string match words pattern

I have this vector of antibiotic names that I want to use as search patterns

atb <- c("acefa","ampicilin","fortum")

And this data frame

    DF1 <- structure(list(ID = 1:3, Text = c("Person 1 take acefa and ampicilin", "fortum and acefa are antibiotics", "Person 3 has no antibiotics but ampicilin")), class = "data.frame", row.names = c(NA, -3L))

DF1
    
    ID                                      Text
    1           Person 1 take acefa and ampicilin
    2            fortum and acefa are antibiotics
    3   Person 3 has no antibiotics but ampicilin

And I would like to get this

DF1
        
    ID                                      Text        atb
    1           Person 1 take acefa and ampicilin      c("acefa","ampicilin")
    2            fortum and acefa are antibiotics      c("fortum","acefa")
    3   Person 3 has no antibiotics but ampicilin      ampicilin

I tried

DF1%>%
mutate(atb = regmatches(Text, regexec(atb, Text)))

and

DF1 %>%
  mutate(atb = str_extract_all(Text, atb))

But it does not work.

However, it works with grepl like this

DF1 %>%
  mutate(atb = grepl(atb, Text))

How can I get a column containing the words from the pattern that occur in each text?

Upvotes: 0

Views: 55

Answers (1)

G. Grothendieck

Reputation: 269586

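Collapse the vector into a single regular expression with alternation (`acefa|ampicilin|fortum`) and use strapplyc from the gsubfn package, which returns every match for each string: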

library(dplyr)
library(gsubfn)

result <- DF1 %>% 
  mutate(atb = strapplyc(Text, paste(atb, collapse = "|")))

str(result$atb)

giving:

List of 3
 $ : chr [1:2] "acefa" "ampicilin"
 $ : chr [1:2] "fortum" "acefa"
 $ : chr "ampicilin"
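
For comparison, here is a minimal base R sketch of the same idea, assuming you would rather not depend on gsubfn. It rebuilds the data from the question, joins the names into one alternation pattern, and extracts every match per row with `gregexpr` and `regmatches`. The `\\b` word boundaries and the variable names `pattern` and `matches` are my own additions, not part of the question. Note that `str_extract_all(Text, atb)` pairs each text with one element of `atb` instead of trying all of them, which is likely why that attempt did not give the expected result.

```r
# Data from the question
atb <- c("acefa", "ampicilin", "fortum")
DF1 <- data.frame(
  ID = 1:3,
  Text = c("Person 1 take acefa and ampicilin",
           "fortum and acefa are antibiotics",
           "Person 3 has no antibiotics but ampicilin")
)

# One pattern that matches any of the names.
# \\b restricts matches to whole words (an added assumption).
pattern <- paste0("\\b(", paste(atb, collapse = "|"), ")\\b")

# gregexpr() locates all matches in each string;
# regmatches() extracts them, giving one character vector per row.
matches <- regmatches(DF1$Text, gregexpr(pattern, DF1$Text, perl = TRUE))

# Store the result as a list column
DF1$atb <- matches

str(DF1$atb)
```

Rows without any match get `character(0)`, so the list column always has one element per row.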

Upvotes: 0
