user11015000

Reputation: 159

Search for multiple occurrences of substring within string

Below is my dataset. I am using str_detect() on the Key column like so:

str_detect(mydata$Key, 'R')

I want to find the strings that contain two R's. In the example below I could search for R002R009 directly, but I do not always know the numbers attached to the R's, so I need to match any string with two R's.

I also need to be able to use the test inside an ifelse() statement.

 mydata[1:3, ]
           IDENTIFIER  DATE_TIME         X-VALUE     Y-VALUE      Key
    1      214461707   1/04/2019 8:25           1         -3       A001
    2      214461789   1/04/2019 10:16          1         -2       R001
    3      214461789   1/04/2019 10:16          1         -5       R002R009

Upvotes: 0

Views: 1015

Answers (3)

akrun

Reputation: 886948

We can use subset() from base R, keeping the rows where exactly two R's remain after every other character is removed from Key:

subset(mydata, nchar(gsub('[^R]+', '', Key)) == 2)
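Since the question asks for something that works inside ifelse(), the same counting idea can be sketched without subset(). The data frame here is a cut-down stand-in for mydata holding only the Key column, and the column name two_R and its labels are assumptions made for illustration.

```r
# Hypothetical stand-in for mydata (only the Key values from the question)
mydata <- data.frame(Key = c("A001", "R001", "R002R009"),
                     stringsAsFactors = FALSE)

# Remove everything that is not an "R", then count the characters left over
n_R <- nchar(gsub("[^R]+", "", mydata$Key))

# Use the count as the condition of ifelse(); labels are illustrative
mydata$two_R <- ifelse(n_R == 2, "two R's", "other")
mydata
```

Only the third row should be labelled "two R's", because R002R009 is the only key that reduces to "RR".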

Upvotes: 1

Dr.D

Reputation: 23

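You can use a regular expression with str_detect(). The pattern "R.*R" matches any Key in which one R is followed, anywhere later in the string, by another R.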

library(dplyr)
library(stringr)
mydata %>%
  filter(str_detect(string = Key, pattern = "R.*R"))

results in:

         id FACILITY_ID DATE_TIME X.VALUE Y.VALUE      Key
3 214461789   1/04/2019     10:16       1      -5 R002R009
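One caveat worth noting: "R.*R" is also TRUE for keys with three or more R's. If exactly two are required, an anchored pattern is one option. The sketch below compares the two; the fourth key is made up purely to show the difference, and the variable names are illustrative.

```r
library(stringr)

# First three keys come from the question; the last one is hypothetical
keys <- c("A001", "R001", "R002R009", "R001R002R003")

# TRUE whenever an R is followed later by another R (two or more R's)
at_least_two <- str_detect(keys, "R.*R")

# TRUE only when the whole string contains exactly two R's
exactly_two <- str_detect(keys, "^[^R]*R[^R]*R[^R]*$")

data.frame(keys, at_least_two, exactly_two)
```

Both patterns agree on the data shown in the question; they differ only once a key with a third R appears.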

Upvotes: 1

Ronak Shah

Reputation: 388817

You can use str_count() to count the number of occurrences of a letter and use that count in filter().

library(dplyr)
library(stringr)

mydata %>% filter(str_count(Key, 'R') == 2)

#   FACILITY_ID      DATE_TIME XVALUE YVALUE      Key
#3   214461789 1/04/201910:16      1     -5 R002R009
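Because str_count() returns one count per element, the comparison can also go straight into ifelse(), which is what the question asks for. This is a minimal sketch: the data frame is a reduced stand-in for mydata, and the flag column and its labels are assumptions.

```r
library(stringr)

# Hypothetical stand-in for mydata (only the Key values from the question)
mydata <- data.frame(Key = c("A001", "R001", "R002R009"),
                     stringsAsFactors = FALSE)

# str_count() is vectorised, so the test yields one TRUE/FALSE per row
mydata$flag <- ifelse(str_count(mydata$Key, "R") == 2,
                      "double R",      # illustrative label
                      "not double R")  # illustrative label
mydata
```

Changing `== 2` to `>= 2` would flag keys with two or more R's instead of exactly two.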

Upvotes: 3
