histelheim

Reputation: 5088

Detecting sequencing using regexes

Imagine I have multiple character strings in a list like this:

[[1]]
 [1] "1-FA-1-I2-1-I2-1-I2-1-EX-1-I2-1-I3-1-FA-1-" 
 [2] "-1-I2-1-TR-1-"                              
 [3] "-1-I2-1-FA-1-I3-1-"                         
 [4] "-1-FA-1-FA-1-NR-1-I3-1-I2-1-TR-1-"          
 [5] "-1-I2-1-"                                   
 [6] "-1-I2-1-FA-1-I2-1-"                         
 [7] "-1-I3-1-FA-1-QU-1-"                         
 [8] "-1-I2-1-I2-1-I2-1-NR-1-I2-1-I2-1-NR-1-"     
 [9] "-1-I2-1-"                                   
[10] "-1-NR-1-I3-1-QU-1-I2-1-I3-1-QU-1-NR-1-I2-1-"
[11] "-1-NR-1-QU-1-QU-1-I2-1-"

I want to use a regex to detect the particular strings where a certain substring precedes another substring, but not necessarily directly preceding the other substring.

For example, say we are looking for FA preceding EX. This should match element 1 of the list: even though -1-I2-1-I2-1-I2-1- sits between FA and EX, FA still occurs before EX, so a match is expected.

How can a generic regex be defined that identifies strings where certain substrings appear before another substring in this manner?

Upvotes: 2

Views: 61

Answers (1)

Avinash Raj

Reputation: 174706

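You can use grepl (or grep) with the pattern FA.*EX. The .* matches any run of characters, so the pattern matches whenever FA occurs somewhere before EX. grepl returns a logical vector; grep returns the indices of the matching elements.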

x <- c("1-FA-1-I2-1-I2-1-I2-1-EX-1-I2-1-I3-1-FA-1-", "-1-I2-1-TR-1-")
grepl("FA.*EX", x)
#[1]  TRUE FALSE
grep("FA.*EX", x)
#[1] 1
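
To generalize this to any ordered set of substrings, the pattern can be built programmatically. The following is a minimal sketch, not part of the original answer: precedes is a hypothetical helper name, and it assumes the codes are plain alphanumeric tokens (no regex metacharacters) that are always surrounded by "-" and never directly adjacent to each other, as in the sample data.

```r
# Hypothetical helper: returns TRUE for each string in `x` in which all
# `tokens` occur in the given order, not necessarily next to each other.
precedes <- function(x, tokens) {
  # Wrap every token in the "-" delimiters used by the data, then join the
  # pieces with ".*" so anything may appear between consecutive tokens.
  # c("FA", "EX") becomes "-FA-.*-EX-".
  pattern <- paste0("-", tokens, "-", collapse = ".*")
  grepl(pattern, x)
}

x <- c("1-FA-1-I2-1-I2-1-I2-1-EX-1-I2-1-I3-1-FA-1-",
       "-1-I2-1-TR-1-",
       "-1-I2-1-FA-1-I3-1-")

precedes(x, c("FA", "EX"))
#[1]  TRUE FALSE FALSE
precedes(x, c("I2", "FA", "I3"))
#[1] FALSE FALSE  TRUE
```

Including the delimiters keeps a short code from matching inside a longer one. If the codes could contain regex metacharacters, they would need escaping before being pasted into the pattern.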

Upvotes: 8
