Steve

Reputation: 333

R Using grep() to extract characters

In the link below is a list (of warning messages):

https://drive.google.com/file/d/1pz-jSkqU5nG_ipaezFCvWNI6WHgekAdE/view?usp=sharing

I am trying to extract a value under these conditions:

  1. Match only at the start of the string,
  2. Where the pattern "x = ECC " exists,
  3. Retrieve only the "ECC" portion.

The pattern worked when I tested it here:

regex.com

But the same pattern returns nothing in R:

grep("(?<=\\A\"x\\s=\\s')[A-Z]*", names(warnings), value = TRUE, perl = TRUE)

#> character(0)

What's not working?

Upvotes: 0

Views: 95

Answers (1)

Ronak Shah

Reputation: 389105

In this data, the text contains additional spaces (e.g. "x = 'GEN '"), so the pattern does not match. You can switch to str_match here:

stringr::str_match(names(warnings), "x\\s=\\s'(\\w+)\\s+'")[, 2]
# [1] "ECC"  "ECC"  "ECOM" "ECOM" "ETX"  "ETX"  NA     NA     NA     "FEI" 
#[11] "FEI"  "GEN"  "GEN"  NA     NA     NA     "SAND" "SAND" NA     NA    
#[21] NA     "STAR" "STAR" NA     NA     NA    
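
If you would rather stay in base R, something along these lines may work. Note that grep(value = TRUE) returns the whole matching element rather than the matched portion, so regexec() with regmatches() is closer to what str_match does. This is only a sketch: msgs is a made-up stand-in for names(warnings), and the "^" anchor assumes the strings really begin with x = '.

```r
# Hypothetical sample strings standing in for names(warnings);
# the real data is in the linked file and may differ.
msgs <- c("x = 'ECC ' some text",
          "x = 'GEN ' some text",
          "no match here")

# regexec() records the position of the full match and of each capture group;
# regmatches() turns those positions into substrings.
hits <- regmatches(msgs, regexec("^x\\s=\\s'(\\w+)\\s*'", msgs, perl = TRUE))

# Each element is c(full match, group 1), or character(0) when nothing matched.
codes <- vapply(
  hits,
  function(h) if (length(h) >= 2) h[2] else NA_character_,
  character(1),
  USE.NAMES = FALSE
)
codes
```

Elements without the pattern come back as NA, the same way they do with str_match.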

Upvotes: 1
