Surendranadh Nune

Reputation: 33

Regex to ignore sentences/words that are in double quote

I am trying to build a regex that matches words but ignores anything inside double quotes. It fails to ignore the quoted text when there are spaces inside the double quotes. Here is the regex I have been able to build so far:

(?<![\S"])([^"\s]+)(?![\S"])

https://regex101.com/r/eTgyWe/1
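
A quick way to reproduce the problem, sketched with Python's built-in `re` module. The sample string is made up for illustration and is not taken from the regex101 link.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r'(?<![\S"])([^"\s]+)(?![\S"])')

# Hypothetical sample input: one quoted phrase containing three words.
sample = 'foo "bar mid baz" qux'

matches = ORIGINAL.findall(sample)
print(matches)  # ['foo', 'mid', 'qux']
```

Here `bar` and `baz` are rejected because they touch a quote character, but `mid` has whitespace on both sides, so the lookarounds cannot tell that it sits inside a quoted string.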

Upvotes: 2

Views: 266

Answers (1)

Ryszard Czech

Reputation: 18611

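Use the following pattern, which relies on the PCRE verbs (*SKIP)(*F) to consume and discard whole quoted strings before matching words: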

"[^"\\]*(?:\\.[^"\\]*)*"(*SKIP)(*F)|(?<!\S)([^"\s]+)(?!\S)

See proof.

Explanation

--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  [^"\\]*                  any character except: '"', '\\' (0 or more
                           times (matching the most amount possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \\                       '\'
--------------------------------------------------------------------------------
    .                        any character except \n
--------------------------------------------------------------------------------
    [^"\\]*                  any character except: '"', '\\' (0 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  "                        '"'
--------------------------------------------------------------------------------
  (*SKIP)(*F)              discard the quoted string just matched and
                           resume the search right after it
--------------------------------------------------------------------------------
  |                        OR
--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [^"\s]+                  any character except: '"', whitespace
                             (\n, \r, \t, \f, and " ") (1 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of look-ahead
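
The (*SKIP)(*F) verbs are PCRE backtracking control verbs, and Python's built-in `re` module does not support them. As a minimal sketch of the same idea under that constraint, the quoted-string branch can be left as a plain alternative that captures nothing, and the matches where group 1 is empty are then dropped in code. This is a swapped-in technique, not the pattern above verbatim; the function name `unquoted_words` and the sample string are hypothetical.

```python
import re

# Branch 1 consumes a whole double-quoted string (with backslash escapes)
# and captures nothing. Branch 2 captures a whitespace-delimited word.
PATTERN = re.compile(r'"[^"\\]*(?:\\.[^"\\]*)*"|(?<!\S)([^"\s]+)(?!\S)')

def unquoted_words(text):
    """Return whitespace-delimited tokens that lie outside double quotes."""
    return [
        m.group(1)
        for m in PATTERN.finditer(text)
        if m.group(1) is not None  # None means the quoted branch matched
    ]

# Hypothetical sample: two quoted strings, one with escaped quotes inside.
sample = 'foo "bar baz" qux "a \\"b\\" c" end'
print(unquoted_words(sample))  # ['foo', 'qux', 'end']
```

Because the quoted branch is tried first and consumes the entire string including its spaces, the word branch never gets a chance to start inside the quotes, which is the effect (*SKIP)(*F) achieves in PCRE.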

Upvotes: 1
