chethanjjj

Reputation: 53

Using Regex match for sentences that do not contain a specific word

I'm trying to develop a regular expression that extracts sentences that don't contain specific words. Here is a simple example:

Input: Sagittal scout images cervicothoracic : Mild-to-moderate multilevel spondylosis. Fracture present.

Desired Output: Fracture present.

Attempt #1:

Regex: [^.]*(?!cervi(c|x))[^.]*\.

Actual Output: Sagittal scout images cervicothoracic : Mild-to-moderate multilevel spondylosis. Fracture present.

Attempt #2:

Regex: [^.]*[^(cervi(c|x))][^.]*\.

Actual Output: Sagittal scout images cervicothoracic : Mild-to-moderate multilevel spondylosis. Fracture present.

These results can be verified at https://regexr.com/

Upvotes: 3

Views: 997

Answers (1)

Ryszard Czech

Reputation: 18631

Use

(?<![^.])\s*((?:(?!cervi[cx])[^.])*\.)

See proof
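
As a minimal sketch of how the pattern might be applied in Python (the function name `sentences_without_word` is made up for illustration, and the pattern is case-sensitive as written), `re.findall` returns the contents of capture group 1 for every match, so the leading whitespace consumed by `\s*` is not part of the result:

```python
import re

# Pattern copied from the answer above. Group 1 holds the sentence
# itself, without the whitespace that precedes it.
SENTENCE_WITHOUT_WORD = re.compile(r"(?<![^.])\s*((?:(?!cervi[cx])[^.])*\.)")


def sentences_without_word(text):
    """Return every '.'-terminated sentence that lacks 'cervic'/'cervix'."""
    # With exactly one capture group, findall yields that group's text.
    return SENTENCE_WITHOUT_WORD.findall(text)


report = (
    "Sagittal scout images cervicothoracic : "
    "Mild-to-moderate multilevel spondylosis. Fracture present."
)
print(sentences_without_word(report))  # ['Fracture present.']
```

Note that this treats every `.` as a sentence boundary, so abbreviations or decimal numbers inside a sentence would split it.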

Explanation

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    [^.]                     any character except: '.'
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
--------------------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
        cervi                    'cervi'
--------------------------------------------------------------------------------
        [cx]                     any character of: 'c', 'x'
--------------------------------------------------------------------------------
      )                        end of look-ahead
--------------------------------------------------------------------------------
      [^.]                     any character except: '.'
--------------------------------------------------------------------------------
    )*                       end of grouping
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
  )                        end of \1
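
To illustrate why the look-behind at the start matters, here is a small Python comparison. The `without_lookbehind` pattern is a hypothetical variant written only for this comparison, not something from the question or the answer. Without the look-behind, the regex engine is free to start a match in the middle of a sentence, just past the point where the forbidden word begins:

```python
import re

text = (
    "Sagittal scout images cervicothoracic : "
    "Mild-to-moderate multilevel spondylosis. Fracture present."
)

# Hypothetical variant: the answer's pattern with the look-behind
# (and the whitespace skipping) removed.
without_lookbehind = re.compile(r"(?:(?!cervi[cx])[^.])*\.")

# The pattern from the answer: a match may only start at the beginning
# of the string or right after a '.'.
with_lookbehind = re.compile(r"(?<![^.])\s*((?:(?!cervi[cx])[^.])*\.)")

partial = [m.group(0) for m in without_lookbehind.finditer(text)]
full = with_lookbehind.findall(text)

# The first partial match begins mid-word, after the 'c' of
# 'cervicothoracic', so a fragment of the unwanted sentence leaks through.
print(partial[0])
print(full)
```

The look-ahead inside the repeated group rejects the forbidden word at every position of a candidate sentence, while the look-behind ensures that a candidate can only begin at a sentence boundary.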

Upvotes: 1
