afshin

Reputation: 1833

Matching a regex pattern with a negative lookbehind

I'm trying to write a regular expression in Python that detects patterns like 8 cc and 2.8 mm and avoids patterns that contain a date, like 12/26/2018 cc.

The regex I tried for this pattern is: .*\d{1,}(?!/)(\s)(cc|mm|cm)

This is supposed to find patterns like 8 cc as long as the number is not preceded by a /.

This regex is finding all patterns and not avoiding the date. What is the problem with this regex?

Upvotes: 1

Views: 19

Answers (1)

Wiktor Stribiżew

Reputation: 626870

You may use

(?<!\d)(?<!\d/)\d+(?:\.\d+)?\s*(?:c[cm]|mm)\b

See the regex demo

Details

  • (?<!\d) - no digit immediately to the left is allowed
  • (?<!\d/) - no digit followed by / immediately to the left is allowed
  • \d+ - 1+ digits
  • (?:\.\d+)? - 1 or 0 occurrences of . and 1+ digits
  • \s* - 0+ whitespace characters
  • (?:c[cm]|mm)\b - cc, cm or mm as whole words.
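
To see why both lookbehinds are needed, here is a minimal sketch that drops one at a time and runs each variant on a date-like input. The variable names and the shortened sample string are illustrative assumptions, not part of the original question.

```python
import re

# Shortened sample: two valid measurements and one date followed by a unit.
sample = "8 cc, 2.8 mm, 12/26/2018 cc"

# Shared tail: number, optional fraction, optional whitespace, unit as a whole word.
unit = r"\d+(?:\.\d+)?\s*(?:c[cm]|mm)\b"

full = re.compile(r"(?<!\d)(?<!\d/)" + unit)   # both lookbehinds
only_slash = re.compile(r"(?<!\d/)" + unit)    # (?<!\d) dropped
only_digit = re.compile(r"(?<!\d)" + unit)     # (?<!\d/) dropped

print(full.findall(sample))        # => ['8 cc', '2.8 mm']
print(only_slash.findall(sample))  # => ['8 cc', '2.8 mm', '018 cc']
print(only_digit.findall(sample))  # => ['8 cc', '2.8 mm', '2018 cc']
```

Without (?<!\d) the match can start in the middle of the year, and without (?<!\d/) it can start right after the last slash of the date.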

Python demo:

import re
rx = re.compile(r"(?<!\d)(?<!\d/)\d+(?:\.\d+)?\s*(?:c[cm]|mm)\b")
s = "I'm trying to write a regular expression in python that detects patterns like 8 cc and 2.8 mm  and avoids patterns with date like 12/26/2018 cc"
print(rx.findall(s))  # => ['8 cc', '2.8 mm']
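
As for why the pattern in the question does not skip the date: (?!/) is a lookahead, so it inspects the character after the digits, not the one before them. That character must also match \s, so it can never be a /, and the check always passes. The leading .* additionally consumes everything up to the last digit, date included. A small sketch, with a shortened input string chosen here for illustration:

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r".*\d{1,}(?!/)(\s)(cc|mm|cm)")

m = original.search("12/26/2018 cc")

# The lookahead never rejects anything, and .* swallows the date,
# so the whole string is reported as a match.
print(m.group(0))  # => 12/26/2018 cc
```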

Upvotes: 1
