Reputation: 2411
I'm trying to match strings that have a year at the end of them, but only when they're not enclosed in brackets. Negative lookaheads and lookbehinds don't seem to help.
Here's some example text. I only want the first two lines matched, and not the third.
Example one 2015
Example two 2017
Example three (2009)
If I use something like (?<!\(\d{4}\)$)
or (?!\(\d{4}\)$)
then I get 54 matches instead of the expected 2 (one for each of the first two lines).
What am I doing wrong?
Upvotes: 1
Views: 64
Reputation: 18611
Restrict the match to years in the 1900s and 2000s (1900 to 2099):
\b(?:19|20)\d\d$
Or, any four digits as a whole word at the end of string:
\b\d{4}$
Explanation
--------------------------------------------------------------------------------
\b        the boundary between a word char (\w) and something that is not a word char
--------------------------------------------------------------------------------
(?:       group, but do not capture:
--------------------------------------------------------------------------------
19        '19'
--------------------------------------------------------------------------------
|         OR
--------------------------------------------------------------------------------
20        '20'
--------------------------------------------------------------------------------
)         end of grouping
--------------------------------------------------------------------------------
\d        digits (0-9)
--------------------------------------------------------------------------------
\d        digits (0-9)
--------------------------------------------------------------------------------
$         before an optional \n, and the end of the string
--------------------------------------------------------------------------------
\b        word boundary
--------------------------------------------------------------------------------
\d{4}     four digits
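As a quick check, here is a minimal Python sketch that runs both patterns against the sample text from the question. It assumes Python's re module and the re.MULTILINE flag, so that $ matches at the end of every line and not only at the end of the whole string. The variable names are made up for illustration.

```python
import re

# Sample text from the question, one entry per line.
text = "Example one 2015\nExample two 2017\nExample three (2009)"

# re.MULTILINE lets $ match at the end of each line.
century_years = re.findall(r"\b(?:19|20)\d\d$", text, flags=re.MULTILINE)
any_four_digits = re.findall(r"\b\d{4}$", text, flags=re.MULTILINE)

print(century_years)    # ['2015', '2017']
print(any_four_digits)  # ['2015', '2017']
```

The third line is skipped because the closing parenthesis sits between the digits and the end of the line, so $ cannot match directly after 2009.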
Upvotes: 2
Reputation: 8064
Try this:
^(.*[^\(]\d{4}[^\)]?)$
^          Start of line
(          Start of capturing group
.*         Any character, zero or more times
[^\(]      Any character except an opening parenthesis
\d{4}      Four digits (the year)
[^\)]?     Optionally, any character except a closing parenthesis
)          End of capturing group
$          End of line

https://regex101.com/r/zr2pfv/1
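A small sketch of how this pattern could be tried out in Python, assuming the re module with re.MULTILINE so that ^ and $ anchor on each line. The names line_pattern and matched_lines are illustrative only.

```python
import re

# Sample text from the question, one entry per line.
text = "Example one 2015\nExample two 2017\nExample three (2009)"

# The capturing group wraps the whole line, so findall returns full lines.
line_pattern = re.compile(r"^(.*[^\(]\d{4}[^\)]?)$", re.MULTILINE)
matched_lines = line_pattern.findall(text)

print(matched_lines)  # ['Example one 2015', 'Example two 2017']
```

Unlike the patterns that match only the year, this one captures the entire line, which may be handy if the surrounding text is needed as well.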
Upvotes: 2
Reputation: 1445
You could try matching on the next immediate character. For example:
\d{4}\s*$
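This matches lines that end in four digits, optionally followed by trailing whitespace. The line with (2009) is skipped because the closing parenthesis sits between the digits and the end of the line. Note that the pattern does not check what comes before the four digits, so a line ending in 12345 would also match.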
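For illustration, a minimal Python sketch that filters a list of lines with this pattern. It assumes the lines have already been split, and one sample line has trailing spaces added to show the effect of \s*. The names are hypothetical.

```python
import re

# Lines from the question; the second has trailing spaces added for illustration.
lines = ["Example one 2015", "Example two 2017  ", "Example three (2009)"]

trailing_year = re.compile(r"\d{4}\s*$")

# Keep only the lines where four digits are the last non-whitespace characters.
matched = [line for line in lines if trailing_year.search(line)]

print(matched)  # ['Example one 2015', 'Example two 2017  ']
```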
Upvotes: 2