Reputation: 1148
While attempting some number validation, there is one case where I want to exclude a number: when a hyphen comes directly before the four digits.
To simplify my regular expression, let's only worry about those 4 digits.
Since I'm using JavaScript, I can't use lookbehinds.
In an attempt to use a negative lookahead to match anything not containing a hyphen, I came up with:
((?!-).)\d{4}
My test data is below, bolded are the matches:
2014
1106 **2014** **9899**
**11500**
234-233-2014
234-234-1100
-1100
My expectation is that 2014, 1106, 2014 and 9899 match, whereas 11500 does not. I know the issue is with the period, which matches any character except line breaks. I am also trying to account for line breaks as I apply word boundaries to my regular expression.
Might there be a better solution where I can match only a 4 digit number not followed by a hyphen, or simply exclude any matches if they are preceded by a hyphen?
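To make the problem concrete, here is a minimal sketch that reproduces the behavior described above. The `input` string is an assumption that mirrors the test data with one entry per line; the variable names are only illustrative. The pattern requires one non-hyphen character before the four digits, and that character can itself be a digit, which is why 11500 slips through.

```javascript
// Assumed input: the test data from the question, one entry per line.
var input = "2014\n1106 2014 9899\n11500\n234-233-2014\n234-234-1100\n-1100";

// The original attempt: one character that is not a hyphen, then four digits.
var attempt = /((?!-).)\d{4}/g;

// Each match includes the leading character, because "." consumes it.
var attemptMatches = input.match(attempt);
console.log(attemptMatches); // [ ' 2014', ' 9899', '11500' ]
```

The first 2014 and 1106 are missed because nothing that "." can match precedes them on their line, while "11500" matches as the character "1" followed by "1500".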
Upvotes: 0
Views: 105
Reputation: 174806
Using a regex only:
(?:(?!\b-\b|-\b)(?:.|^))\b(\d{4})\b
The numbers are in capture group 1; the full match also includes the character before the digits.
Your JS code would be:
> console.log(text.match(/(?:(?!\b-\b|-\b)(?:.|^))\b(\d{4})\b/g));
[ '2014', ' 1106', ' 2014', ' 9899' ]
Or, to collect only capture group 1 (without the leading character that the full match includes):
> function getMatches(string, regex, index) {
... index || (index = 1);
... var matches = [];
... var match;
... while (match = regex.exec(string)) {
..... matches.push(match[index]);
..... }
... return matches;
... }
undefined
> var re = /(?:(?!\b-\b|-\b)(?:.|^))\b(\d{4})\b/g;
undefined
> console.log(getMatches(text, re, 1));
[ '2014', '1106', '2014', '9899' ]
Code stolen from here :-)
Upvotes: 2
Reputation: 9245
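This is a workaround in JavaScript that uses replace() as a match iterator. The regex optionally captures a leading hyphen, and the callback keeps only the matches without one: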
var text = "2014 \
1106 2014 9899 \
11500 \
\
234-233-2014 \
234-234-1100 \
-1100";
var a = [];
text.replace(/(-?\b\d{4}\b)/g, function(m){
    // keep the match only if it did not pick up a leading hyphen
    if(!m.match(/-/g)) a.push(m);
});
console.log(a);
Output:
["2014", "1106", "2014", "9899"]
Previous attempt (using a lookbehind, which JavaScript engines did not support before ES2018):
/(?<!-)\b(\d{4})\b/g
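For readers on an engine that implements ES2018 lookbehind assertions (an assumption; recent Node.js releases and current browsers do), the pattern can be tried directly. This is only a sketch: `sampleText` is a hypothetical string mirroring the question's test data, and the capture group is dropped because `match` with the `g` flag returns whole matches anyway.

```javascript
// Assumed input: the question's test data, one entry per line.
var sampleText = "2014\n1106 2014 9899\n11500\n234-233-2014\n234-234-1100\n-1100";

// Four digits bounded by word boundaries and not directly preceded by a hyphen.
// Requires lookbehind support; older engines reject this literal as a syntax error.
var viaLookbehind = sampleText.match(/(?<!-)\b\d{4}\b/g);
console.log(viaLookbehind); // [ '2014', '1106', '2014', '9899' ]
```

Because the lookbehind consumes nothing, the matches contain only the digits, with no leading character to strip.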
Upvotes: 0
Reputation: 2690
Match either four digits at the beginning of a line (use the m flag so that ^ matches at every line start, not only at the start of the string), or four digits that don't come after a hyphen:
/[^-]\b\d{4}\b|^\b\d{4}\b/
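Since the first alternative also consumes the character before the digits, one way to get just the numbers is to wrap the digits in a capture group. The following is a sketch under a few assumptions: the `g` and `m` flags are added, the two alternatives are folded into `(?:[^-]|^)`, and `sample`, `fourDigits`, and `found` are illustrative names, not part of the original answer.

```javascript
// Assumed input: the question's test data, one entry per line.
var sample = "2014\n1106 2014 9899\n11500\n234-233-2014\n234-234-1100\n-1100";

// Either a non-hyphen character or a line start, then exactly four digits
// between word boundaries. Group 1 holds only the digits.
var fourDigits = /(?:[^-]|^)\b(\d{4})\b/gm;

var found = [];
var m;
while ((m = fourDigits.exec(sample)) !== null) {
  found.push(m[1]); // skip the leading character in m[0]
}
console.log(found); // [ '2014', '1106', '2014', '9899' ]
```

Note that `[^-]` also matches a line break, so a number at the start of a later line is found through either alternative; the `^` branch mainly matters for a number at the very start of the string.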
Upvotes: 0
Reputation: 4069
Although this doubles up your searches, you can do a lookahead with both a positive and negative component to it:
(?=(?!-)\d{4})\b\d{4,}\b
This regex101 example doesn't capture the numbers, where this regex101 example does.
Upvotes: 1