Reputation: 563
Working in C# .Net 4.5
I need an expression that will look through a string and fail the match if the string has two or more capital characters anywhere in the string.
What I think should be the correct pattern is this:
(?![A-Z]{2,})\w
Note: I tried both ?! and ?<!
I got the opposite to work: searching a string and returning true if there are 2 or more capitals in a row. That pattern is as follows:
(?=[A-Z]{2,})\w
But I have to have this working off of the negative lookahead pattern.
From all the posts I've read this should be the correct way to do it, but it's not working for me.
I've read through questions such as "C# regexp negative lookahead" or "Regex negative lookahead in c#", etc.
I don't want to list them all, but they all say more or less the same thing: just use the negative lookahead (?!).
Can anyone see what I'm doing wrong for this not to work?
Edit:
added some examples:
Advanced version:
Upvotes: 5
Views: 6248
Reputation:
You only need to FAIL a match if you are trying to match something.
What you are trying to match is the failure.
If [A-Z].*?[A-Z] matches, the string contains two capital letters.
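A minimal C# sketch of that idea, assuming a hypothetical helper class (the class and method names are illustrative, not from the question): match the unwanted condition with a plain pattern and negate the result in code. The second method runs the question's own pattern to show why it seems to "not work": the lookahead is unanchored, so the engine just retries at the next position until the lookahead passes.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
public static class CapsCheck
{
    // Matches when the string contains two capital letters anywhere.
    private static readonly Regex TwoCaps = new Regex("[A-Z].*?[A-Z]");

    // Passes only when the "failure" pattern is NOT found.
    public static bool HasFewerThanTwoCaps(string input)
    {
        return !TwoCaps.IsMatch(input);
    }

    // The question's pattern. It is unanchored, so the engine retries
    // at every position and returns the first one where the lookahead passes.
    public static string FirstMatchOfOriginal(string input)
    {
        return Regex.Match(input, @"(?![A-Z]{2,})\w").Value;
    }
}
```

For "HEllo", HasFewerThanTwoCaps returns false, while the question's pattern still matches the "E" at index 1, because "El" is not a run of two capitals. Even "HELLO" matches, on the final "O".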
If the rule is no two capitals in a row, it's this (multi-line mode):
^[^A-Z\r\n]*(?:[A-Z](?![A-Z])[^A-Z\r\n]*)*$
To match a non-empty string, just add a simple assertion.
^(?!$)[^A-Z\r\n]*(?:[A-Z](?![A-Z])[^A-Z\r\n]*)*$
For Unicode properties, use the \p{Lu} form:
^[^\p{Lu}\r\n]*(?:\p{Lu}(?!\p{Lu})[^\p{Lu}\r\n]*)*$
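To illustrate the difference between the two forms, here is a small sketch, assuming a hypothetical class name and a made-up test word. With [A-Z], a capital outside ASCII such as "É" (U+00C9) is treated as a non-capital, so it is not counted; with \p{Lu} it is.

```csharp
using System.Text.RegularExpressions;

// Hypothetical comparison helper; names are illustrative only.
public static class UnicodeCapsCheck
{
    // ASCII-only form: only A-Z count as capitals.
    private static readonly Regex AsciiOnly =
        new Regex(@"^[^A-Z\r\n]*(?:[A-Z](?![A-Z])[^A-Z\r\n]*)*$");

    // Unicode form: any character in category Lu counts as a capital.
    private static readonly Regex UnicodeAware =
        new Regex(@"^[^\p{Lu}\r\n]*(?:\p{Lu}(?!\p{Lu})[^\p{Lu}\r\n]*)*$");

    public static bool PassesAscii(string input)
    {
        return AsciiOnly.IsMatch(input);
    }

    public static bool PassesUnicode(string input)
    {
        return UnicodeAware.IsMatch(input);
    }
}
```

With the input "\u00C9Cole" (É followed by C), PassesAscii returns true because É is not in A-Z, while PassesUnicode returns false because both letters are uppercase.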
Input:
1. Hello - should pass
2. HEllo - should fail
3. heLLo - should fail
4. HELLO - should fail
Advanced version:
1. Hello World - should pass
2. Hello WOrld - should fail
3. hello wORld - should fail
4. hello WORLD - should fail
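The inputs above can be checked with a short harness. This is only a sketch: the class and method names are hypothetical, and it uses the non-empty ASCII variant of the pattern. The second method assumes lines separated by "\n" and applies RegexOptions.Multiline so that ^ and $ anchor at each line.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical harness; names are illustrative only.
public static class SampleInputs
{
    private const string Pattern =
        @"^(?!$)[^A-Z\r\n]*(?:[A-Z](?![A-Z])[^A-Z\r\n]*)*$";

    // Whole-string check: true when no two capitals are adjacent.
    public static bool IsValid(string input)
    {
        return Regex.IsMatch(input, Pattern);
    }

    // Multi-line mode: ^ and $ anchor at each "\n"-separated line,
    // so only the lines that pass are returned.
    public static List<string> ValidLines(string text)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(text, Pattern, RegexOptions.Multiline))
        {
            result.Add(m.Value);
        }
        return result;
    }
}
```

IsValid returns true for "Hello" and "Hello World" and false for the other six inputs. ValidLines("Hello\nHEllo\nheLLo") returns a list containing only "Hello".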
Benchmark
Regex1: ^(?!.*\b\w*\p{Lu}\w*\p{Lu}).*$
Options: < ICU - m >
Completed iterations: 80 / 80 ( x 1000 )
Matches found per iteration: 5
Elapsed Time: 8.28 s, 8279.28 ms, 8279281 µs
Regex2: ^[^\p{Lu}\r\n]*(?:\p{Lu}(?!\p{Lu})[^\p{Lu}\r\n]*)*$
Options: < ICU - m >
Completed iterations: 80 / 80 ( x 1000 )
Matches found per iteration: 5
Elapsed Time: 3.88 s, 3875.04 ms, 3875039 µs
Upvotes: 1
Reputation: 626699
You can use the following regex:
^(?!.*\b\w*\p{Lu}\w*\p{Lu}).*$
See regex demo
It will match an empty string, too, but you can use the + quantifier instead of * to require at least 1 character.
To match a newline with this pattern, you will need to use the RegexOptions.Singleline modifier.
The negative lookahead (?!.*\b\w*\p{Lu}\w*\p{Lu}) anchored at the start of the string will fail the match once a word is found that starts with zero or more word characters, followed by a capital letter, again followed by zero or more word characters and then again an uppercase letter. You can shorten this with a limiting quantifier:
^(?!.*\b(?:\w*\p{Lu}){2}).*$
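A small C# sketch of how these variants behave, assuming a hypothetical wrapper class (all names are illustrative). It covers the long and shortened forms, the + variant that rejects an empty string, and the effect of RegexOptions.Singleline on input containing a newline.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper; names are illustrative only.
public static class WordCapsCheck
{
    private const string Long = @"^(?!.*\b\w*\p{Lu}\w*\p{Lu}).*$";
    private const string Short = @"^(?!.*\b(?:\w*\p{Lu}){2}).*$";
    private const string NonEmpty = @"^(?!.*\b\w*\p{Lu}\w*\p{Lu}).+$";

    // True when no single word contains two uppercase letters.
    public static bool Passes(string input)
    {
        return Regex.IsMatch(input, Long);
    }

    // Same check using the limiting quantifier {2}.
    public static bool PassesShort(string input)
    {
        return Regex.IsMatch(input, Short);
    }

    // Same check, but .+ requires at least one character.
    public static bool PassesNonEmpty(string input)
    {
        return Regex.IsMatch(input, NonEmpty);
    }

    // Singleline lets . match "\n", so the whole input is scanned.
    public static bool PassesAcrossLines(string input)
    {
        return Regex.IsMatch(input, Long, RegexOptions.Singleline);
    }
}
```

Passes("Hello World") is true and Passes("Hello WOrld") is false. Passes("Hello\nWorld") is false without Singleline, because .* stops at the newline and $ cannot match there, whereas PassesAcrossLines accepts it. Note that this pattern counts capitals per word, so "HeLlo" is rejected even though its capitals are not adjacent.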
Upvotes: 3