Reputation: 7362
I'd like a regular expression to match a string only if it contains a character that occurs a predefined number of times.
For example, I want to match all strings that contain the character "_" exactly 3 times.
So
"a_b_c_d" would pass
"a_b" would fail
"a_b_c_d_e" would fail
Does someone know a simple regular expression that would satisfy this?
Thank you
Upvotes: 4
Views: 10161
Reputation: 7704
Elaborating on Rado's answer, which is so far the most versatile but could be a pain to write if there are more occurrences to match:
^([^_]*_){3}[^_]*$
It matches entire strings (from the beginning ^ to the end $) that contain exactly 3 ({3}) repetitions of the pattern made of 0 or more (*) characters other than an underscore ([^_]) followed by one underscore (_), the whole being followed by 0 or more characters other than an underscore ([^_]*, again).
Of course, one could alternatively group the other way round, since in our case the pattern is symmetric:
^[^_]*(_[^_]*){3}$
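To address the "pain to write" concern for other characters or counts, here is a minimal Python sketch that builds the same kind of pattern from a character and a count. The helper name exact_count_pattern is made up for illustration, and Python is only one possible host language; the regex itself is the second grouping shown above, with a non-capturing group.

```python
import re

def exact_count_pattern(char, count):
    """Build a regex matching strings that contain `char` exactly `count` times.

    Hypothetical helper, not part of any library.
    """
    c = re.escape(char)  # escape in case `char` is a regex metacharacter
    # ^ (non-char)* then (char (non-char)*) repeated `count` times, then $
    return re.compile(rf"^[^{c}]*(?:{c}[^{c}]*){{{count}}}$")

three_underscores = exact_count_pattern("_", 3)

# The three strings from the question
for s in ["a_b_c_d", "a_b", "a_b_c_d_e"]:
    print(s, bool(three_underscores.match(s)))
```

With the question's inputs, only "a_b_c_d" should be reported as a match.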
Upvotes: 1
Reputation: 2145
If your examples are the only possibilities (like a_b_c_...), then the others are fine, but I wrote one that will handle some other possibilities, such as:
a__b_adf
a_b_asfdasdfasfdasdfasf_asdfasfd
___
_a_b_b
Etc.
Here's my regex.
\b(_[^_]*|[^_]*_|_){3}\b
Upvotes: 0
Reputation: 56915
For your example, you could do:
\b[a-z]*(_[a-z]*){3}[a-z]*\b
(with an ignore case flag).
It says "match 0 or more letters, followed by '_[a-z]*' exactly three times, followed by 0 or more letters". The \b means "word boundary", i.e. "match a whole word".
Since I've used '*', this will match any word with exactly three "_" in it, regardless of whether they appear at the start or end of the word; you can modify it otherwise.
Also, I've assumed you want to match every word in a string that contains exactly three "_".
That means the string "a_b a_b_c_d" would say that "a_b_c_d" passed (but "a_b" fails).
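As a rough illustration of the word-level behaviour, here is a small Python sketch (Python is my choice here, and the variable names are made up) that applies the pattern with the ignore-case flag and collects each whole-word match:

```python
import re

# Word-level pattern from this answer, with the ignore-case flag
word_pattern = re.compile(r"\b[a-z]*(_[a-z]*){3}[a-z]*\b", re.IGNORECASE)

text = "a_b a_b_c_d"
# group(0) is the full match; the capture group only holds the last repetition
matches = [m.group(0) for m in word_pattern.finditer(text)]
print(matches)  # ['a_b_c_d']
```

Note that finditer with group(0) is used rather than findall, because findall would return the contents of the capture group instead of the whole word.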
If you mean that globally across the entire string you only want three "_" to appear, then use:
^[^_]*(_[^_]*){3}[^_]*$
This anchors the regex at the start and end of the string, ensuring that it contains exactly three occurrences of "_".
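For comparison, a short Python sketch of the whole-string version (the function name has_exactly_three is hypothetical). It shows that a string such as "a_b a_b_c_d", which contains one matching word, fails here because it holds four underscores in total:

```python
import re

# Anchored pattern from this answer: exactly three "_" in the whole string
whole_string = re.compile(r"^[^_]*(_[^_]*){3}[^_]*$")

def has_exactly_three(s):
    """Return True if `s` contains exactly three underscores overall."""
    return whole_string.match(s) is not None

for s in ["a_b_c_d", "a_b", "a_b_c_d_e", "a_b a_b_c_d"]:
    print(repr(s), has_exactly_three(s))
```

Only "a_b_c_d" should come back True.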
Upvotes: 5