Reputation: 2089
I would like to know why the regex below is accepting 1.
"((^G0{0,2}$)|(^T|^R0{0,2}$)){0,5}"
I would like my regex to accept the sequences G00, G01, T00, R00 any number of times. At the moment I'm only trying to have G00, T00, R00 any number of times, but my regex is also accepting 1 as input. The regex should also accept G, G0, T, T0, R, R0, but the goal is to have a sequence of 3 characters.
Upvotes: 0
Views: 98
Reputation: 31184
Right now, your regex can match the empty string, so it finds a zero-length match in any input.
(...){0,5}
can match ...
zero times, and zero repetitions succeed at the start of every string.
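A quick way to see the zero-repetition match is to inspect the match span. The question doesn't say which regex engine is in use, so this is only a sketch assuming Python's `re` module; the pattern itself is copied from the question unchanged.

```python
import re

# Pattern from the question, unchanged.
pattern = re.compile(r"((^G0{0,2}$)|(^T|^R0{0,2}$)){0,5}")

m = pattern.search("1")

# The group repeats zero times, so the match is empty and sits at index 0.
zero_width = (m.start(), m.end(), m.group(0))
print(zero_width)
```

The match object exists, but it covers no characters, which is why the input 1 looks "accepted".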
Your specific requirement (to match only those 4 inputs) would probably want a regex like this:
^(?:G01|[GRT]00)$
http://rubular.com/r/BrlxDfGkdf
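One thing to watch with anchored alternations: `|` binds more loosely than `^` and `$`, so a pattern written as `^(?:G01)|[GRT]00$` parses as "`G01` at the start, or `[GRT]00` at the end". Putting both alternatives inside one group keeps the anchors on both. A minimal comparison, again assuming Python's `re` (the sample strings are made up for illustration):

```python
import re

loose = re.compile(r"^(?:G01)|[GRT]00$")   # parses as (^G01) | ([GRT]00$)
strict = re.compile(r"^(?:G01|[GRT]00)$")  # anchors apply to both alternatives

# Hypothetical sample inputs, not taken from the question.
samples = ["G00", "G01", "T00", "R00", "G01xyz", "xyzT00", "1"]

loose_hits = [s for s in samples if loose.search(s)]
strict_hits = [s for s in samples if strict.search(s)]

print(loose_hits)
print(strict_hits)
```

The loosely anchored form also accepts `G01xyz` and `xyzT00`; the grouped form accepts only the four exact tokens.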
If you want to be able to get multiple matches per line, then just leave off the anchors ^ and $:
(?:G01)|[GRT]00
http://rubular.com/r/3ODzf08eT5
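Without the anchors, the pattern can be used to pull every token out of a line. A small sketch, assuming Python's `re` and a made-up input line:

```python
import re

token = re.compile(r"(?:G01)|[GRT]00")

# Hypothetical input line; X99 is there to show a non-matching token.
line = "G00 T00 X99 R00 G01"

# findall returns whole matches here because the pattern has no capture groups.
found = token.findall(line)
print(found)
```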
Upvotes: 1
Reputation: 316
I think it's because you allow 0-5 repetitions of the group, so any input can match it 0 times. Why not force it to match at least once?
"((^G0{0,2}$)|(^T|^R0{0,2}$))+"
Upvotes: 0
Reputation: 198324
The regexp is matching zero repetitions of the alternation, with match length 0. (If you repeat it 0 times, the ^ anchor does not fire, so it can match anywhere.) You should move the anchors outside the repetition. Something like...
^(?:[GTR]\d{0,2})+$
^             start
(?: ... )+    any number of repetitions (1+) of
[GTR]         any of "G", "T", or "R"
\d{0,2}       0-2 digits
$             end
If your main sequence is repeating, capture groups don't make any sense, so I've stripped them.
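To check how this pattern behaves on a few inputs, here is a minimal sketch assuming Python's `re`; the helper name `is_valid` and the test strings are illustrative, not from the question.

```python
import re

# Anchors sit outside the repeated, non-capturing group.
sequence = re.compile(r"^(?:[GTR]\d{0,2})+$")

def is_valid(text: str) -> bool:
    """Return True if text is one or more G/T/R tokens with 0-2 digits each."""
    return sequence.search(text) is not None

for candidate in ["G00T00R00", "G", "T0", "1", "", "G00X", "G123"]:
    print(repr(candidate), is_valid(candidate))
```

With the anchors outside and `+` requiring at least one token, `1` and the empty string are rejected. Note that `\d{0,2}` accepts any digits, so `G99` passes too; if only `00` and `01` should be allowed, the digit part would need to be tightened.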
Upvotes: 2