Reputation: 841
I have a question in a Google Form for which I want to set a response validation to match "in 25 words or fewer".
The regex I've tried is ^(\b.+){1,25}$
but that isn't working: a single paragraph of more than 25 words passes validation, and two ten-word paragraphs fail it.
I do want to allow multiple lines/paragraphs, because people are people and they'll just get confused if that weren't allowed.
These should pass:
These should fail:
Suggestions?
Upvotes: 1
Views: 2245
Reputation: 664548
You're looking for
/^(?:\s*\S+(?:\s+\S+){0,24})?\s*$/
which avoids catastrophic backtracking by always matching exactly one whole word in each repetition. It's (\s+\S+){0,25} with the first repetition factored out to allow any amount of whitespace, including none (*), instead of at least one (+).
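As a quick sanity check, here is a minimal sketch that runs the pattern above through Python's re module. Python is only a stand-in for whatever engine your form uses, and the helper name within_limit and the sample strings are made up for illustration.

```python
import re

# The pattern from above, written as a Python raw string.
WORD_LIMIT = re.compile(r"^(?:\s*\S+(?:\s+\S+){0,24})?\s*$")


def within_limit(text: str) -> bool:
    """Return True if text has at most 25 whitespace-separated words."""
    return WORD_LIMIT.match(text) is not None


# Made-up sample inputs.
twenty_five = " ".join(["word"] * 25)
twenty_six = " ".join(["word"] * 26)
two_paragraphs = " ".join(["word"] * 10) + "\n\n" + " ".join(["word"] * 10)

print(within_limit(twenty_five))     # True
print(within_limit(twenty_six))      # False
print(within_limit(two_paragraphs))  # True: \s+ also matches the blank line
print(within_limit(""))              # True: the whole word group is optional
```

If an empty answer should be rejected, that is probably easier to handle with the form's "required" setting than in the regex.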
You could also use the easier to read (\s*\S+){0,25} with a negative lookahead to ensure matching whole words:
/^(?:\s*\S+(?!\S)){0,25}\s*$/
Alternatively, possessive quantifiers ({0,25}+) are the best solution if your regex engine supports them.
And of course you can swap out \s/\S for \W/\w if you desire, and then also use a word boundary instead of the lookahead:
/^(?:\W*\w+\b){0,25}\W*$/
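Keep in mind that the two variants count "words" differently: with \w, an apostrophe or hyphen splits a word in two. A small sketch of the difference, again using Python's re as a stand-in engine with made-up sample text:

```python
import re

# Lookahead variant: a word is any run of non-whitespace characters.
WHITESPACE_WORDS = re.compile(r"^(?:\s*\S+(?!\S)){0,25}\s*$")
# Word-boundary variant: a word is any run of \w characters.
WORD_CHAR_WORDS = re.compile(r"^(?:\W*\w+\b){0,25}\W*$")

# 13 whitespace-separated words, but 26 runs of \w ("don" and "t" each time).
text = " ".join(["don't"] * 13)

print(WHITESPACE_WORDS.match(text) is not None)  # True: counted as 13 words
print(WORD_CHAR_WORDS.match(text) is not None)   # False: counted as 26 words
```

Which definition is right depends on how strictly you want to count; for free-text answers the whitespace-based one is probably closer to what respondents expect.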
Upvotes: 3
Reputation: 9786
Assuming ^ and $ are ok:
^(([^\s]+)\s?){1,25}$
It looks like the optional \s? at the end of the repeated group was triggering the catastrophic backtracking. Rewriting without it makes the pattern a bit longer, as the first word and the next 24 are matched separately:
^[^\s]+(\s([^\s]+)){0,24}\s?$
(the \s pattern matches whitespace)
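One caveat, shown in the minimal sketch below (Python's re as a stand-in engine, made-up sample strings): because each word is preceded by exactly one \s, this pattern accepts a single line break between words but not a double space or a blank line between paragraphs. Replacing \s with \s+ inside the group would be one way to relax that.

```python
import re

# The rewritten pattern from above: exactly one whitespace character between words.
SINGLE_SEPARATOR = re.compile(r"^[^\s]+(\s([^\s]+)){0,24}\s?$")


def matches(text: str) -> bool:
    """Return True if the pattern accepts text."""
    return SINGLE_SEPARATOR.match(text) is not None


print(matches("one two three"))        # True
print(matches("one two\nthree"))       # True: a single newline is one \s
print(matches("one  two three"))       # False: double space
print(matches("one two\n\nthree"))     # False: blank line between paragraphs
print(matches(" ".join(["w"] * 26)))   # False: 26 words
```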
Upvotes: 0