Erics

Reputation: 841

Regex for "25 words or fewer"?

I have a question in a Google Form for which I want to set a response validation that enforces "in 25 words or fewer".

The regex I've tried is ^(\b.+){1,25}$ but that isn't working: a single paragraph of more than 25 words passes validation, and two ten-word paragraphs fail it.

I do want to allow multiple lines/paragraphs because people are people and they'll just get confused if it were not allowed.

These should pass:

These should fail:

Suggestions?

Upvotes: 1

Views: 2245

Answers (2)

Bergi

Reputation: 664548

You're looking for

/^(?:\s*\S+(?:\s+\S+){0,24})?\s*$/

which avoids catastrophic backtracking by always matching exactly one whole word in each repetition. It is (\s+\S+){0,25} with the first repetition factored out, so that the whitespace before the first word is optional (\s*, which also allows none) rather than required (\s+).
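
As a quick sanity check, here is a minimal sketch that runs the pattern above through Python's re module. Python is only a stand-in here: the form's own regex engine may behave differently, and the helper name within_limit and the sample inputs are made up for illustration.

```python
import re

# The pattern from this answer, without the surrounding slashes.
AT_MOST_25_WORDS = re.compile(r"^(?:\s*\S+(?:\s+\S+){0,24})?\s*$")


def within_limit(text: str) -> bool:
    """Return True when text holds 25 or fewer whitespace-separated words."""
    return AT_MOST_25_WORDS.match(text) is not None


# Made-up sample inputs (not the asker's original examples).
ten_words = " ".join(["word"] * 10)
samples = {
    "25 words on one line": " ".join(["word"] * 25),
    "26 words on one line": " ".join(["word"] * 26),
    "two ten-word paragraphs": ten_words + "\n\n" + ten_words,
    "empty response": "",
}

for label, sample in samples.items():
    print(label, "->", within_limit(sample))
```

Because \s also matches line breaks, the blank line between paragraphs is treated like any other separator, which is what lets multi-paragraph answers through.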

You could also use the easier to read (\s*\S+){0,25} with a negative lookahead to ensure matching whole words:

/^(?:\s*\S+(?!\S)){0,25}\s*$/

Alternatively, possessive quantifiers ({0,25}+) are the best solution if your regex engine supports them.

And of course you can swap out \s/\S for \W/\w if you desire, and then also use a word boundary instead of the lookahead:

/^(?:\W*\w+\b){0,25}\W*$/
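
The \S-based and \w-based patterns do not always agree on what a "word" is. The sketch below, again using Python's re module purely as a stand-in engine, compares the lookahead variant with the \w variant on a made-up input containing an apostrophe. The names BY_WHITESPACE, BY_WORD_CHARS and fits are hypothetical.

```python
import re

# Lookahead variant: a word is any run of non-whitespace characters.
BY_WHITESPACE = re.compile(r"^(?:\s*\S+(?!\S)){0,25}\s*$")
# \w variant: a word is a run of letters, digits or underscores.
BY_WORD_CHARS = re.compile(r"^(?:\W*\w+\b){0,25}\W*$")


def fits(pattern: re.Pattern, text: str) -> bool:
    """Return True when the whole text matches the given pattern."""
    return pattern.match(text) is not None


# 25 whitespace-separated tokens, the first containing an apostrophe.
sample = "don't " + " ".join(["word"] * 24)

print("by whitespace:", fits(BY_WHITESPACE, sample))
print("by word chars:", fits(BY_WORD_CHARS, sample))
```

With \w, "don't" is counted as two words ("don" and "t"), so this input is over the limit for the second pattern while still fitting the first. Which behaviour is preferable depends on how strictly the limit is meant to be read.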

Upvotes: 3

gordy

Reputation: 9786

Assuming ^ and $ are ok:

^(([^\s]+)\s?){1,25}$

It looks like the trailing \s? inside the repeated group was triggering the catastrophic backtracking. Rewriting without it makes the pattern a bit longer, because the first word and the next 24 are matched separately:

^[^\s]+(\s([^\s]+)){0,24}\s?$

(the \s pattern matches whitespace)
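
A small check of the second pattern, sketched with Python's re module as a stand-in engine (the form's validator may differ). The helper name accepts and the inputs are made up. Note that each separator here is a single \s, so under this engine words separated by a blank line are not accepted, and at least one word is required.

```python
import re

# The second pattern from this answer.
SECOND_PATTERN = re.compile(r"^[^\s]+(\s([^\s]+)){0,24}\s?$")


def accepts(text: str) -> bool:
    """Return True when the text matches the second pattern."""
    return SECOND_PATTERN.match(text) is not None


# Made-up sample inputs.
ten_words = " ".join(["word"] * 10)
one_line = " ".join(["word"] * 20)
two_paragraphs = ten_words + "\n\n" + ten_words
too_many = " ".join(["word"] * 26)

print("20 words, one line:", accepts(one_line))
print("two ten-word paragraphs:", accepts(two_paragraphs))
print("26 words, one line:", accepts(too_many))
```

If blank lines between paragraphs need to pass, replacing the single \s between words with \s+ would be one way to adapt it.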

Upvotes: 0
