sideroxylon

Reputation: 4416

Regex for alphanumeric string with a maximum number of spaces

I need a JS regex to match a string based only on a known first and last sub-string and number of spaces - and I don't care about the length or the nature of what is between the first and last sub-strings (other than the exact number of spaces).

The following is a possible start string (from which I get the first and last sub-strings and the number of spaces):

cat apple dog mouse

From this, I now know the string starts with cat, ends with mouse and contains exactly 3 spaces (they could be anywhere between the ends, but they will not be consecutive).

The string I need to match against might be:

catfish mouse mouse dormouse mouse mouse

or cat mouse mouse mouse mouse mouse

So, what I need to match would be, in the first case catfish mouse mouse dormouse, and in the second case cat mouse mouse mouse - in both cases a string starting with cat, ending with mouse and containing exactly 3 spaces. At the moment, all my attempts match the entire sample string above, not just from cat to the third mouse. Here is my latest failure:

cat(?:(?![\s]{4,}).*)mouse

I have a strong suspicion I'm overthinking this - but thanks for any suggestions.

Upvotes: 1

Views: 104

Answers (2)

Dal

Reputation: 164

I'm not anywhere near as good at regex as nu11p01n73R is but I tried for fun:

/cat[^\s]*(\s[^\s]+){2}\s[^\s]*?mouse/

It is ugly, but it worked when I tested it. Step by step:

looks for 'cat'

then runs through any non-whitespace

do twice {

then looks for a whitespace character

then looks for at least one non-whitespace character

}

then looks for one more whitespace character (the third)

then runs lazily through any non-whitespace

until it finds the first 'mouse'
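
A quick way to check this pattern against the two sample strings from the question, as a minimal sketch (the variable names dalPattern, dalSamples and dalResults are only illustrative):

```javascript
// The pattern from this answer: "cat", then exactly three whitespace
// characters separated by non-whitespace runs, then the first "mouse"
// that can be reached after the third whitespace character.
const dalPattern = /cat[^\s]*(\s[^\s]+){2}\s[^\s]*?mouse/;

const dalSamples = [
  "catfish mouse mouse dormouse mouse mouse",
  "cat mouse mouse mouse mouse mouse",
];

// match() returns an array whose first element is the matched text,
// or null when nothing matches.
const dalResults = dalSamples.map((s) => {
  const m = s.match(dalPattern);
  return m ? m[0] : null;
});

console.log(dalResults);
// [ 'catfish mouse mouse dormouse', 'cat mouse mouse mouse' ]
```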

Upvotes: 0

nu11p01n73R

Reputation: 26667

You can write a regex without lookaheads to do this.

Example

\bcat(?:[^\s]*\s){3}[^\s]*mouse\b

Regex Demo


What does it do?

  • \b Matches a word boundary. The leading one keeps cat from matching in the middle of a word such as bobcat; the trailing one ensures the match doesn't end inside a longer word such as mousexyz.
  • cat Matches cat at the start of the match.
  • (?:[^\s]*\s){3}
    • [^\s]* Matches any run of non-whitespace characters (possibly empty), so it covers a single word or the rest of one, and the following \s matches the whitespace character after it.
    • {3} Makes sure that the word-plus-whitespace group is repeated exactly 3 times.
  • [^\s]* Matches any non-whitespace characters after the third whitespace character.
  • mouse Matches mouse at the end of the match.
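
Since the question derives the first word, last word and space count from a start string, the same pattern can be built dynamically. This is only a sketch: buildPattern and escapeRegExp are hypothetical helper names, and it assumes the first and last words begin and end with word characters so that \b behaves as described above.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: take the first word, the last word and the number of
// spaces from a start string, then build the pattern explained above.
function buildPattern(startString) {
  const words = startString.trim().split(/\s+/);
  const first = escapeRegExp(words[0]);
  const last = escapeRegExp(words[words.length - 1]);
  const spaces = words.length - 1;
  return new RegExp(`\\b${first}(?:[^\\s]*\\s){${spaces}}[^\\s]*${last}\\b`);
}

const built = buildPattern("cat apple dog mouse");
console.log(built.source);
// \bcat(?:[^\s]*\s){3}[^\s]*mouse\b

const builtMatches = [
  "catfish mouse mouse dormouse mouse mouse",
  "cat mouse mouse mouse mouse mouse",
].map((s) => s.match(built)[0]);

console.log(builtMatches);
// [ 'catfish mouse mouse dormouse', 'cat mouse mouse mouse' ]
```

Escaping the words matters only if they can contain regex metacharacters; for plain words like cat and mouse it changes nothing.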

Why doesn't cat(?:(?![\s]{4,}).*)mouse work?

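  • (?![\s]{4,}) This negative lookahead only checks that cat is not immediately followed by 4 or more whitespace characters. It is evaluated once, at that position, so it does not limit the total number of spaces in the match. The check succeeds for all the input strings, and the greedy .* then runs as far as the last mouse, so the whole string matches.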
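
The behaviour can be confirmed with a short check, shown here as an illustrative sketch (failingPattern, failingInput and failingMatch are made-up names):

```javascript
// The pattern from the question. The lookahead is tested only once, right
// after "cat", and the greedy .* then extends to the last "mouse".
const failingPattern = /cat(?:(?![\s]{4,}).*)mouse/;
const failingInput = "cat mouse mouse mouse mouse mouse";

const failingMatch = failingInput.match(failingPattern)[0];

// The match is the entire input, not just "cat mouse mouse mouse".
console.log(failingMatch === failingInput); // true
```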

Upvotes: 1
