Sneazel

Reputation: 11

RegEx: Get every word until last 4 words

I have strings like

  1. wwww-wwww-wwww
  2. wwww-www-ww-ww

Each string is a run of w characters separated by -.
The grouping is not regular, though: it is not always wwww-wwww, it could be w-w-w-w as well.

I'm trying to find a regex that captures every w up to, but not including, the last 4 w's.
For example 1, the result would be the first 8 w's (wwww-wwww).
For example 2, it would be the first 7 w's (wwww-www).

Is it possible to do this in regex? I have something like this right now:

^\w*(?=\w{4}$)

or maybe

[^-]*(?=\w{4}$)

I have 2 problems with my "solutions":

  1. In example 2, the last 4 w's are not matched, because they are interrupted by the -.

  2. The w's before the last 4 are not all matched either, because they are also interrupted by the -.

Upvotes: 1

Views: 65

Answers (1)

Tim Pietzcker

Reputation: 336428

Yes, it's possible with a slightly more sophisticated lookahead assertion:

/\w(?=(?:-*\w){4,}$)/x

Explanation:

/       # Start of regex
\w      # Match a "word" character
(?=     # only if the following can be matched afterwards:
 (?:    # (Start of non-capturing group)
  -*    #  - zero or more separators
  \w    #  - exactly one word character
 ){4,}  # (End of non-capturing group), repeated 4 or more times.
 $      # Then make sure we've reached the end of the string.
)       # End of lookahead assertion
/x      # End of regex; the x flag allows the whitespace and comments used here

Test it live on regex101.com.
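
Note that the pattern matches one word character at a time, so you get a list of single-character matches and the separators are not part of them. As a rough sketch of how that could be used from Python (the helper names `chars_before_last_four` and `prefix_before_last_four` are made up for illustration, and Python's `re` module is an assumption since the question doesn't name a language):

```python
import re

# The pattern from above: a word character that is followed by at least four
# more word characters (each optionally preceded by "-") up to the end.
PATTERN = re.compile(r"\w(?=(?:-*\w){4,}$)")


def chars_before_last_four(s: str) -> str:
    """Join the single-character matches; separators are dropped."""
    return "".join(PATTERN.findall(s))


def prefix_before_last_four(s: str) -> str:
    """Slice the original string up to the end of the last match,
    which keeps the separators inside the prefix."""
    last_end = 0
    for m in PATTERN.finditer(s):
        last_end = m.end()
    return s[:last_end]


print(chars_before_last_four("wwww-wwww-wwww"))   # wwwwwwww
print(prefix_before_last_four("wwww-wwww-wwww"))  # wwww-wwww
print(repr(prefix_before_last_four("w-w-w-w")))   # ''
```

If you want the prefix with its separators intact, slicing up to the end of the last match (the second helper) is one simple option. A string with only four word characters produces no match at all.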

Upvotes: 1
