Edward Wu

Reputation: 495

Find the first instance of a character, then work backwards

I'm having trouble wrapping my head around how to find the 1st instance of something, then work "backwards" using Regex...

I have some strings where a product code is combined with a product name. Unfortunately, the delimiter (a dash) separating the product code from the product name is the same character that appears inside the product code itself.

The product code can have different numbers of delimiters. Some product codes have one dash, while others might have multiple dashes.

But, I know that all product names have a space.

So taking these two strings, for example:

I'd like to do the equivalent of:

So what I want to extract from the above 2 examples:

  • "ABC-ER-015-30"
  • "ABC-1234"

This works if there are no dashes in the Item Name:

(.*)-

But if there is a dash in the Item Name, it captures part of the Item Name.
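To make the failure concrete, here is a minimal Python sketch. The two input strings are hypothetical stand-ins (the original example strings are not shown above), and `extract_greedy` is just an illustrative helper name. Because `.*` is greedy, the group captures everything up to the last dash in the string:

```python
import re

# Hypothetical inputs: made up to fit the description
# (product code, dash, product name containing a space).
plain_name = "ABC-1234-Sample Item"
dashed_name = "ABC-1234-Two-Part Item"

GREEDY_RE = re.compile(r"(.*)-")

def extract_greedy(text):
    """Return what (.*)- captures: everything before the LAST dash."""
    match = GREEDY_RE.match(text)
    return match.group(1) if match else None

print(extract_greedy(plain_name))   # ABC-1234
print(extract_greedy(dashed_name))  # ABC-1234-Two (part of the name leaks in)
```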

I feel like there's something really simple that I'm missing.

Upvotes: 0

Views: 674

Answers (2)

The fourth bird

Reputation: 163362

You could match 1+ uppercase chars or digits, then repeat matching a dash followed by 1+ uppercase chars or digits.

As you know that all product names have a space, you could add a positive lookahead asserting a dash, 1+ non whitespace chars followed by a space.

^[A-Z0-9]+(?:-[A-Z0-9]+)+(?=-\S+ )
  • ^ Start of string
  • [A-Z0-9]+ Match 1+ times A-Z0-9
  • (?:-[A-Z0-9]+)+ Repeat 1+ times matching - and A-Z0-9
  • (?=-\S+ ) Positive lookahead, assert -, 1+ non-whitespace chars and a space
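As a quick check of the lookahead pattern, here is a small Python sketch. The sample strings are hypothetical (the question's original inputs are not shown), chosen so the expected codes are the two listed in the question, and `product_code` is an illustrative helper name:

```python
import re

# Pattern from the answer above: code segments, then a lookahead
# for "-<non-whitespace chars><space>" that is not part of the match.
CODE_RE = re.compile(r"^[A-Z0-9]+(?:-[A-Z0-9]+)+(?=-\S+ )")

def product_code(text):
    """Return the leading product code, or None if the pattern does not match."""
    match = CODE_RE.match(text)
    return match.group(0) if match else None

# Hypothetical inputs, not the asker's real data.
samples = [
    "ABC-ER-015-30-Sample Item",
    "ABC-1234-Two-Part Item",
]
for sample in samples:
    print(product_code(sample))
# ABC-ER-015-30
# ABC-1234
```

One caveat with this approach: the pattern tells code segments from name segments only by character class, so a name whose first dash-separated part is all uppercase letters or digits (for example "ABC-1234-BIG-WIDGET Item") would have that part absorbed into the code.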

Regex demo

Another option is to make use of a capturing group instead of a positive lookahead

^([A-Z0-9]+(?:-[A-Z0-9]+)+)-\S+ 
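A sketch of the capturing-group variant in Python, again with hypothetical sample strings and an illustrative helper name (`product_code_group`). Here the dash, the first word of the name, and the space are consumed by the match, so the code has to be read from group 1 rather than from the whole match:

```python
import re

# The trailing space in the pattern is significant.
CAPTURE_RE = re.compile(r"^([A-Z0-9]+(?:-[A-Z0-9]+)+)-\S+ ")

def product_code_group(text):
    """Return group 1 (the product code), or None if there is no match."""
    match = CAPTURE_RE.match(text)
    return match.group(1) if match else None

# Hypothetical input, not the asker's real data.
sample = "ABC-ER-015-30-Sample Item"
match = CAPTURE_RE.match(sample)
print(repr(match.group(0)))        # 'ABC-ER-015-30-Sample ' (whole match)
print(product_code_group(sample))  # ABC-ER-015-30
```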

Regex demo

Upvotes: 2

41686d6564

Reputation: 19641

You may use the following pattern:

^(?:[A-Z0-9]+-?)+?(?=-\S+[ ])

Demo.

Breakdown:

^               # Beginning of the string.
(?:             # Start of a non-capturing group.
    [A-Z0-9]+   # Any uppercase letter or a digit repeated one or more times.
    -?          # An optional hyphen character.
)+?             # End of the non-capturing group, repeated one or more times (lazy).
(?=             # Start of a positive Lookahead.
    -           # Matches a hyphen character literally.
    \S+         # Any non-whitespace character repeated one or more times.
    [ ]         # Matches a space character.
)               # End of the lookahead.
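The commented layout above can be compiled almost as-is with Python's `re.VERBOSE` flag, which ignores whitespace and `#` comments inside the pattern. The sketch below assumes Python and uses hypothetical sample strings; `LAZY_RE` and `product_code_lazy` are illustrative names:

```python
import re

LAZY_RE = re.compile(
    r"""
    ^               # Beginning of the string.
    (?:             # Start of a non-capturing group.
        [A-Z0-9]+   # Uppercase letters or digits, one or more times.
        -?          # An optional hyphen.
    )+?             # Repeat the group one or more times, as few as possible.
    (?=             # Start of a positive lookahead.
        -           # A literal hyphen.
        \S+         # One or more non-whitespace characters.
        [ ]         # A space, bracketed so verbose mode keeps it.
    )               # End of the lookahead.
    """,
    re.VERBOSE,
)

def product_code_lazy(text):
    """Return the leading product code, or None if the pattern does not match."""
    match = LAZY_RE.match(text)
    return match.group(0) if match else None

# Hypothetical inputs, not the asker's real data.
print(product_code_lazy("ABC-ER-015-30-Sample Item"))  # ABC-ER-015-30
print(product_code_lazy("ABC-1234-Two-Part Item"))     # ABC-1234
```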


Upvotes: 1
