Steffen Mandrup

Reputation: 139

Isolating sizes written in various ways while keeping the first size in group 1

Wiktor helped me isolate the values when I had multiple sizes:

(?i)([a-z\d]+(?:[/-][a-z\d]+)?)[/-]([a-z\d]+(?:[/-][a-z\d]+)?)

However, I was unable to work out how it could also handle a single size.

Test cases:

S1 - XXS/XS-S/M
S1 - 40-42-36-38
XXS/XS
40-42
40

With the above examples I am able to match all the relevant sizes except the last one, "40".
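To make the gap concrete, here is a minimal sketch that runs the pattern above against the test cases. It assumes Python's `re` module, since the question does not name a regex engine, and the names `PAIR_PATTERN` and `split_sizes` are made up for illustration.

```python
import re

# Pattern from the question: two size groups separated by "/" or "-".
PAIR_PATTERN = re.compile(
    r"(?i)([a-z\d]+(?:[/-][a-z\d]+)?)[/-]([a-z\d]+(?:[/-][a-z\d]+)?)"
)

def split_sizes(text):
    """Return (group 1, group 2) of the first match, or None if nothing matches."""
    match = PAIR_PATTERN.search(text)
    return match.groups() if match else None

for case in ["S1 - XXS/XS-S/M", "S1 - 40-42-36-38", "XXS/XS", "40-42", "40"]:
    print(f"{case!r}: {split_sizes(case)}")
```

The lone "40" gives `None` because the pattern always requires a separator followed by a second group.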

How could I grab this value as well when it is a single value? It seems that won't be doable with the current pattern.

So I have been wondering whether it would be possible to grab the highlighted parts, so that I always get the first size in group 1:

S1 - XXS/XS-S/M

S1 - 40-42-36-38

XXS/XS

40-42

40

Desired values are highlighted with bold.

Pretty much, I just have to make sure that in the end I have two results which showcase each size; whatever is in front doesn't matter!

I really hope someone can put me on the right path here. The sizes are always separated by - or /, or by " - " or " / ".

Upvotes: 0

Views: 30

Answers (1)

The fourth bird

Reputation: 163477

If the match should begin at the start of the string, one option is to use an anchor to assert that position.

(?i)^(?:[a-z]+\d*(?:\s*-\s*[a-z\d]+[/-][a-z\d]+)?|\d+)\b
  • (?i) Inline modifier for case insensitive match
  • ^ Start of string
  • (?: Non capture group
    • [a-z]+\d* Match 1+ chars a-z and optional digits
    • (?: Non capture group
      • \s*-\s* Match - between optional whitespace chars
      • [a-z\d]+[/-][a-z\d]+ Match either - or / between 1+ chars a-z or digits
    • )? Close group and make it optional
    • | Or
    • \d+ Match 1+ digits
  • )\b Close non capture group and a word boundary to prevent a partial match

Regex demo
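As a quick check of the anchored pattern, the following sketch runs it over the test cases from the question. Python's `re` module is an assumption here (the question does not say which engine is in use), and `ANCHORED` and `first_size_anchored` are hypothetical names.

```python
import re

# Anchored pattern from the answer; the whole match is the wanted value.
ANCHORED = re.compile(
    r"(?i)^(?:[a-z]+\d*(?:\s*-\s*[a-z\d]+[/-][a-z\d]+)?|\d+)\b"
)

def first_size_anchored(text):
    """Return the leading size (with its optional prefix), or None."""
    match = ANCHORED.search(text)
    return match.group(0) if match else None

for case in ["S1 - XXS/XS-S/M", "S1 - 40-42-36-38", "XXS/XS", "40-42", "40"]:
    print(f"{case!r}: {first_size_anchored(case)!r}")
```

With these inputs the single value "40" is matched through the `\d+` alternative, and "40-42" also yields just "40".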

Another idea, if lookarounds are supported, is to use a word boundary on the left and isolate the specific parts that you want to match.

(?i)\b(?:[a-z]\d\s*[-/]\s*(?:[a-z\d]+[/-][a-z\d]+)|(?<!-)(?:[a-z]+(?=/)|\d+))\b

Regex demo
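The lookaround variant can be tried the same way. This is only a sketch, assuming Python's `re` module (which supports the fixed-width lookbehind used here); `LOOKAROUND` and `first_size_lookaround` are illustrative names, not part of the answer.

```python
import re

# Lookaround pattern from the answer: no start anchor, so the negative
# lookbehind (?<!-) is what stops later sizes such as "42" from matching.
LOOKAROUND = re.compile(
    r"(?i)\b(?:[a-z]\d\s*[-/]\s*(?:[a-z\d]+[/-][a-z\d]+)"
    r"|(?<!-)(?:[a-z]+(?=/)|\d+))\b"
)

def first_size_lookaround(text):
    """Return the first match in text, or None if there is none."""
    match = LOOKAROUND.search(text)
    return match.group(0) if match else None

for case in ["S1 - XXS/XS-S/M", "S1 - 40-42-36-38", "XXS/XS", "40-42", "40"]:
    print(f"{case!r}: {first_size_lookaround(case)!r}")
```

Because the pattern is not anchored, `search` is used and only the first hit is kept; whether later hits can occur in real data depends on inputs beyond the five examples.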

Or use a capture group for the part you want, and match optional non-whitespace chars after it to prevent a partial match for the given examples.

(?i)\b([a-z\d]+(?:\s*-\s*[a-z\d]+[/-][a-z\d]+)?)\S*

Regex demo
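Since the question asks for the value in group 1, the capture-group version maps most directly onto that. A minimal sketch, again assuming Python's `re` module, with `CAPTURED` and `first_size_group1` as hypothetical names:

```python
import re

# Capture-group pattern from the answer: group 1 holds the wanted value,
# the trailing \S* swallows the rest of the token.
CAPTURED = re.compile(
    r"(?i)\b([a-z\d]+(?:\s*-\s*[a-z\d]+[/-][a-z\d]+)?)\S*"
)

def first_size_group1(text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = CAPTURED.search(text)
    return match.group(1) if match else None

for case in ["S1 - XXS/XS-S/M", "S1 - 40-42-36-38", "XXS/XS", "40-42", "40"]:
    print(f"{case!r}: {first_size_group1(case)!r}")
```

Note that the full match (group 0) is longer than group 1 here, so callers should read `match.group(1)` rather than the whole match.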

Upvotes: 2
