Reputation: 2277
I have some text containing different measurements that I'm trying to extract with a regex. The text can look something like this:
Ipsum Lorem 3. 100x210 cm
Ipsum Lorem Lorem, 100x210 cm
I have got as far as extracting the measurements, but when there is an integer in the middle of the text (as in the first example), my regex fails.
([0-9x]+)(?:\^(-?\d+))?
Gets me
Match 1 : 100x210
Match 2 : 3
Match 3 : 100X210
Any suggestions on how I can skip match 2 and match only INTxINT?
Thanks in advance
Upvotes: 0
Views: 135
Reputation: 163477
Using a character class, [0-9x]+ could also match only xxx or, in this case, only 3.
The optional group in your pattern could also match 100x210^-2. I'm not sure that is intended, as \^ matches a literal caret.
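To see these effects concretely, here is a minimal sketch in Python's `re` module (Python is an assumption; the question does not say which regex engine is used). The helper name `full_matches` is made up for illustration.

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"([0-9x]+)(?:\^(-?\d+))?")

def full_matches(text):
    # group(0) is the whole match, including the optional ^exponent part.
    return [m.group(0) for m in original.finditer(text)]

print(full_matches("Ipsum Lorem 3. 100x210 cm"))  # ['3', '100x210']
print(full_matches("xxx"))                        # ['xxx']
print(full_matches("100x210^-2"))                 # ['100x210^-2']
```

The lone 3 is matched because the character class accepts any run of digits or x characters, with no requirement that an x actually appears between two numbers.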
To match both the lowercase and uppercase variants of x, you could use a character class [xX] or make the regex case insensitive.
Using word boundaries \b on the left and right:
\b\d+[xX]\d+\b
Or a more specific pattern using a capturing group that also matches the cm part afterwards:
\b(\d+[xX]\d+) cm\b
See a regex demo
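As a rough sketch of how both patterns behave on the sample lines, again assuming Python's `re` module (the function names and the uppercase X in the second sample are illustrative, not from the question's code):

```python
import re

# Word boundaries on both sides, x in either case.
bounded = re.compile(r"\b\d+[xX]\d+\b")
# Same idea, but the measurement must be followed by " cm"; group 1 holds the size.
with_unit = re.compile(r"\b(\d+[xX]\d+) cm\b")
# Alternative to [xX]: a lowercase x plus the case-insensitive flag.
ignore_case = re.compile(r"\b\d+x\d+\b", re.IGNORECASE)

def find_sizes(text):
    return bounded.findall(text)

def find_sizes_cm(text):
    # With exactly one capturing group, findall returns the group's text.
    return with_unit.findall(text)

print(find_sizes("Ipsum Lorem 3. 100x210 cm"))         # ['100x210']
print(find_sizes_cm("Ipsum Lorem Lorem, 100X210 cm"))  # ['100X210']
print(ignore_case.findall("Ipsum Lorem 3. 100X210 cm"))  # ['100X210']
```

The standalone 3 is skipped in every case because the pattern requires an x directly after the first run of digits.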
Upvotes: 2
Reputation: 18631
You may use a regex like
\d+x\d+
See proof. It will match substrings consisting of one or more digits, an x character, and one or more digits.
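If the two numbers are needed separately, one possible extension is to wrap each \d+ in a capturing group. The sketch below assumes Python's `re` module; the groups and the `parse_sizes` helper are additions for illustration, not part of the pattern above.

```python
import re

# Same pattern as above, with each number captured in its own group.
size = re.compile(r"(\d+)x(\d+)")

def parse_sizes(text):
    # findall yields one (first, second) tuple of strings per match.
    return [(int(w), int(h)) for w, h in size.findall(text)]

print(parse_sizes("Ipsum Lorem 3. 100x210 cm"))  # [(100, 210)]
```

Note that this pattern is case sensitive, so it only matches a lowercase x.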
Upvotes: 1