Reputation: 4653
How do I get the 0 to be captured from the line K'0? Everything else that I want is being captured.
Here is my REGEX EXAMPLE, and this is my regex:
K'(?P<name1>81|61|64|44|86|678|41|49|33|685|1(?:33|45)?|\d{1,3})?\d+
K'0 <<<--- adding the ? here, |\d{1,3})?\d+, as I want to pick up lines where there is only
K'93 <<<--- 1, 2 or 3 digits (i.e. K'0, K'93, K'935)
K'935
K'8134567
K'81345678
K'6134516789
K'61345678
K'643456
K'646345678
K'1234567890
K'12345678901
K'1454567890 <<<--- want 145 returned and not 1
K'13345678901 <<<--- want 133 returned and not 1
K'3214567890123
K'32134567890123
K'3654567890123
K'8934567890123
K'6554567890123
I am interested in the digits after K'
I am looking to do this using regex but not sure if it can be done. What I want is:
if the number starts with 81 return 81
if the number starts with 61 return 61
...
if the number starts with something I am not interested in, return other (or its first 1-3 digits)
The above criteria work.
But what I also want is:
if the first digit is 1 then return 1, BUT
if the first digit is 1 and the 2nd and 3rd digits are 45, return 145 and don't return just 1
if the first digit is 1 and the 2nd and 3rd digits are 33, return 133 and don't return just 1
I presume I have to put something inside this part of the regex: |(1)\d+|
Questions:
Does regex sort the data first?
Is the order of the regex search important to how it is implemented? Ideally I do not want this.
Upvotes: 1
Views: 142
Reputation: 5395
If you accept digits like 001 in your named group, all you need to do is change the last \d+ to \d*; it is not necessary to add 0 as an additional alternative (DEMO). Otherwise, add 0 as an alternative.
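As a quick check of the \d+ to \d* change, here is a minimal sketch using Python's re module (an assumption; the question does not name the engine). The helper name prefix is made up for illustration; the patterns are copied from the question.

```python
import re

# Pattern from the question; only the trailing quantifier differs.
GROUP = r"K'(?P<name1>81|61|64|44|86|678|41|49|33|685|1(?:33|45)?|\d{1,3})?"
with_plus = re.compile(GROUP + r"\d+")  # original: at least one digit must follow the group
with_star = re.compile(GROUP + r"\d*")  # suggested: trailing digits are optional

def prefix(pattern, line):
    """Return the name1 capture, or None if the group did not take part in the match."""
    m = pattern.match(line)
    return m.group("name1") if m else None

for line in ["K'0", "K'93", "K'935", "K'8134567"]:
    print(line, prefix(with_plus, line), prefix(with_star, line))
```

With \d+ the trailing part has to take at least one digit away from the group on short inputs, so K'0 leaves name1 empty; with \d* the group keeps all of them.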
However you can also modify your regex to:
K'(?P<name1>0|8[16]|6[14]|4[149]|33|6(?:78|85)|1(?:33|45)?|\d{0,3})\d*
This will not change what it matches, but it should speed it up a little by factoring out common prefixes. For example, when you search for 685 with the plain alternation (678|685), the engine first matches 6, then fails on 7, so it backtracks to the beginning, matches 6 again, and then matches 8 and 5. With (6(?:78|85)) it matches 6 only once, fails on 7, and directly tries to match 8.
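To check that factoring out the common prefixes leaves the captures unchanged, here is a small comparison sketch in Python's re (an assumed engine). The K'678123 and K'685123 inputs are made up to exercise the 6(?:78|85) branch; the other lines come from the question. It says nothing about speed, only about results.

```python
import re

flat = re.compile(
    r"K'(?P<name1>0|81|61|64|44|86|678|41|49|33|685|1(?:33|45)?|\d{1,3})?\d*")
factored = re.compile(
    r"K'(?P<name1>0|8[16]|6[14]|4[149]|33|6(?:78|85)|1(?:33|45)?|\d{0,3})\d*")

samples = ["K'0", "K'93", "K'935", "K'8134567", "K'6134516789", "K'643456",
           "K'678123", "K'685123", "K'1234567890", "K'1454567890",
           "K'13345678901", "K'3214567890123"]

def capture(pattern, line):
    """Return the name1 capture for one input line."""
    m = pattern.match(line)
    return m.group("name1") if m else None

# Map each line to the pair (flat capture, factored capture).
results = {line: (capture(flat, line), capture(factored, line)) for line in samples}
for line, (a, b) in results.items():
    print(f"{line:<18} flat={a!r:<8} factored={b!r}")
```

On these inputs both patterns return the same prefix for every line.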
Also, if you really want to match strings without numbers (just K'), you can change \d{1,3})?\d* to \d{0,3})\d*, since both match the same text. The last alternative is \d{1,3} (one to three digits), but the whole group is followed by the ? metacharacter (zero or one time), so even if the group matches no digits, the regex still matches as long as the parts before and after it match. That means the same as 0 to 3 digits (\d{0,3}). With this, the regex will first try to match digits with the alternation, and if there are still some digits left, they will be matched by \d*.
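One detail worth checking when swapping (\d{1,3})? for (\d{0,3}): the overall match is the same, but in Python's re (an assumed engine) an optional group that did not participate reports None, while a group that matched nothing reports an empty string. The patterns below are cut down to the last alternative only, as a sketch.

```python
import re

# Simplified versions keeping only the final \d{...} alternative.
optional_group = re.compile(r"K'(?P<name1>\d{1,3})?\d*")
zero_to_three = re.compile(r"K'(?P<name1>\d{0,3})\d*")

for line in ["K'", "K'7", "K'12345"]:
    a = optional_group.match(line)
    b = zero_to_three.match(line)
    # Same overall match text; compare what each named group reports.
    print(repr(line), a.group(0) == b.group(0),
          repr(a.group("name1")), repr(b.group("name1")))
```

Code that tests the capture with `is None` would need adjusting after the change.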
Upvotes: 0
Reputation: 626950
That is kind of tricky. The problem is that both parts of the regex (the part in parentheses and the \d+) can match the same text. By making the first part (the named capture) optional, you give the second part, \d+, priority: it "eats" your first digit since it must match at least 1 digit (due to the + quantifier), while the optional first group does not have to capture any digits.
You can achieve the behavior you want with look-arounds set on the \d+:
K'(?P<name1>0|678|685|1(?:33|45)?|81|61|64|44|86|41|49|33|\d{1,3})?(?:(?<!0)\d+|(?<=0)\d*)
See demo
The (?:(?<!0)\d+|(?<=0)\d*) part means that if there is no 0 immediately before, we match 1 or more digits (at least 1). If there is a zero, we match 0 or more digits (can be none).
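A short sketch of the look-around variant, assuming Python's re (which supports these fixed-width lookbehinds). The helper name capture_lookaround is made up; the pattern is the one given above.

```python
import re

# Pattern copied from the answer above, split across two literals for readability.
lookaround = re.compile(
    r"K'(?P<name1>0|678|685|1(?:33|45)?|81|61|64|44|86|41|49|33|\d{1,3})?"
    r"(?:(?<!0)\d+|(?<=0)\d*)"
)

def capture_lookaround(line):
    """Return the name1 capture for one input line."""
    m = lookaround.match(line)
    return m.group("name1") if m else None

for line in ["K'0", "K'93", "K'8134567", "K'1454567890", "K'13345678901"]:
    print(line, capture_lookaround(line))
```

Note that the tail still requires a digit unless the prefix is 0, so a short input such as K'93 yields 9 here rather than 93.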
Upvotes: 0
Reputation: 31025
You can change your regex to:
K'(?P<name1>0|81|61|64|44|86|678|41|49|33|685|1(?:33|45)?|\d{1,3})?\d*
Notice the 0| added at the start of the group and the \d* (instead of \d+) at the end.
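On the question of whether order matters: in a backtracking engine such as Python's re (an assumption about the engine in use), alternatives are tried left to right and the first one that lets the whole pattern succeed wins; nothing is sorted. The two cut-down patterns below are hypothetical and only illustrate that point.

```python
import re

# Hypothetical reduced patterns: same alternatives, different order.
short_first = re.compile(r"K'(?P<name1>1|145|133)\d*")
long_first = re.compile(r"K'(?P<name1>145|133|1)\d*")

for line in ["K'1234567890", "K'1454567890", "K'13345678901"]:
    s = short_first.match(line).group("name1")
    l = long_first.match(line).group("name1")
    print(line, s, l)
```

This is why writing the branch as 1(?:33|45)? works: the optional suffix is tried before the engine settles for the bare 1.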
Upvotes: 1