Reputation: 11188
I have a piece of data retrieved from the database that contains information I need. The text is entered in free form, so it is written in many different ways. The only thing I know for sure is that I'm looking for the first number after a given string, but any text can appear between that string and the number.
I tried the following (where mytoken is the string I know for sure is there), but none of them work:
/(mytoken|MYTOKEN)(.*)\d{1}/
/(mytoken|MYTOKEN)[a-zA-Z]+\d{1}/
/(mytoken|MYTOKEN)(.*)[0-9]/
/(mytoken|MYTOKEN)[a-zA-Z]+[0-9]/
Also, mytoken can be written in capitals, lowercase, or a mix of both. Can the expression be case insensitive?
Upvotes: 5
Views: 9355
Reputation: 43169
You can use the opposite:
/(mytoken|MYTOKEN)(\D+)(\d)/
This says: mytoken, followed by one or more characters that are not digits, followed by a digit. The (lazy) dot-star-soup is not always your best bet. The first digit of the desired number will be in $3 in this example.
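As a rough sketch, the pattern above can be tried out in Python. The thread does not say which language the regex runs in, so Python is an assumption here, and the helper name `first_digit_after` and the sample strings are made up for illustration. Python exposes `$3` as `match.group(3)`.

```python
import re

# Pattern from the answer: the token, one or more non-digits, then one digit.
PATTERN = re.compile(r"(mytoken|MYTOKEN)(\D+)(\d)")

def first_digit_after(text):
    """Return the first digit after the token, or None if nothing matches.

    Hypothetical helper, not part of the original answer.
    """
    match = PATTERN.search(text)
    return match.group(3) if match else None

# Hypothetical free-form inputs.
print(first_digit_after("order mytoken is roughly 42 units, maybe 7"))
print(first_digit_after("mytoken5"))
print(first_digit_after("MyToken: 3"))
```

Two limits show up in this sketch: `\D+` requires at least one character between the token and the digit, so `mytoken5` does not match, and the alternation only covers the all-lowercase and all-uppercase spellings, so `MyToken` does not match either.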
Upvotes: 2
Reputation: 626870
You do not need any lazy matching since you want to match any number of non-digit symbols up to the first digit. This is better done with \D*:
/(mytoken)(\D*)(\d+)/i
See the regex demo
The pattern details:
(mytoken) - Group 1 matching mytoken (case insensitively, as there is a /i modifier)
(\D*) - Group 2 matching zero or more characters other than a digit
(\d+) - Group 3 matching 1 or more digits.
Note that \D also matches newlines, whereas . needs a DOTALL modifier to match across newlines.
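The breakdown above can be checked with a short Python sketch. Python is an assumption, since the thread does not name a language; there the `/i` modifier corresponds to `re.IGNORECASE`. The sample strings and the name `TOKEN_NUMBER` are invented for illustration.

```python
import re

# Same pattern as above; re.IGNORECASE plays the role of the /i modifier.
TOKEN_NUMBER = re.compile(r"(mytoken)(\D*)(\d+)", re.IGNORECASE)

# Hypothetical inputs: mixed case, a newline before the number, no gap at all.
samples = [
    "MyToken: about 42 items, then 7",
    "MYTOKEN\nsee next line 15",
    "mytoken99",
]

for text in samples:
    match = TOKEN_NUMBER.search(text)
    print(repr(text), "->", match.group(3) if match else None)

# For comparison: a dot-based pattern without DOTALL cannot cross the newline.
dot_match = re.search(r"(mytoken)(.*?)(\d+)", samples[1], re.IGNORECASE)
print(dot_match)
```

Because `\D*` allows zero characters, the `mytoken99` case matches as well, and group 3 holds the whole first number rather than a single digit.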
Upvotes: 7
Reputation: 1688
You need to use a lazy quantifier. You can do that by putting a question mark after the star quantifier in the regex: .*?. Otherwise, the greedy .* will consume everything up to the last digit in the text, and that last digit is the one \d will match.
Regex: /(mytoken|MYTOKEN)(.*?)\d/
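The difference between the greedy and the lazy quantifier can be seen in a small Python sketch. Python is an assumption, as the thread does not name a language. The sample text is invented, and the capture group around `\d` is added here only so the matched digit can be read back; it is not part of the regex above.

```python
import re

# Hypothetical input with two numbers after the token.
text = "mytoken value 12 and later 34"

# Greedy: .* runs to the end of the line, then backs off to the LAST digit.
greedy = re.search(r"(mytoken|MYTOKEN)(.*)(\d)", text)

# Lazy: .*? grows one character at a time and stops at the FIRST digit.
lazy = re.search(r"(mytoken|MYTOKEN)(.*?)(\d)", text)

print("greedy:", greedy.group(3))
print("lazy:  ", lazy.group(3))
```

With this input the greedy version should report the final digit of 34, while the lazy version reports the first digit of 12.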
Upvotes: 2