Reputation: 113
I have the following string that I need to match using regexp:
"The value is 0x0208 and the type is INTERNATION"
I want to get the digits 02 and 08 and store them in two different variables. I use the following regexp:
regexp "0x(\[0-9]+)\[^\\n]+INTERNATION" "The value is 0x0208 and the type is INTERNATION" whole first second
It does not capture the second value. How can I fix it?
Upvotes: 0
Views: 5319
Reputation: 386210
First, use curly braces for regular expressions. It makes them much easier to read because you don't have to use extra backslashes.
Second, use \d for digits to make the expression a little shorter, which also improves readability.
Searching for pairs of digits
In your description you say you want to search for two pairs of digits following 0x. Here's a simple way to do that:
{0x(\d\d)(\d\d)}
This says "0x, followed by two digits that we capture, followed by two more digits that we capture." Each parenthesized group is stored in its own variable, in order.
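As a minimal sketch of how that pattern plugs into the regexp command, reusing the string and the variable names (whole, first, second) from the question:

```tcl
# The input string from the question.
set line "The value is 0x0208 and the type is INTERNATION"

# Braces keep the backslashes in \d intact, so no extra escaping is needed.
# Each parenthesized group is stored in the next variable after "whole".
if {[regexp {0x(\d\d)(\d\d)} $line whole first second]} {
    puts "whole=$whole first=$first second=$second"
} else {
    puts "no match"
}
```

regexp returns 1 on a match and 0 otherwise, so wrapping it in an if avoids reading variables that were never set.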
Searching for hexadecimal characters
Typically, hex numbers are preceded by 0x, which makes me think you are actually trying to parse a hex number. If that's true, you need to search for more than just digits. To match a hex digit you need to use [0-9a-f]. Once a pattern gets slightly long (e.g. [0-9a-f] vs. \d), you don't want to keep repeating it, so another way to say "two of these" is to use {2} rather than repeating the pattern.
Putting that all together, to match two groups of two hex digits you could use something like this:
{0x([0-9a-f]{2})([0-9a-f]{2})}
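Here is a small sketch of that pattern in use. The input value 0x1a2f and the variable names (high, low, highValue, lowValue) are made up for illustration, and the scan step is an optional extra that assumes you want the numeric value of each pair:

```tcl
# Hypothetical input containing lowercase hex letters.
set line "The value is 0x1a2f and the type is INTERNATION"

# {2} means "exactly two of the preceding character class".
if {[regexp {0x([0-9a-f]{2})([0-9a-f]{2})} $line whole high low]} {
    # scan with %x parses a hexadecimal string into an integer.
    scan $high %x highValue
    scan $low %x lowValue
    puts "high=$high ($highValue) low=$low ($lowValue)"
}
```

With plain \d the same input would fail to match, because "a" and "f" are not decimal digits.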
Dealing with upper and lower case
Note that this pattern assumes the hex digits are lowercase. If your particular data might have uppercase letters, there are at least four ways to handle that, including the -nocase option to the regexp command.
Of those, an embedded option is likely the least obvious solution, so I'll present it here.
Tcl regular expressions can have a special sequence at the very start of the pattern that modifies how the regular expression works. In this case we want to tell it to ignore case. The way to do that is to add (?i) at the start of the pattern:
{(?i)0x([0-9a-f]{2})([0-9a-f]{2})}
For more information on embedded options, see the metasyntax section of the re_syntax man page.
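To make the case-insensitive variants concrete, here is a sketch that runs the embedded (?i) form and the -nocase form side by side. The uppercase input 0x0AFF and the variable names are assumptions chosen for illustration:

```tcl
# Hypothetical input with uppercase hex letters.
set line "The value is 0x0AFF and the type is INTERNATION"

# Embedded option: (?i) at the very start makes the whole pattern ignore case.
set viaEmbedded [regexp {(?i)0x([0-9a-f]{2})([0-9a-f]{2})} $line whole first second]

# Equivalent result using the -nocase switch of the regexp command.
set viaNocase [regexp -nocase {0x([0-9a-f]{2})([0-9a-f]{2})} $line whole2 first2 second2]

puts "embedded: matched=$viaEmbedded first=$first second=$second"
puts "-nocase:  matched=$viaNocase first=$first2 second=$second2"
```

The embedded form is handy when the pattern is stored in a variable or configuration value, since the case-insensitivity travels with the pattern rather than with the call site.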
Upvotes: 5