yeroc-sebrof

Reputation: 13

Trying to find characters that do not resemble Hex in the format '\x0a'

I am parsing a string that contains file magic numbers, but the formatting is inconsistent. Some of the patterns are in hex with the format '\x0a' (the string holds an escaped backslash, so I apparently need to search for both \'s), others are plain ASCII characters, and the rest are somewhere in between.

I was hoping to write a regular expression that finds the characters in a string that are not already hex. I attempted the following search for hex values, wrapped in a negative lookahead to invert it.

(?!\\\\x[0-9 a-f]{2})

This did not work as intended: instead of skipping the whole escape, it returns an empty match at the x right after the first backslash.

>>> test = "\\x50K\\x03\\x04"
>>> re.search("(?!\\\\x[0-9 a-f]{2})", test)
<re.Match object; span=(1, 1), match=''>

Short of collecting the positive matches and inverting them myself, I am not sure how to proceed.

Thanks!

Upvotes: 0

Views: 55

Answers (1)

nonForgivingJesus

Reputation: 593

You can replace the hex escapes with nothing, like this: re.sub(r'\\x[0-9a-f]{2}', '', your_line), and use what remains, which is the non-hex characters.
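A minimal sketch of that idea, run against the test string from the question. The function names (non_hex_chars, non_hex_positions) are hypothetical, and accepting uppercase hex digits (A-F) is an assumption that the question does not state. The second function is an optional variation for when the position of each non-hex character is needed: the alternation consumes a whole \xNN escape first and only falls back to capturing a single character when no escape starts at that point.

```python
import re

# A literal backslash, "x", then exactly two hex digits (e.g. \x0a).
# Accepting A-F as well as a-f is an assumption, not something the question specifies.
HEX_ESCAPE = re.compile(r'\\x[0-9a-fA-F]{2}')

# Either a whole hex escape (not captured) or any single other character (group 1).
TOKEN = re.compile(r'\\x[0-9a-fA-F]{2}|(.)', re.DOTALL)


def non_hex_chars(line):
    """Return line with every \\xNN escape removed."""
    return HEX_ESCAPE.sub('', line)


def non_hex_positions(line):
    """Return (index, character) pairs for characters outside any \\xNN escape."""
    return [(m.start(1), m.group(1))
            for m in TOKEN.finditer(line)
            if m.group(1) is not None]


test = "\\x50K\\x03\\x04"
print(non_hex_chars(test))      # K
print(non_hex_positions(test))  # [(4, 'K')]
```

Because the escape is consumed as a unit, the x and the digits inside it are never offered as candidates, which is what the bare negative lookahead could not guarantee.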

Upvotes: 1
