Reputation: 121
I'm looking for a way to do string substitution with a regex on a binary file in Python 2.7.
s is a string I get from reading a binary file. It contains this sequence (hex):
' 00 00 03 00 00 01 4A 50 20 43 52 55 4E 43 48 20 32 20 45 51 00 F7 00 F0 '
Here is the pattern I use to find the string to substitute:
f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)
Here is my substitution:
f99 = re.sub( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', br'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0', s)
Now, while I get no error, my sub doesn't seem to change my string. Am I missing something?
>>> f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)
>>> print f01[0]
JP CRUNCH 2 EQ
>>> f99 = re.sub( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', br'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0', s)
>>> print f99
MThd
>>> print f99[0]
M
>>> print f01[0]
JP CRUNCH 2 EQ
>>> f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)
>>> print f01[0]
JP CRUNCH 2 EQ
I would like to have my initial string changed to \x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0 so I can store it to a file.
Upvotes: 2
Views: 3316
Reputation: 414395
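The r'' literal prefix makes all backslashes be interpreted literally, i.e., r'\x00' is not a single zero byte but 4 characters. The regex engine itself understands \xNN escapes inside a pattern, which is why your findall still matches, but in Python 2.7 the replacement string passed to re.sub is not translated that way, so the literal characters \, x, 0, 3 and so on end up in the result. Write the replacement as a normal (non-raw) literal, e.g. b'\x03\x00\x00\x01KKKK\xf7\x00\xf0', and inspect the result with repr() rather than print, since printing binary data is hard to read.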
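A minimal sketch of the difference, using the byte sequence from the question. Passing a function as the replacement is my own suggestion here, not something the question used: it makes re.sub insert the bytes verbatim instead of treating them as a template, which also stays safe if the replacement ever contains a backslash byte (0x5c).

```python
import re

# Sample data taken from the question's hex dump.
s = b'\x00\x00\x03\x00\x00\x01JP CRUNCH 2 EQ\x00\xf7\x00\xf0'

# A raw literal keeps the backslash, so this is 4 bytes: \ x 0 0
raw_len = len(br'\x00')
# A normal literal is decoded by Python into a single zero byte.
plain_len = len(b'\x00')

# Raw is fine for the pattern: the regex engine decodes \xNN itself.
pattern = br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0'
# The replacement is a normal literal, so it holds the real bytes.
replacement = b'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0'

# A callable replacement is inserted as-is, with no template processing.
result = re.sub(pattern, lambda m: replacement, s)
print(repr(result))
```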
To avoid a random byte being interpreted as a regex metacharacter, you could use the re.escape function.
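To illustrate why escaping matters, here is a small sketch with a made-up delimiter (not from the question) whose two bytes are both 0x2e, i.e. the regex metacharacter `.`:

```python
import re

# Hypothetical delimiter: two 0x2e bytes, which a regex reads as "any two bytes".
delim = b'\x2e\x2e'
data = b'AB\x2e\x2eCD'

# Unescaped, the pattern matches every pair of bytes in the data.
unescaped = re.sub(delim, b'-', data)

# Escaped, only the literal delimiter bytes are matched.
escaped = re.sub(re.escape(delim), b'-', data)

print(repr(unescaped))
print(repr(escaped))
```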
To avoid repeating the prefix and suffix in the replacement string, you could use regex lookbehind and lookahead assertions:
>>> s
'\x00\x00\x03\x00\x00\x01JP CRUNCH 2 EQ\x00\xf7\x00\xf0'
>>> pre = b'\x03\x00\x00\x01'
>>> suff = b'\xf7\x00\xf0'
>>> re.sub(br'(?<=%s).*?(?=%s)' % tuple(map(re.escape, [pre, suff])), b'\x4b'*4, s)
'\x00\x00\x03\x00\x00\x01KKKK\xf7\x00\xf0'
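Since the goal is to store the result in a file, the same substitution could be wrapped roughly as follows. `patch_file` and its parameters are hypothetical names chosen for illustration; the sketch assumes the file fits in memory and that reading and writing in binary mode ('rb'/'wb') is what you want.

```python
import re

def patch_file(src_path, dst_path, pre, suff, new_payload):
    """Hypothetical helper: replace the bytes between pre and suff.

    Reads src_path in binary mode, writes the patched data to dst_path,
    and returns the number of replacements made.
    """
    with open(src_path, 'rb') as f:
        data = f.read()

    # Escape the delimiters so no byte is taken as a regex metacharacter.
    pattern = b'(?<=' + re.escape(pre) + b').*?(?=' + re.escape(suff) + b')'

    # A callable replacement inserts new_payload verbatim;
    # re.DOTALL lets . match 0x0a bytes inside the payload as well.
    patched, count = re.subn(pattern, lambda m: new_payload, data,
                             flags=re.DOTALL)

    with open(dst_path, 'wb') as f:
        f.write(patched)
    return count
```

A call such as `patch_file('in.mid', 'out.mid', b'\x03\x00\x00\x01', b'\xf7\x00\xf0', b'KKKK')` would then write the modified copy; the file names here are placeholders.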
You might need the re.DOTALL regex flag to force . to also match a newline.
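A short sketch of the effect, assuming a payload that happens to contain a 0x0a byte (the sample payload below is invented for the demonstration):

```python
import re

pre = b'\x03\x00\x00\x01'
suff = b'\xf7\x00\xf0'
# Invented payload with a newline byte (0x0a) in the middle.
s = b'\x00\x00' + pre + b'JP\x0aEQ' + suff

pattern = b'(?<=' + re.escape(pre) + b').*?(?=' + re.escape(suff) + b')'

# Without the flag, . stops at 0x0a, so nothing matches and s is returned unchanged.
without_flag = re.sub(pattern, b'KKKK', s)

# With re.DOTALL, . matches any byte and the payload is replaced.
with_flag = re.sub(pattern, b'KKKK', s, flags=re.DOTALL)

print(repr(without_flag))
print(repr(with_flag))
```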
Upvotes: 2