Waraba

Reputation: 121

Regex edit of a binary file/string using python

I'm looking for a way to do string substitution with a regex on a binary file in Python 2.7.

s is a string I get from reading a binary file. It contains this sequence (hex):

' 00 00 03 00 00 01 4A 50 20 43 52 55 4E 43 48 20 32 20 45 51 00 F7 00 F0 '

Here is the call I use to find the string to substitute:

f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)

Here is my substitution:

f99 = re.sub( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', br'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0', s) 

Now, while I get no error, my substitution doesn't seem to change my string. Am I missing something?

>>> f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)
>>> print f01[0]
JP CRUNCH 2 EQ
>>> f99 = re.sub( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', br'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0', s)
>>> print f99
MThd
>>> print f99[0]
M
>>> print f01[0]
JP CRUNCH 2 EQ
>>> f01 = re.findall( br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0', s)
>>> print f01[0]
JP CRUNCH 2 EQ

I would like to have my initial string changed to \x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0 so I can store it to a file.

Upvotes: 2

Views: 3316

Answers (1)

jfs

Reputation: 414395

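The r'' literal prefix makes backslashes literal, i.e., r'\x00' is not a single zero byte but four characters. In Python 2.7 the regex engine decodes \xNN escapes itself in the pattern, which is why findall still matches, but it does not decode them in the replacement string, so the four literal characters end up in the result.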
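
As a minimal sketch of the difference, the snippet below compares a raw and a non-raw literal and then repeats the substitution from the question with a non-raw replacement. The sample bytes are reconstructed from the hex dump in the question; the variable names are illustrative only.

```python
import re

raw = br'\x4B'    # raw literal: four bytes (backslash, 'x', '4', 'B')
cooked = b'\x4B'  # ordinary literal: the single byte 0x4B, i.e. b'K'

# Sample data reconstructed from the hex dump in the question.
s = b'\x00\x00\x03\x00\x00\x01JP CRUNCH 2 EQ\x00\xf7\x00\xf0'

# The pattern may stay raw: the regex engine decodes \xNN itself.
pattern = br'\x03\x00\x00\x01(.*?)\xF7\x00\xF0'

# The replacement is a non-raw literal, so it already holds the real bytes.
# It contains no backslash byte (0x5C), so re.sub inserts it unchanged.
replacement = b'\x03\x00\x00\x01\x4B\x4B\x4B\x4B\xF7\x00\xF0'

fixed = re.sub(pattern, replacement, s)
print(repr(fixed))
```

If the replacement could contain a backslash byte, passing a function as the replacement avoids any template processing.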

To keep an arbitrary byte from being interpreted as a regex metacharacter, you can use the re.escape function.
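
A small illustration of why escaping matters, using a made-up marker whose bytes happen to be regex metacharacters (the marker and data are hypothetical, not taken from the question):

```python
import re

# Hypothetical marker: the bytes '.', '*' and NUL. The first two are
# regex metacharacters when the marker is used as a pattern directly.
marker = b'\x2e\x2a\x00'
data = b'AB\x00 .*\x00 tail'

# Unescaped: '.*' acts as "any bytes, greedily", followed by a NUL byte.
unescaped = re.findall(marker, data)

# Escaped: every byte of the marker is matched literally.
escaped = re.findall(re.escape(marker), data)

print(repr(unescaped))
print(repr(escaped))
```

Without re.escape the pattern matches far more than the marker itself; with it, only the literal three-byte sequence is found.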

To avoid repeating the prefix and suffix in the replacement string, you can use regex lookbehind and lookahead assertions:

>>> s
'\x00\x00\x03\x00\x00\x01JP CRUNCH 2 EQ\x00\xf7\x00\xf0'
>>> pre = b'\x03\x00\x00\x01'
>>> suff = b'\xf7\x00\xf0'
>>> re.sub(br'(?<=%s).*?(?=%s)' % tuple(map(re.escape, [pre, suff])), b'\x4b'*4, s)
'\x00\x00\x03\x00\x00\x01KKKK\xf7\x00\xf0'
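
Since the question mentions storing the result in a file, here is one possible sketch of the full round trip. The helper name patch_file, the temporary file names, and the choice of a function as the replacement are assumptions made for illustration, not part of the original answer.

```python
import os
import re
import tempfile

def patch_file(src_path, dst_path, pre, suff, payload):
    """Replace the bytes between pre and suff, then write the result to dst_path."""
    with open(src_path, 'rb') as src:      # binary mode: no newline translation
        data = src.read()
    pattern = b'(?<=' + re.escape(pre) + b').*?(?=' + re.escape(suff) + b')'
    # A function replacement is inserted verbatim (no template escapes);
    # re.DOTALL lets '.' also match newline bytes inside binary data.
    patched = re.sub(pattern, lambda match: payload, data, flags=re.DOTALL)
    with open(dst_path, 'wb') as dst:
        dst.write(patched)
    return patched

# Demo on a temporary file holding the sample bytes from the question.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'in.bin')
dst_path = os.path.join(workdir, 'out.bin')
with open(src_path, 'wb') as f:
    f.write(b'\x00\x00\x03\x00\x00\x01JP CRUNCH 2 EQ\x00\xf7\x00\xf0')

patch_file(src_path, dst_path, b'\x03\x00\x00\x01', b'\xf7\x00\xf0', b'KKKK')
with open(dst_path, 'rb') as f:
    written = f.read()
print(repr(written))
```

Writing to a separate output path keeps the original file intact in case the pattern matches more than intended.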

You might need the re.DOTALL flag so that . also matches a newline byte (0x0A), which can occur anywhere in binary data.
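
To see the effect of the flag, the sketch below uses a variant of the sample data in which the payload contains a newline byte (this variant is invented for the demonstration):

```python
import re

pre = b'\x03\x00\x00\x01'
suff = b'\xf7\x00\xf0'

# Invented payload containing a newline byte (0x0A).
s = b'\x00\x00' + pre + b'JP\x0aCRUNCH' + suff

pattern = b'(?<=' + re.escape(pre) + b').*?(?=' + re.escape(suff) + b')'

# Without DOTALL, '.' cannot cross the newline, so nothing matches
# and the data comes back unchanged.
without_flag = re.sub(pattern, b'KKKK', s)

# With DOTALL, '.' matches every byte and the payload is replaced.
with_flag = re.sub(pattern, b'KKKK', s, flags=re.DOTALL)

print(repr(without_flag))
print(repr(with_flag))
```

Because a failed match raises no error, a missing re.DOTALL can look exactly like a substitution that silently did nothing.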

Upvotes: 2
