Reputation: 539
I have a file, what.dmp, which is 116 bytes long. My Python code looks like this:
import binascii
import re
import sys
print(sys.version)
needle = re.compile(b".{112}")
with open("what.dmp", "rb") as haystack:
    chunk = haystack.read()
print("Read {0} bytes.".format(len(chunk)))
matches = needle.search(chunk)
if matches:
    print(matches.start())
    print(binascii.hexlify(matches.group(0)))
else:
    print("No matches found.")
Running this code is fine:
C:\test>C:\Python33\python.exe test.py
3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)]
Read 116 bytes.
0
b'0101060001010600087e88758f4e8e75534589751df7897583548775e4bcf001e6d0f001cae3f001ccf7f0010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000090d91300000000002c003100eb6fb024'
However, if I change the repeat count in the regex from 112 to 113:
needle = re.compile(b".{113}")
And no match is found:
C:\test>C:\Python33\python.exe test.py
3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:06:53) [MSC v.1600 64 bit (AMD64)]
Read 116 bytes.
No matches found.
So the question is: why does the regex not match the 113th byte? I haven't posted what.dmp because surely the contents are irrelevant?!
Many thanks!
Upvotes: 0
Views: 78
Reputation: 208545
There is a very good chance that byte 113 is a newline, \n (10 in decimal, 0a in hex). By default, . matches any byte except a newline, so .{113} cannot match across it. Try adding the re.DOTALL flag to your regex.
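A minimal sketch of the difference, using synthetic data rather than the real what.dmp (whose contents were not posted). The byte string below is a hypothetical stand-in: 116 bytes with a newline as the 113th byte.

```python
import re

# Hypothetical stand-in for what.dmp: 112 ordinary bytes, a newline (0x0a)
# as the 113th byte, then 3 more bytes, for 116 bytes in total.
data = b"\x01" * 112 + b"\n" + b"\x02" * 3

# Without flags, "." matches any byte except a newline.
default_needle = re.compile(b".{113}")
# With re.DOTALL, "." matches every byte, newline included.
dotall_needle = re.compile(b".{113}", re.DOTALL)

default_match = default_needle.search(data)
dotall_match = dotall_needle.search(data)

print(default_match)               # None
print(dotall_match.start())        # 0
print(len(dotall_match.group(0)))  # 113
```

The same flag can also be written inline as b"(?s).{113}".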
However, as noted in the comments, you probably don't need regular expressions for this.
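As a rough sketch of a regex-free approach, assuming the goal is simply to take a fixed number of bytes from the start of the data (the question does not say what the match is used for), plain slicing is enough, and bytes.find can confirm where the first newline sits. The data here is again a hypothetical stand-in for the file contents.

```python
# Hypothetical stand-in for what.dmp: a newline as the 113th of 116 bytes.
data = b"\x01" * 112 + b"\n" + b"\x02" * 3

# Slicing takes the first 113 bytes regardless of their values.
first_113 = data[:113]

# bytes.find returns the zero-based offset of the first newline, or -1 if none.
newline_offset = data.find(b"\n")

print(len(first_113))   # 113
print(newline_offset)   # 112, i.e. the 113th byte
```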
Upvotes: 2