Reputation: 27595
I have a file consisting of three parts: an XML part, a separator byte with ASCII value 29 (group separator), and a numeric stream. I want to get one XML string from the first part, and the numeric stream (to be parsed with struct.unpack or array.fromfile).
Should I create an empty string and append to it while reading the file byte by byte until I find the separator, as shown here?
Or is there a way to read everything at once and use something like xmlstring = open('file.dat', 'rb').read().split(chr(29))[0] (which, by the way, doesn't work)?
EDIT: this is what I see using a hex editor: the separator is there (selected byte)
Upvotes: 0
Views: 4600
Reputation: 32610
Your attempt at searching for the value chr(29) didn't work because in that expression 29 is a value in decimal notation. The value you got from your hex editor, however, is displayed in hex, so it's 0x29 (or 41 in decimal).
You can simply do the conversion in Python: 0xnn is just another notation for entering an integer literal:
>>> 0x29
41
You can then use str.partition to split the data into its respective parts:
SEP = chr(0x29)
with open('file.dat', 'rb') as infile:
    data = infile.read()
xml, sep, binary_data = data.partition(SEP)
Demonstration:
import random

SEP = chr(0x29)

# Write a sample file: XML, separator, then 1024 random bytes
with open('file.dat', 'wb') as outfile:
    outfile.write("<doc></doc>")
    outfile.write(SEP)
    data = ''.join(chr(random.randint(0, 255)) for i in range(1024))
    outfile.write(data)

# Read it back and split at the first separator
with open('file.dat', 'rb') as infile:
    data = infile.read()
xml, sep, binary_data = data.partition(SEP)

print xml
print len(binary_data)
Output:
<doc></doc>
1024
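For readers on Python 3, the same partition approach might look roughly like the sketch below. It assumes the separator byte is 0x29, as read from the hex editor; the helper name split_at_separator and the temporary sample file are hypothetical and exist only for illustration. In Python 3 the file contents are bytes, so the separator has to be a bytes object rather than the result of chr().

```python
import os
import random
import tempfile

SEP = bytes([0x29])  # assumed separator byte, taken from the hex editor reading


def split_at_separator(data, sep=SEP):
    """Return (xml_bytes, binary_bytes), split at the first occurrence of sep."""
    xml, found, rest = data.partition(sep)
    if not found:
        raise ValueError("separator %r not found" % sep)
    return xml, rest


# Build a throwaway sample file: XML, separator, then 1024 random bytes
payload = bytes(random.randint(0, 255) for _ in range(1024))
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"<doc></doc>" + SEP + payload)
    path = tmp.name

# Read it back and split at the first separator
with open(path, "rb") as infile:
    xml, binary_data = split_at_separator(infile.read())
os.remove(path)

print(xml.decode("ascii"))
print(len(binary_data))
```

Because partition splits at the first occurrence only, separator bytes that happen to appear later in the binary part do not affect the result. The split is only reliable if the XML part itself never contains the separator byte.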
Upvotes: 1
Reputation: 2114
Make sure you are reading the file in before trying to split it. In your code, you don't have a .read():
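with open('file.dat', 'rb') as f:
    file = f.read()

if chr(29) in file:
    # chr(29) is the single byte '\x1d'
    xmlstring = file.split(chr(29))[0]
elif hex(29) in file:
    # hex(29) is the four-character text '0x1d', not the byte '\x1d'
    xmlstring = file.split(hex(29))[0]
else:
    xmlstring = '\x1d not found!'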
Ensure that an ASCII 29 character (\x1d) exists in your file.
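One way to check which byte is actually present is to look up the offset of each candidate value. The following is a minimal Python 3 sketch; the function name find_separator, the candidate list (decimal 29 and hex 0x29) and the sample bytes are assumptions made for illustration.

```python
def find_separator(data, candidates=(0x1D, 0x29)):
    """Return (byte_value, offset) for the first candidate found in data, else None.

    Candidates are tried in the order given, not by position in the data.
    """
    for value in candidates:
        offset = data.find(bytes([value]))
        if offset != -1:
            return value, offset
    return None


# Hypothetical sample: XML, then byte 0x1d, then a few binary bytes
sample = b"<doc></doc>\x1d\x00\x01\x02"
result = find_separator(sample)
if result is None:
    print("no candidate separator found")
else:
    print("byte 0x%02x found at offset %d" % result)
```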
Upvotes: 1
Reputation: 798744
mmap the file, search for the 29, create a buffer or memoryview from the first part to feed to the parser, and pass the rest through struct.
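A rough Python 3 sketch of that idea follows. It assumes the separator is the byte \x1d (ASCII 29) and, purely for illustration, that the numeric stream consists of little-endian unsigned 32-bit integers; the real layout of the stream is not stated in the question, and the sample file is made up.

```python
import mmap
import os
import struct
import tempfile

SEP = b"\x1d"  # assumed separator: ASCII 29 (group separator)

# Hypothetical sample: XML, separator, then three little-endian uint32 values
numbers = (1, 2, 3)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"<doc></doc>" + SEP + struct.pack("<3I", *numbers))
    path = tmp.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        pos = mm.find(SEP)  # offset of the first separator byte
        if pos == -1:
            raise ValueError("separator not found")
        # View the XML part without reading the whole file into a string
        with memoryview(mm) as view:
            xml = bytes(view[:pos])  # copy only the XML part for the parser
        # Unpack the numeric stream directly from the mapping
        count = (len(mm) - pos - 1) // 4
        values = struct.unpack_from("<%dI" % count, mm, pos + 1)
    finally:
        mm.close()
os.remove(path)

print(xml.decode("ascii"))
print(values)
```

The memoryview is released before the mapping is closed, since closing an mmap that still has exported views raises an error.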
Upvotes: 1