Reputation: 686
I am trying to capture some data from a piece of hardware I'm developing, through one of Cypress's FX2LP chips. I used Cypress's software to record a sample of my data stream to a file, which I am trying to read with Python. However, when I read it, I get some output that I'm not sure how to interpret.
I am opening the file like this:
f = open("testdata_5Aug2014.dat","rb")
Then I read the data in various sized chunks, similar to this:
f.read(100)
Typically, the result of the above line (and what I want to see) is something like this:
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
But I sometimes get returns that include 't's and '?'s thrown in there like this:
'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14K\x01?\x00\xff??\x00\xff??\x00\xff??\x00\xff?\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
This is a problem because, when I use struct.unpack to parse the data, it doesn't seem to return any of the bytes that have these special characters attached.
So my question is: What are those symbols? How did they get there? And how do I remove them or deal with them?
Upvotes: 1
Views: 591
Reputation: 2833
You're reading binary data from a file, but f.read returns that data as a string. When you print that string, Python interprets those bytes as characters. However, not every byte value maps to a displayable character, so some bytes are shown as escape sequences: \x followed by two hexadecimal digits. For example, 0 shows up as \x00 and 255 shows up as \xff.
Some values do map to characters, such as 63 mapping to '?' and 116 mapping to 't'. The ord and chr functions can be used to fetch the numerical value of a character, and the character mapping for a number, respectively, so ord('t') returns 116 and chr(63) returns '?'.
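As a rough illustration of the mapping described above, the sketch below shows the same bytes as plain numbers and as hex, so no byte is displayed as a letter or punctuation mark. The chunk value is a hypothetical sample modeled on the output in the question, not data read from the actual file.

```python
import binascii

# Hypothetical sample, modeled on the output shown in the question.
chunk = b"\x00\x00t\x14K\x01?\x00\xff?"

# bytearray yields integers on both Python 2 and Python 3,
# so every byte is shown as a number regardless of how it prints.
values = list(bytearray(chunk))

# hexlify gives two hex digits per byte: 't' is 74, 'K' is 4b, '?' is 3f.
hex_view = binascii.hexlify(chunk).decode("ascii")

print(values)
print(hex_view)
```

Viewed this way, the 't' and '?' characters are just the bytes 0x74 and 0x3f, which sit next to the other data bytes like any other value.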
Either way, no matter how it's displayed, your data should be fine, and struct.unpack should be able to work with it as usual.
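A minimal sketch of that last point, assuming the stream consists of 16-bit little-endian unsigned words. That layout is only a guess for illustration; the question does not state the real format, and the chunk is again a hypothetical sample rather than the recorded file.

```python
import struct

# Hypothetical sample containing bytes that print as 't', 'K' and '?'.
chunk = b"\x05\x12t\x14K\x01?\x00\xff?"

# Assumption: 16-bit little-endian unsigned words ("<H"), two bytes each.
count = len(chunk) // 2
words = struct.unpack("<%dH" % count, chunk)

# Each pair of bytes becomes one integer, printable characters included.
print(words)
print([hex(w) for w in words])
```

If unpack appears to skip values, it may be worth checking that the format string matches the length and byte order of the chunk being parsed.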
Upvotes: 1