kjgregory

Reputation: 686

Strange return from Python's f.read

I am trying to capture some data from a piece of hardware I'm developing through one of Cypress's FX2LP chips. I used Cypress's software to record a sample of my data stream to a file, which I am trying to read with Python. However, when I read it, I get output that I'm not sure how to interpret.

I am opening the file like this:

f = open("testdata_5Aug2014.dat","rb")

Then I read the data in chunks of various sizes, similar to this:

f.read(100)

Typically, the result of the above line (and what I want to see) is something like this:

'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x05\x12\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

But I sometimes get results with 't' and '?' characters mixed in, like this:

'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14t\x14K\x01?\x00\xff??\x00\xff??\x00\xff??\x00\xff?\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

This is a problem because when I use struct.unpack to parse the data, it does not return any of the bytes that have these special characters attached.

So my question is: What are those symbols? How did they get there? And how do I remove them or deal with them?

Upvotes: 1

Views: 591

Answers (1)

Pieter Witvoet

Reputation: 2833

You're reading binary data from a file, but f.read returns that data as a string. When that string is displayed, each byte is interpreted as a character. However, not every byte value maps to a displayable character, so some bytes are shown as escape sequences: \x followed by two hexadecimal digits. For example, the byte value 0 shows up as \x00 and 255 shows up as \xff.

Some values do map to characters, such as 63 mapping to '?' and 116 mapping to 't'. The ord and chr functions can be used to fetch the numerical value of a character, and the character mapping for a number, respectively, so ord('t') returns 116 and chr(63) returns '?'.
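As a quick check, here is a minimal sketch, written for Python 3 (an assumption; there binary data uses the b'...' literal). The sample bytes are copied from the question. It suggests that 't' is just another spelling of the byte 0x74, and that a hex dump avoids the mixed display altogether:

```python
# Two spellings of the same two bytes: 't' is simply how byte 0x74 is displayed.
chunk = b"t\x14"
same_chunk = b"\x74\x14"
print(chunk == same_chunk)

# ord and chr convert between a character and its numeric value.
print(ord("t"), chr(63))

# Showing every byte as two hex digits avoids the mix of characters and escapes.
hex_view = chunk.hex()
print(hex_view)
```

If the comparison prints True, the stored data is identical however it is displayed.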

Either way, no matter how it's displayed, your data should be fine, and struct.unpack should be able to work with it as usual.
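To illustrate, the sketch below unpacks a few of the 't\x14' pairs from the question. The format string is an assumption, since the question does not say how the samples are encoded: it treats each sample as an unsigned 16-bit little-endian integer. The code is written for Python 3.

```python
import struct

# Bytes taken from the question's second sample: three repeats of 't\x14'.
raw = b"t\x14t\x14t\x14"

# Assumption: each sample is an unsigned 16-bit little-endian integer ("<H").
values = struct.unpack("<3H", raw)
print(values)
```

Each pair decodes to 0x1474 (5236) under that assumption, the same result you would get if the bytes were written as '\x74\x14'.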

Upvotes: 1
