To compare two pointer locations in Python

Code

import sys
import os

fp = open("/home/masi/r3.raw", "rb")

try:
    events = []
    while aBuf[:4] != b'\xFA\xFA\xFA\xFA':
        aBuf = fp.read(4)
        events.append(aBuf)         
        if aBuf == os.SEEK_END:
            # pointer cannot be outside of file so minus 144
            fileEnding = aBuf[os.SEEK_END - 144 : os.SEEK_END]
except:
    print "File end at position : ", fp.tell()
    import traceback
    traceback.print_exc()

finally:
    fp.close()

where I know that the following is never true

        if aBuf == os.SEEK_END:
            # pointer cannot be outside of file so minus 144
            fileEnding = aBuf[os.SEEK_END - 144 : os.SEEK_END]

I expected this to compare the file pointer with the end of the file, but it does not seem to be correct.

Improved Code from skrrgwasme and martineau's contributions

import sys
import os
import struct
import binascii

file_name = "/home/masi/r.raw"
file_size = os.path.getsize(file_name)
print "File size is : ", file_size
read_size = 4
read_count = 0

aBuf = b'\x00\x00\x00\x00' # don't forget to create your variables before you try to read from them
fileEnding = ""
fp = open(file_name, "rb")

try:
    aBuf = fp.read(read_size)
    read_count += read_size
    event_starts = []
    event_ends = []
    event_starts.append(read_count)
    while aBuf and read_count < file_size:
        if aBuf[:read_size] == b'\xFA\xFA\xFA\xFA':
            event_ends.append(read_count)
            if read_count + 1 < file_size: event_starts.append(read_count + 1)

        aBuf = fp.read(read_size)
        read_count += read_size
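        # the while condition already stops the loop at the end of the file,
        # so no extra break is needed (a break outside the loop is a SyntaxError)
        print "RC ", read_count, ", remaining: ", 100.0 * (1.0 - float(read_count) / file_size), "%"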

except:
    print "File end at position : ", fp.tell()
    import traceback
    traceback.print_exc()

finally:
    # store to partial index of postgres database: event pointers
    fp.close()

How can you compare the locations of two pointers?

Upvotes: 1

Views: 671

Answers (1)

skrrgwasme

Reputation: 9633

If you take a look at the Python source code for the os module, you'll see that os.SEEK_END isn't automatically set to the size of your file. It's just a constant that is set equal to the integer 2. It is intended to be used as a parameter for the lseek() function.
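To illustrate how os.SEEK_END is meant to be used, here is a minimal sketch that passes it as the whence argument of seek() and then reads the tail of a file. The helper name read_tail is hypothetical, and the default of 144 bytes is simply the number used in the question.

```python
import os

def read_tail(path, n=144):
    """Return the last n bytes of the file (or the whole file if it is shorter)."""
    with open(path, "rb") as fp:
        fp.seek(0, os.SEEK_END)      # os.SEEK_END (== 2) means "relative to the end of the file"
        size = fp.tell()             # the position at the end equals the file size in bytes
        fp.seek(max(size - n, 0))    # absolute seek, clamped so we never go before byte 0
        return fp.read(n)
```

Calling read_tail("/home/masi/r3.raw") would return the final 144 bytes without reading the rest of the file.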

You need to get the file size in bytes first, then compare your file pointer to that. You can use os.path.getsize(path) to get your file size in bytes. Your comparison was never true because aBuf holds the four bytes you just read, not the file position, and a byte string never equals the integer 2. Even if you had compared the file position (fp.tell()) against os.SEEK_END, reading four bytes at a time moves the pointer from byte 0 to byte 4, passing over 2, which is the value of os.SEEK_END.
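A minimal sketch of that comparison, checking the position reported by fp.tell() against the size from os.path.getsize(). The function name find_marker_offset is an assumption for illustration; the marker bytes and the 4-byte read size come from the question's code.

```python
import os

MARKER = b'\xFA\xFA\xFA\xFA'  # event delimiter taken from the question

def find_marker_offset(path, marker=MARKER, read_size=4):
    """Return the offset just past the first aligned marker, or None if
    the end of the file is reached first."""
    file_size = os.path.getsize(path)      # total size in bytes
    with open(path, "rb") as fp:
        while fp.tell() < file_size:       # compare the file pointer with the file size
            chunk = fp.read(read_size)
            if chunk == marker:
                return fp.tell()
    return None
```

Because fp.tell() reports the real position, the loop also ends cleanly when the last read is shorter than read_size.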

Suggested code:

import sys
import os

file_name = "/home/masi/r3.raw"
file_size = os.path.getsize(file_name)
read_size = 4
read_count = 0
# you could call fp.tell() in the loop instead of keeping your own count
# of the file position, but incrementing a counter avoids a lot of
# extra fp.tell() calls in the loop

aBuf = b'\x00\x00\x00\x00' # don't forget to create your variables before you try to
                           # read from them
fp = open(file_name, "rb")

try:
    events = []
    while aBuf[:read_size] != b'\xFA\xFA\xFA\xFA':
        aBuf = fp.read(read_size)
        events.append(aBuf)
        read_count += read_size         
        if read_count >= file_size:
            # aBuf holds only the last 4 bytes read, so seek back and
            # re-read the final 144 bytes (clamped to the start of the file)
            fp.seek(max(file_size - 144, 0))
            fileEnding = fp.read(144)
            break
except:
    print "File end at position : ", fp.tell()
    import traceback
    traceback.print_exc()

finally:
    fp.close()

Notes:

  1. Instead of comparing for exactly the file size you expect, I suggest using a greater than or equal comparison (>=). Since you're reading four bytes at a time, if your file size is not a multiple of four, an equality comparison against a manually incremented count will never be true.

  2. After you get this code working, I'd suggest taking it over to Code Review Stack Exchange. As martineau has helpfully pointed out in the comments, there are a number of issues and potential pitfalls in your code that are worth correcting.
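Note 1 can be checked with plain arithmetic. The sketch below lists the values that a manually incremented counter takes when the file size is not a multiple of the read size; the function name counter_values is hypothetical.

```python
def counter_values(file_size, read_size=4):
    """Values taken by a counter that grows by read_size per read,
    stopping at the first value that reaches or passes file_size."""
    values = []
    count = 0
    while count < file_size:
        count += read_size
        values.append(count)
    return values

# A 6-byte file read 4 bytes at a time: the counter goes 4, then 8.
steps = counter_values(6)
```

Here steps never contains 6, so a test such as count == file_size would never fire, while count >= file_size becomes true on the second read.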

Upvotes: 2
