Rahul

Reputation: 147

How do I skip some chunks when reading a file in Python?

I have code like this:

chunk_size = 512 * 1024  # 512 KiB
big_file = open(file, 'rb')
while True:
    data = big_file.read(chunk_size)
    if not data:
        break

What if I want to read only every 5th or every 10th chunk and skip the rest? Something like this:

chunk_size = 512 * 1024  # 512 KiB
big_file = open(file, 'rb')
counter = 0
while True:
    counter += 1
    if counter % 5 != 0:
        big_file.next(chunk_size)  # Just skip it, don't read it... HOW DO I DO THIS LINE?
        continue  # I want to skip this chunk and read the next one in the next iteration.
    data = big_file.read(chunk_size)
    if not data:
        break

Speed is very important to me in this case. I will do it for millions of files. I am doing block hashing.

Upvotes: 1

Views: 351

Answers (1)

Lydia van Dyke

Reputation: 2516

You can use the file's .seek() method for that. I keep track of the current location in the file with pos. Data is only read with .read(chunk_size) on every 5th iteration.

Seeking beyond the end of the file is not a problem. .read() just returns an empty bytes object then, so we break if nothing was read.

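chunk_size = 512 * 1024  # 512 KiB
counter = 0
pos = 0

with open("filename", 'rb') as big_file:  # the file is closed automatically
    while True:
        counter += 1
        if counter % 5 == 0:
            big_file.seek(pos)  # jump to the start of every 5th chunk
            data = big_file.read(chunk_size)
            if not data:  # nothing read: we are past the end of the file
                break
            print(len(data))  # do your processing here, e.g. hash the chunk

        pos += chunk_size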
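
Since you mentioned block hashing, here is a minimal sketch of the same seek-based idea wrapped in a function. It computes the chunk offset directly instead of counting the skipped iterations. The function name hash_every_nth_chunk, its parameters, and the choice of SHA-256 are my assumptions for illustration, not something from your code.

```python
import hashlib


def hash_every_nth_chunk(path, chunk_size=512 * 1024, n=5):
    """Return hex digests of the n-th, 2n-th, ... chunks of a file.

    Hypothetical helper: the name, parameters, and SHA-256 are assumptions.
    """
    digests = []
    with open(path, "rb") as f:
        index = n - 1  # zero-based index of the first chunk to read
        while True:
            f.seek(index * chunk_size)  # jump straight to the chunk
            data = f.read(chunk_size)
            if not data:  # past the end of the file: read() returns b""
                break
            digests.append(hashlib.sha256(data).hexdigest())
            index += n  # skip the next n - 1 chunks
    return digests
```

Calling hash_every_nth_chunk("filename") reads the same chunks as the loop above. Whether seeking is noticeably faster than reading everything depends on the file sizes and the storage, so it is worth measuring on your own data.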

Upvotes: 2
