Reputation: 147
I have code like this:
chunk_size = 512 * 1024  # 512 KiB
big_file = open(file, 'rb')
while True:
    data = big_file.read(chunk_size)
    if not data:
        break
If I want to read only every 10th chunk or every 5th chunk, something like the following, how can I do it?
chunk_size = 512 * 1024  # 512 KiB
big_file = open(file, 'rb')
counter = 0
while True:
    counter += 1
    if counter % 5 != 0:
        big_file.next(chunk_size)  # Just skip it, don't read it... HOW TO DO THIS LINE?
        continue  # I want to skip the chunk and, in the next loop, read the next chunk.
    data = big_file.read(chunk_size)
    if not data:
        break
Speed is very important to me in this case. I will do it for millions of files. I am doing block hashing.
Upvotes: 1
Views: 351
Reputation: 2516
You can use the file's .seek() method for that. I keep track of the current position in the file with pos. Data is only read by .read(chunk_size) on every 5th iteration.
Seeking beyond the file's size is not a problem. data will just be empty then, so we break if nothing was read.
chunk_size = 512 * 1024  # 512 KiB
big_file = open("filename", 'rb')
counter = 0
pos = 0
while True:
    counter += 1
    if counter % 5 == 0:
        big_file.seek(pos)
        data = big_file.read(chunk_size)
        if not data:
            break
        print(data.decode("utf-8"))  # here do your processing
    pos += chunk_size
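Since the question mentions block hashing, here is a minimal sketch of the same seek-based idea that jumps straight to every n-th chunk instead of counting through the skipped ones. The function name hash_every_nth_chunk, its parameters, and the choice of SHA-256 are assumptions for illustration; the question does not say which hash is used.

```python
import hashlib


def hash_every_nth_chunk(path, chunk_size=512 * 1024, n=5):
    """Return SHA-256 hex digests of every n-th chunk of the file at `path`.

    Hypothetical helper: the name, signature, and hash algorithm are
    assumptions, not part of the original question or answer.
    """
    digests = []
    with open(path, "rb") as big_file:
        # Offset of the first chunk we want (the n-th chunk, 1-based).
        pos = (n - 1) * chunk_size
        while True:
            big_file.seek(pos)
            data = big_file.read(chunk_size)
            if not data:
                # Seeking past the end of the file yields an empty read.
                break
            digests.append(hashlib.sha256(data).hexdigest())
            # Jump over the n - 1 chunks that should not be read.
            pos += n * chunk_size
    return digests
```

Calling hash_every_nth_chunk("filename", n=10) would hash the 10th, 20th, ... chunks. The skipped chunks are never read into Python, although whether that saves real disk I/O depends on the operating system's read-ahead and the storage device.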
Upvotes: 2