Reputation: 357
I was trying to do some CSV processing with csv.reader and got stuck on an issue where I have to iterate over the lines read by the reader twice. On the second iteration it returns nothing, since all the lines have already been consumed. Is there any way to reset the iterator so that it starts from scratch again?
Code:
import csv

desc = open("example.csv", "r")
Reader1 = csv.reader(desc)
for lines in Reader1:
    pass  # (some code)
for lines in Reader1:
    pass  # (some code)
What I want to do, precisely, is read a CSV file in the format below:
id,price,name
x,y,z
a,b,c
and rearrange it into the format below, without using the pandas library:
id: x a
price: y b
name: z c
Upvotes: 4
Views: 5982
Reputation: 155526
Reset the underlying file object with seek, adding the following before the second loop:
desc.seek(0)
# Apparently, csv.reader will not refresh if the file is seeked to 0,
# so recreate it
Reader1 = csv.reader(desc)
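For reference, here is a minimal self-contained sketch of the rewind-and-recreate approach. It assumes an in-memory io.StringIO as a stand-in for example.csv (the sample rows are taken from the question), so the names and data are illustrative only:

```python
import csv
import io

# Hypothetical in-memory stand-in for example.csv, so the sketch runs anywhere.
desc = io.StringIO("id,price,name\nx,y,z\na,b,c\n")

reader = csv.reader(desc)
first_pass = [row for row in reader]    # first loop consumes the reader

desc.seek(0)                            # rewind the underlying file object
reader = csv.reader(desc)               # build a fresh reader over the rewound file
second_pass = [row for row in reader]   # second loop sees every row again

print(first_pass == second_pass)
```

The same pattern should apply to a real file opened with open("example.csv", "r", newline="").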
Mind you, if memory is not a concern, it would typically be faster to read the input into a list, then iterate the list twice. Alternatively, you could use itertools.tee to make two iterators from the initial iterator (it requires similar memory to slurping into a list if you iterate one iterator completely before starting the other, but allows you to begin iterating immediately, instead of waiting for the whole file to be read before you can process any of it). Either approach avoids the additional system calls that iterating the file twice would entail. The tee approach, after the line on which you create Reader1:
# It's not safe to reuse the argument to tee, so we replace it with one of
# the results of tee
Reader1, Reader2 = itertools.tee(Reader1)
for line in Reader1:
    ...
for line in Reader2:
    ...
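The read-into-a-list option mentioned above could look roughly like the sketch below, which also shows one possible way to get the column-wise output the question asks for. The io.StringIO sample data and the names rows, header, records, and lines are assumptions for illustration, not part of the original code:

```python
import csv
import io

# Hypothetical in-memory stand-in for example.csv.
desc = io.StringIO("id,price,name\nx,y,z\na,b,c\n")

# Read everything once; unlike the reader, a list can be iterated repeatedly.
rows = list(csv.reader(desc))

header, *records = rows

# zip(*records) transposes the data rows into columns, one per header field.
lines = [
    f"{name}: {' '.join(column)}"
    for name, column in zip(header, zip(*records))
]
print("\n".join(lines))
```

Since rows is a plain list, any number of further loops over it work without touching the file again, at the cost of holding the whole file in memory.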
Upvotes: 7