Reputation: 17382
I have a large CSV file (about a million records). I want to process it and write each record into a DB.
Since loading the complete file into RAM makes no sense, I need to read the file in chunks (or in some other better way).
So, I wrote this code.
import csv
with open('/home/praful/Desktop/a.csv') as csvfile:
    config_file = csv.reader(csvfile, delimiter=',', quotechar='|')
    print config_file
    for row in config_file:
        print row
I guess it loads everything into memory first and then processes it.
Upon looking at this thread and many others, I didn't see any difference between my code and the solutions given there. Kindly advise: is this the only method for efficient processing of CSV files?
Upvotes: 0
Views: 72
Reputation: 1121864
No, the csv module produces an iterator; rows are produced on demand. Unless you keep references to row elsewhere, the file will not be loaded into memory in its entirety.
Note that this is exactly what I am saying in the other answer you linked to; the problem there was that the OP was building a list (data) holding all the rows after reading, instead of processing the rows as they were being read.
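For reference, a minimal sketch of processing rows as they are read and writing them to a database in batches, so only a small number of rows is ever held in memory. The sqlite3 connection, the records table, and the two-column layout are assumptions for illustration only; substitute your own DB driver and schema.

import csv
import sqlite3

# Hypothetical target database and table (adjust to your schema).
conn = sqlite3.connect('records.db')
conn.execute('CREATE TABLE IF NOT EXISTS records (col_a TEXT, col_b TEXT)')

batch = []
batch_size = 1000  # rows buffered before each bulk insert

with open('/home/praful/Desktop/a.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in reader:              # rows are produced lazily, one at a time
        batch.append(row[:2])       # assumes at least two columns per row
        if len(batch) >= batch_size:
            conn.executemany('INSERT INTO records VALUES (?, ?)', batch)
            conn.commit()
            batch = []

if batch:                           # flush any remaining rows
    conn.executemany('INSERT INTO records VALUES (?, ?)', batch)
    conn.commit()

conn.close()

The batching is just to reduce the number of commits; inserting one row per iteration would work the same way memory-wise.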
Upvotes: 2