Praful Bagai

Reputation: 17382

Processing a large CSV file with Python's csv module

I have a large CSV file (about a million records). I want to process it and write each record into a DB.

Since loading the complete file into RAM makes no sense, I need to read the file in chunks (or some other, better way).

So I wrote this code:

import csv

with open('/home/praful/Desktop/a.csv') as csvfile:
    config_file = csv.reader(csvfile, delimiter=',', quotechar='|')
    print(config_file)   # prints the reader object itself, not the rows
    for row in config_file:
        print(row)       # each row is a list of the column values

I guess it loads everything into memory first and then processes it.

Upon looking at this thread and many others, I didn't see any difference between my code and the solutions given there. Kindly advise: is this the only method for efficient processing of CSV files?

Upvotes: 0

Views: 72

Answers (1)

Martijn Pieters

Reputation: 1121864

No, the csv module produces an iterator; rows are produced on demand. Unless you keep references to row elsewhere, the file will not be loaded into memory in its entirety.
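
For example (a minimal sketch of that laziness), pulling a single row with next() reads just one line from the file; the rest has not been touched yet:

import csv

with open('/home/praful/Desktop/a.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    first_row = next(reader)  # reads only the first line from disk
    # the remaining ~million rows have not been read at this point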

Note that that is exactly what I am saying in the other answer you linked to; the problem there is that the OP was building a list (data) holding all rows after reading instead of processing the rows as they were being read.
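
To illustrate processing rows as they are read, here is a minimal sketch that streams the rows straight into a database. The use of sqlite3, the records table name, and the two-column layout are assumptions for the example, not part of the question:

import csv
import sqlite3

conn = sqlite3.connect('records.db')  # hypothetical target database
conn.execute('CREATE TABLE IF NOT EXISTS records (col_a TEXT, col_b TEXT)')

with open('/home/praful/Desktop/a.csv') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='|')
    # executemany() consumes the reader one row at a time; at no point
    # is a list of all rows built in memory
    conn.executemany('INSERT INTO records VALUES (?, ?)', reader)

conn.commit()
conn.close()

(This assumes each CSV row has exactly two columns; adjust the table definition and the placeholders to match your actual data.)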

Upvotes: 2
