Reputation: 4323
I need to extract a large amount of data (>1 GB) from a database to a CSV file. I'm using this script:
import csv

rs_cursor = rs_db.cursor()
rs_cursor.execute("""SELECT %(sql_fields)s
                     FROM table1""" % {"sql_fields": sql_fields})
sqlData = rs_cursor.fetchall()  # loads the entire result set into memory
rs_cursor.close()

c = csv.writer(open(filename, "wb"))
c.writerow(headers)
for row in sqlData:
    c.writerow(row)
The problem is that while writing the file the system runs out of memory. Is there another, more efficient way to create a large CSV file?
Upvotes: 1
Views: 2844
Reputation: 81614
psycopg2 (which the OP uses) has a fetchmany method which accepts a size argument. Use it to read a fixed number of rows from the database at a time; you can experiment with the value of n to balance run time against memory usage.

fetchmany docs: http://initd.org/psycopg/docs/cursor.html#cursor.fetchmany
rs_cursor = rs_db.cursor()
rs_cursor.execute("""SELECT %(sql_fields)s
                     FROM table1""" % {"sql_fields": sql_fields})

c = csv.writer(open(filename, "wb"))  # "wb" is Python 2 style; on Python 3 use open(filename, "w", newline="")
c.writerow(headers)

n = 100
sqlData = rs_cursor.fetchmany(n)
while sqlData:
    for row in sqlData:
        c.writerow(row)
    sqlData = rs_cursor.fetchmany(n)  # fetch the next batch of n rows
rs_cursor.close()
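One caveat: with psycopg2's default client-side cursor, execute() transfers the full result set to the client, so fetchmany mainly bounds how many rows are materialized as Python objects at once. If memory is still tight, a server-side (named) cursor keeps the result set on the server and streams it as you iterate. A minimal sketch, not the code above; the cursor name "csv_export" is arbitrary:

# A named cursor is created server-side; iterating it transfers
# itersize rows per network round trip instead of the whole result set.
rs_cursor = rs_db.cursor(name="csv_export")  # name makes it server-side
rs_cursor.itersize = 1000  # rows fetched per round trip while iterating
rs_cursor.execute("""SELECT %(sql_fields)s
                     FROM table1""" % {"sql_fields": sql_fields})

c = csv.writer(open(filename, "wb"))
c.writerow(headers)
for row in rs_cursor:  # streams rows rather than buffering them all client-side
    c.writerow(row)
rs_cursor.close()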
You can also wrap the fetchmany loop in a generator to simplify the code a little:
def get_n_rows_from_table(n):
    rs_cursor = rs_db.cursor()
    rs_cursor.execute("""SELECT %(sql_fields)s
                         FROM table1""" % {"sql_fields": sql_fields})
    sqlData = rs_cursor.fetchmany(n)
    while sqlData:
        for row in sqlData:  # yield one row at a time from each batch
            yield row
        sqlData = rs_cursor.fetchmany(n)
    rs_cursor.close()
c = csv.writer(open(filename, "wb"))
c.writerow(headers)
for row in get_n_rows_from_table(100):
    c.writerow(row)
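Alternatively, the generator can yield whole fetchmany batches and let the writer's writerows handle each one. A minimal sketch along the same lines; get_batches_from_table is just an illustrative name:

def get_batches_from_table(n):
    # Yields each fetchmany() result (a list of up to n rows) as one batch.
    rs_cursor = rs_db.cursor()
    rs_cursor.execute("""SELECT %(sql_fields)s
                         FROM table1""" % {"sql_fields": sql_fields})
    sqlData = rs_cursor.fetchmany(n)
    while sqlData:
        yield sqlData
        sqlData = rs_cursor.fetchmany(n)
    rs_cursor.close()

c = csv.writer(open(filename, "wb"))
c.writerow(headers)
for batch in get_batches_from_table(100):
    c.writerows(batch)  # writes all rows in the batch at once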
Upvotes: 3
Reputation: 131
Have you tried fetchone()?
rs_cursor = rs_db.cursor()
rs_cursor.execute("""SELECT %(sql_fields)s
                     FROM table1""" % {"sql_fields": sql_fields})

c = csv.writer(open(filename, "wb"))
c.writerow(headers)

row = rs_cursor.fetchone()
while row:
    c.writerow(row)
    row = rs_cursor.fetchone()
rs_cursor.close()
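This is essentially fetchmany with a batch size of 1. If you prefer, the while loop can be written with the two-argument form of iter(), which keeps calling fetchone() until it returns the sentinel:

# Equivalent loop: iter(callable, sentinel) calls fetchone() until it
# returns None, i.e. until the result set is exhausted.
for row in iter(rs_cursor.fetchone, None):
    c.writerow(row)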
Upvotes: 0