Reputation:
I need to fetch data from a DB and write it to a corresponding column in a CSV file. The following code does it very slowly (iteratively, one row at a time):
import csv
import asyncpg

async def fetch_and_write():
    conn = await asyncpg.connect('...')
    with open('/Users/mac/Desktop/input.csv', 'r') as csvinput:
        with open('/Users/mac/Desktop/output.csv', 'w') as csvoutput:
            reader = csv.reader(csvinput)
            writer = csv.writer(csvoutput, lineterminator='\n')
            all_rows = []
            row = next(reader)
            row.append('new_column_name')
            all_rows.append(row)
            for row in reader:
                query = "SELECT .. FROM .. WHERE id = '%s';"
                query = query % row[14]
                try:
                    result = await conn.fetch(query)
                except BaseException:
                    print("Oops! That was not a valid number.")
                    continue
                row.append(result[0][0])
                all_rows.append(row)
            writer.writerows(all_rows)
How can I read the ids from the CSV in chunks and use an IN clause to improve performance?
Upvotes: 0
Views: 932
Reputation: 99
As per my suggestion in the comments, you can fetch n records in one query. Below is a modified version of the code you provided.
Not tested:
import csv
import asyncpg

async def fetch_and_write():
    n = 500  # number of ids to look up per query
    conn = await asyncpg.connect('...')

    async def flush(batch, writer):
        # Build one IN (...) query covering every buffered id.
        in_p = ', '.join("'%s'" % row[14] for row in batch)
        query = "SELECT .. FROM .. WHERE id IN (%s);" % in_p
        try:
            result = await conn.fetch(query)
        except BaseException:
            print("Oops! That was not a valid number.")
            return
        # Assumes the elided SELECT returns the id plus the wanted value,
        # so each buffered row can be matched to its own result.
        by_id = {r[0]: r[1] for r in result}
        for row in batch:
            row.append(by_id.get(row[14]))
        writer.writerows(batch)

    with open('/Users/mac/Desktop/input.csv', 'r') as csvinput:
        with open('/Users/mac/Desktop/output.csv', 'w') as csvoutput:
            reader = csv.reader(csvinput)
            writer = csv.writer(csvoutput, lineterminator='\n')
            header = next(reader)
            header.append('new_column_name')
            writer.writerow(header)
            batch = []  # rows waiting for their ids to be resolved
            for row in reader:
                batch.append(row)
                if len(batch) >= n:
                    await flush(batch, writer)
                    batch = []
            if batch:  # flush the final partial batch
                await flush(batch, writer)
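As a side note, asyncpg can bind the whole id list server-side, which avoids building the IN string by hand (and the injection risk of % formatting). A minimal sketch, assuming id is a text column; fetch_batch, my_table and value are illustrative stand-ins for your real helper, table and column names:

async def fetch_batch(conn, ids):
    # $1 is bound server-side; ids is passed as a plain Python list.
    rows = await conn.fetch(
        "SELECT id, value FROM my_table WHERE id = ANY($1::text[])",
        ids,
    )
    # Map id -> value so callers can look up each CSV row's result.
    return {r['id']: r['value'] for r in rows}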
Upvotes: 0
Reputation: 1155
You can use Postgres' COPY command to do the trick, e.g. your query should be:

COPY (SELECT * FROM foo) TO '/tmp/test.csv' WITH CSV DELIMITER ',';
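Since the question already uses asyncpg, the same idea is reachable from Python via Connection.copy_from_query, which streams a query result straight into a local file. A sketch reusing the foo table from the example above:

import asyncio
import asyncpg

async def export_csv():
    conn = await asyncpg.connect('...')
    # Streams the query result into the file as CSV,
    # without materializing rows in Python.
    await conn.copy_from_query(
        'SELECT * FROM foo',
        output='/tmp/test.csv',
        format='csv',
        delimiter=',',
    )
    await conn.close()

asyncio.get_event_loop().run_until_complete(export_csv())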
Upvotes: 1