Reputation: 211
Could someone please point me in the right direction on how to solve the following problem? I am trying to come up with a solution using pandas.read_sql and asyncio. I want to migrate table records from one database to another database.
I want to do the following:
table 1
.
.
.
table n
I have the function:
def extract(table):
    try:
        # sql is the query for this table, built elsewhere
        df = pd.concat(
            [chunk for chunk in
             pd.read_sql(sql,
                         con=CONNECTION,
                         chunksize=10**5)]
        )
    except Exception as e:
        raise e
    else:
        return df
I want to run these in parallel, not one by one:
extract(table1)
extract(table2)
.
.
extract(tablen)
Upvotes: 2
Views: 7026
Reputation: 155366
asyncio is about organizing non-blocking code into callbacks and coroutines. Running CPU-intensive code in parallel is a use case for threads:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as executor:
    frames = list(executor.map(extract, all_tables))
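Here is a minimal self-contained sketch of the same pattern, with a stand-in `extract` that simulates a blocking database read via `time.sleep` (the function name and table list are just illustrative). It shows the two properties that matter: `executor.map` returns results in the same order as the input, and the blocking calls overlap instead of running back to back:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def extract(table):
    # Simulated I/O-bound work standing in for the pd.read_sql call;
    # a real extract would query the database here
    time.sleep(0.1)
    return f"frame-for-{table}"

all_tables = ["table1", "table2", "table3", "table4"]

start = time.perf_counter()
with ThreadPoolExecutor() as executor:
    # map preserves input order, so frames line up with all_tables
    frames = list(executor.map(extract, all_tables))
elapsed = time.perf_counter() - start

print(frames)
print(f"{elapsed:.2f}s")  # roughly 0.1 s, not 0.4 s: the four sleeps overlap
```

With the default pool size the four calls run concurrently, so total wall time is close to the slowest single call rather than the sum.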
Whether this will actually run faster than sequential code depends on whether pd.read_sql releases the GIL.
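If you do want to keep an asyncio-based interface (as the question suggests), the standard bridge is to hand the blocking call to a thread from the event loop. A minimal sketch, assuming Python 3.9+ for asyncio.to_thread and using a placeholder `extract`:

```python
import asyncio

def extract(table):
    # Placeholder for the blocking pd.read_sql-based extract
    return f"frame-for-{table}"

async def main():
    tables = ["table1", "table2"]
    # asyncio.to_thread runs each blocking call in the default thread
    # pool, so the event loop stays free while the queries run
    return await asyncio.gather(
        *(asyncio.to_thread(extract, t) for t in tables)
    )

frames = asyncio.run(main())
print(frames)
```

This gains nothing over ThreadPoolExecutor on its own; it is only useful when the rest of your program is already written with asyncio.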
Upvotes: 4