Reputation: 654
I recently migrated from SQLite to MySQL and started running into this problem. Whenever I use pandas' to_sql via SQLAlchemy with more than 100 000 rows, SQLAlchemy crashes with the message below. I never had this problem with SQLite (which I used without SQLAlchemy), and I have no problem copying these tables with PyCharm's SQL tool. But whenever I combine pd.to_sql, SQLAlchemy and MySQL, I run into trouble.
Example code
import numpy as np
import pandas as pd
from sqlalchemy import create_engine

# 1,000,000 rows x 10 columns of random floats
table = pd.DataFrame(np.random.rand(1000000, 10))
# user, pw, host and db are placeholders for my MySQL connection details
connection = create_engine(f"mysql+mysqlconnector://{user}:{pw}@{host}/{db}")
table.to_sql('TEST', connection, if_exists='replace', index=False)
Error message
sqlalchemy.exc.OperationalError: (mysql.connector.errors.OperationalError) 2055: Lost connection to MySQL server at '<host>', system error: 32 Broken pipe[<mySQLQuery>]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
This happens regardless of whether I use my local MySQL database or the one I have in the cloud.
Upvotes: 0
Views: 739
Reputation: 2292
The most common reason for this error is that the server timed out and closed the connection, which can happen when a single insert is very large. Try passing a chunksize so the rows are written in smaller batches:
table.to_sql('TEST', connection, chunksize=1000, if_exists='replace', index=False)
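For reference, a minimal end-to-end sketch of the chunked insert, assuming placeholder credentials (user, pw, host and db are stand-ins, not real values):

import numpy as np
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection details; replace with your own MySQL credentials.
user, pw, host, db = "user", "password", "localhost", "test_db"
engine = create_engine(f"mysql+mysqlconnector://{user}:{pw}@{host}/{db}")

table = pd.DataFrame(np.random.rand(1000000, 10))

# chunksize=1000 splits the write into INSERTs of 1000 rows each, so no single
# statement exceeds the server's packet size or timeout limits.
table.to_sql('TEST', engine, chunksize=1000, if_exists='replace', index=False)

If smaller chunks alone don't help, it may also be worth checking the server's max_allowed_packet and timeout settings, since the same "Lost connection" error can be raised when a single packet is larger than the server accepts.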
Upvotes: 1