MattiH

Reputation: 654

Why does Sqlalchemy fail to insert big amounts of data to MySQL

I recently migrated from SQLite to MySQL and started running into this type of problem. Whenever I try using pandas' to_sql via SQLAlchemy with more than 100,000 rows, SQLAlchemy crashes with the following message. I never had such a problem with SQLite (which I used without SQLAlchemy), nor do I have any problems copying these tables in PyCharm's SQL tool. But whenever I use the combination of pd.to_sql, SQLAlchemy and MySQL, I get into trouble.

Example code

import numpy as np
import pandas as pd

from sqlalchemy import create_engine

# connection details (placeholders)
user, pw, host, db = 'user', 'password', 'localhost', 'mydb'

# 1,000,000 rows x 10 columns of random floats
table = pd.DataFrame(np.random.rand(1000000, 10))
connection = create_engine(f"mysql+mysqlconnector://{user}:{pw}@{host}/{db}")
table.to_sql('TEST', connection, if_exists='replace', index=False)

Error message

sqlalchemy.exc.OperationalError: (mysql.connector.errors.OperationalError) 2055: Lost connection to MySQL server at '<host>', system error: 32 Broken pipe[<mySQLQuery>]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

This happens regardless of whether I use my local MySQL database or the one I have in the cloud.

Upvotes: 0

Views: 739

Answers (1)

PApostol

Reputation: 2292

The most common reason for this error is that the server timed out and closed the connection. This can happen with large datasets. Try passing a chunksize so the data is written in smaller batches:

table.to_sql('TEST', connection, chunksize=1000, if_exists='replace', index=False)
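For reference, a minimal end-to-end sketch of the chunked insert under the same setup as the question (connection details are placeholders, and the chunksize of 1000 is only a starting point that may need tuning for your server):

import numpy as np
import pandas as pd
from sqlalchemy import create_engine

# placeholder connection details
user, pw, host, db = 'user', 'password', 'localhost', 'mydb'
engine = create_engine(f"mysql+mysqlconnector://{user}:{pw}@{host}/{db}")

# same shape as in the question: 1,000,000 rows x 10 columns
table = pd.DataFrame(np.random.rand(1000000, 10))

# write in batches of 1000 rows so each INSERT statement stays small
# instead of sending all rows to the server in one go
table.to_sql('TEST', engine, chunksize=1000, if_exists='replace', index=False)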

Upvotes: 1
