Reputation: 21
I am using pandas to read MySQL tables. However, after a read_sql statement I often end up with a read lock on the table. Below are the queries:
import MySQLdb
import pandas as pd

mysql_cn = MySQLdb.connect(host='localhost', port=3306, user='root', passwd='mysql', db='DB_P001')
dfVars = pd.read_sql('select * from markeff_5_varlist', con=mysql_cn, chunksize=10)
Once I run this, the dfVars dataframe is populated, but there is a read lock in MySQL. This lock persists until MySQL is restarted.
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread process no. 2567, id 140597860407040, state: sleeping
Number of rows inserted 0, updated 0, deleted 0, read 11494
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 3.31 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
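For context, the block above is the tail of the output of SHOW ENGINE INNODB STATUS. A minimal sketch of how to capture it from Python, assuming the same credentials; the second connection is just so the check does not interfere with the session holding the read view:

import MySQLdb

# Open a separate connection and pull the InnoDB monitor output while the
# original session still holds its read view.
check_cn = MySQLdb.connect(host='localhost', port=3306, user='root', passwd='mysql', db='DB_P001')
cur = check_cn.cursor()
cur.execute('SHOW ENGINE INNODB STATUS')
print(cur.fetchone()[2])  # columns are Type, Name, Status; Status holds the text shown above
cur.close()
check_cn.close()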
I would really appreciate help.
Thanks,
Upvotes: 2
Views: 1869
Reputation: 61
This is an issue with using chunksize. As @MaxU points out in his comment, the chunksize parameter makes read_sql return an iterator. This means you have submitted a query and are ready to receive the results in chunks from the database. Until you consume the entire iterator, the database remains locked.
If the database did not hold the lock, it could not guarantee that the results of your query stay accurate while simultaneously allowing writes/edits to the database.
To exhaust the iterator, call next(dfVars) until it is expended, or loop over it like:
for df_chunk in pd.read_sql('select * from markeff_5_varlist', con=mysql_cn, chunksize=10):
    print(df_chunk)  # <do something> with each chunk
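If you ultimately want the whole table in one DataFrame, a minimal sketch (reusing the query from the question) is to let pd.concat consume every chunk, which exhausts the iterator; closing the connection afterwards also frees its server-side resources:

import pandas as pd

# pd.concat accepts the chunk iterator directly; consuming every chunk
# finishes the pending result set, after which the read view can go away.
chunks = pd.read_sql('select * from markeff_5_varlist', con=mysql_cn, chunksize=10)
dfVars = pd.concat(chunks, ignore_index=True)

mysql_cn.close()  # release the connection's server-side resources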
Hope this helps.
Upvotes: 1