Reputation: 298
Let me set up a use case to discuss.
CREATE DATABASE sample;
USE sample;
CREATE TABLE quote (
`id` int(2) unsigned NOT NULL AUTO_INCREMENT,
`code` text ,
`date` date DEFAULT NULL,
`close` double DEFAULT NULL,
PRIMARY KEY (`id`)
) ;
INSERT INTO quote (`code`, `date`, `close`)
VALUES ('epm', '20200824', 2.64);
INSERT INTO quote (`code`, `date`, `close`)
VALUES ('dss', '20200824', 6.4);
Executing a single SQL command with SQLAlchemy is simple:
import pandas as pd
from sqlalchemy import create_engine
user = 'root'
mysql_pass = 'your mysql passwd'
mysql_ip = '127.0.0.1'
engine = create_engine("mysql+pymysql://{}:{}@{}:3306".format(user,mysql_pass,mysql_ip))
cmd_one_line_sql = 'select * from sample.quote;'
df = pd.read_sql_query(cmd_one_line_sql,con = engine)
df
id code date close
0 1 epm 2020-08-24 2.64
1 2 dss 2020-08-24 6.40
I get the desired result. Now suppose the command contains multiple SQL statements; for simplicity, only two:
cmd_multi_lines_sql = 'use sample;select * from quote;'
cmd_multi_lines_sql just splits cmd_one_line_sql into two statements.
I rewrote the code snippet according to the manual on executing many SQL commands with SQLAlchemy:
import pandas as pd
from sqlalchemy import create_engine
user = 'root'
mysql_pass = 'your mysql passwd'
mysql_ip = '127.0.0.1'
engine = create_engine("mysql+pymysql://{}:{}@{}:3306".format(user,mysql_pass,mysql_ip))
connection = engine.raw_connection()
cmd_multi_lines_sql = 'use sample;select * from quote;'
try:
    cursor = connection.cursor()
    cursor.execute(cmd_multi_lines_sql)
    results_one = cursor.fetchall()
finally:
    connection.close()
I get the following error:
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 170, in execute
result = self._query(query)
File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 328, in _query
conn.query(q)
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 517, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 732, in _read_query_result
result.read()
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1075, in read
first_packet = self.connection._read_packet()
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 684, in _read_packet
packet.check_error()
File "/usr/local/lib/python3.5/dist-packages/pymysql/protocol.py", line 220, in check_error
err.raise_mysql_exception(self._data)
File "/usr/local/lib/python3.5/dist-packages/pymysql/err.py", line 109, in raise_mysql_exception
raise errorclass(errno, errval)
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'select * from quote' at line 1")
Another try: write the commands into /tmp/test.sql
vim /tmp/test.sql
use sample;
select * from quote;
Then read the file and pass its contents to pandas:
with open('/tmp/test.sql', 'r') as f:
    cmd = f.read()
df = pd.read_sql_query(cmd, con = engine)
It outputs the same error. How can I fix it?
Upvotes: 3
Views: 10095
Reputation: 49395
After some research and asking on GitHub, the answer is straightforward: you need to pass the required parameters with connect_args=.
The parameter to pass through SQLAlchemy is {"client_flag": MULTI_STATEMENTS}.
So your Python code looks like this:
from sqlalchemy import create_engine
import pymysql
from pymysql.constants.CLIENT import MULTI_STATEMENTS
user = 'root'
mysql_pass = 'testpassword'
mysql_ip = 'localhost'
cmd = 'SELECT * FROM table1;SELECT * FROM test'
engine = create_engine("mysql+pymysql://{}:{}@{}:3306/testdb1?charset=utf8".format(user,mysql_pass,mysql_ip),connect_args={"client_flag": MULTI_STATEMENTS})
connection = engine.raw_connection()
try:
    cursor = connection.cursor()
    cursor.execute(cmd)
    results_one = cursor.fetchall()
    cursor.nextset()
    results_two = cursor.fetchall()
    cursor.close()
finally:
    connection.close()
But with this solution you need to know beforehand which queries you will run.
If you want to be more flexible with dynamic SQL statements, execute them one at a time:
from sqlalchemy import create_engine
user = 'root'
mysql_pass = 'testpassword'
mysql_ip = 'localhost'
cmd = 'SELECT * FROM table1;SELECT * FROM test'
engine = create_engine("mysql+pymysql://{}:{}@{}:3306/testdb1?charset=utf8".format(user,mysql_pass,mysql_ip))
connection = engine.raw_connection()
splitstring = cmd.split(";")
ges_resultset = []
try:
    cursor = connection.cursor()
    for cmdoneonly in splitstring:
        cursor.execute(cmdoneonly)
        results = cursor.fetchall()
        ges_resultset.append(results)
    cursor.close()
finally:
    connection.close()
This way you can check every single command and decide how Python should react to it.
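One way to react to each command individually is to wrap every execute call in its own try/except and record the outcome. The sketch below is only an illustration: run_statements is a hypothetical helper, not part of SQLAlchemy or PyMySQL, and it uses the standard-library sqlite3 module as a stand-in for the MySQL connection so that it runs without a server. With PyMySQL you would presumably pass the connection returned by engine.raw_connection() and pymysql.err.Error as the error type.

```python
import sqlite3


def run_statements(connection, statements, error_type=Exception):
    """Execute statements one by one; record the rows or the error for each."""
    outcomes = []
    cursor = connection.cursor()
    try:
        for statement in statements:
            if not statement.strip():
                continue  # skip blank fragments instead of sending an empty query
            try:
                cursor.execute(statement)
                outcomes.append((statement, cursor.fetchall(), None))
            except error_type as exc:
                # Keep going: later statements still get their chance to run.
                outcomes.append((statement, None, exc))
    finally:
        cursor.close()
    return outcomes


# Assumption: sqlite3 stands in for the MySQL connection purely for demonstration.
connection = sqlite3.connect(":memory:")
statements = [
    "CREATE TABLE quote (code TEXT, close REAL)",
    "INSERT INTO quote VALUES ('epm', 2.64)",
    "SELECT * FROM missing_table",  # fails; the remaining statement still runs
    "SELECT code, close FROM quote",
]
outcomes = run_statements(connection, statements, error_type=sqlite3.Error)
for statement, rows, error in outcomes:
    print(statement, "->", error if error is not None else rows)
connection.close()
```

Whether to continue after a failure, as this sketch does, or to stop and roll back is a design choice that depends on how the statements relate to each other.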
Upvotes: 6
Reputation: 123819
The issues you face are:
1. you need to pass the MULTI_STATEMENTS flag to PyMySQL, and
2. read_sql_query assumes that the first result set contains the data for the DataFrame, and that may not be true for an anonymous code block.
You can create your own PyMySQL connection and retrieve the data like this:
import pandas as pd
import pymysql
from pymysql.constants import CLIENT
conn_info = {
"host": "localhost",
"port": 3307,
"user": "root",
"password": "toot",
"database": "mydb",
"client_flag": CLIENT.MULTI_STATEMENTS,
}
cnxn = pymysql.connect(**conn_info)
crsr = cnxn.cursor()
sql = """\
CREATE TEMPORARY TABLE tmp (id int primary key, txt varchar(20))
ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO tmp (id, txt) VALUES (1, 'foo'), (2, 'ΟΠΑ!');
SELECT id, txt FROM tmp;
"""
crsr.execute(sql)
num_tries = 5
result = None
for i in range(num_tries):
    result = crsr.fetchall()
    if result:
        break
    crsr.nextset()
if not result:
    print(f"(no result found after {num_tries} attempts)")
else:
    df = pd.DataFrame(result, columns=[x[0] for x in crsr.description])
    print(df)
"""console output:
id txt
0 1 foo
1 2 ΟΠΑ!
"""
(Edit) Additional notes:
Note 1: As mentioned in another answer, you can use the connect_args argument to SQLAlchemy's create_engine method to pass the MULTI_STATEMENTS flag. If you need a SQLAlchemy Engine object for other things (e.g., for to_sql) then that might be preferable to creating your own PyMySQL connection directly.
Note 2: num_tries can be arbitrarily large; it is simply a way of avoiding an endless loop. If we need to skip the first n empty result sets then we need to call nextset that many times regardless, and once we've found the non-empty result set we break out of the loop.
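An alternative to a fixed num_tries is to let nextset itself end the loop: under the Python DB-API, nextset returns a false value (None) once no further result set exists. A minimal sketch of that idea follows. iter_result_sets and FakeCursor are hypothetical names; FakeCursor only imitates the description / fetchall / nextset parts of a cursor so that the example runs without a database, and the sketch has not been checked against a live server.

```python
def iter_result_sets(cursor):
    """Yield (column_names, rows) for every non-empty result set on the cursor."""
    while True:
        # description is None for statements that return no rows (e.g. INSERT).
        if cursor.description is not None:
            rows = cursor.fetchall()
            if rows:
                columns = [col[0] for col in cursor.description]
                yield columns, list(rows)
        if not cursor.nextset():  # None/False: no further result set
            break


class FakeCursor:
    """Stand-in imitating a cursor after a multi-statement execute()."""

    def __init__(self, result_sets):
        self._sets = result_sets  # each entry: (description or None, rows)
        self._index = 0

    @property
    def description(self):
        return self._sets[self._index][0]

    def fetchall(self):
        return self._sets[self._index][1]

    def nextset(self):
        if self._index + 1 >= len(self._sets):
            return None
        self._index += 1
        return True


cursor = FakeCursor([
    (None, ()),                                        # CREATE TEMPORARY TABLE ...
    (None, ()),                                        # INSERT ...
    ((("id",), ("txt",)), ((1, "foo"), (2, "bar"))),   # SELECT id, txt ...
    ((("txt",),), (("foo",), ("bar",))),               # SELECT txt ...
])
result_sets = list(iter_result_sets(cursor))
for columns, rows in result_sets:
    print(columns, rows)
```

Each (columns, rows) pair could then be handed to pd.DataFrame(rows, columns=columns), in the same way as the single result above.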
Upvotes: 4
Reputation: 298
@Gord Thompson, I made a small improvement to set num_tries automatically:
import pandas as pd
import pymysql
from pymysql.constants import CLIENT
conn_info = {
"host": "localhost",
"port": 3306,
"user": "root",
"password": "your mysql passwd",
"client_flag": CLIENT.MULTI_STATEMENTS,
}
cnxn = pymysql.connect(**conn_info)
crsr = cnxn.cursor()
sql = """\
create database sample;
USE sample;
CREATE TEMPORARY TABLE tmp (id int primary key, txt varchar(20))
ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO tmp (id, txt) VALUES (1, 'foo'), (2, 'ΟΠΑ!');
SELECT id, txt FROM tmp;
SELECT txt FROM tmp;
"""
crsr.execute(sql)
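# One attempt per statement; rstrip() so a trailing newline after the last ';' is not counted as an extra statement.
num_tries = sql.count(';') if sql.rstrip().endswith(';') else sql.count(';') + 1
for i in range(num_tries):
    result = crsr.fetchall()
    if result:
        df = pd.DataFrame(result, columns=[x[0] for x in crsr.description])
        print(df)
    crsr.nextset()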
@nbk: when cmd contains many SQL statements, executing your code may raise an error such as:
pymysql.err.InternalError: (1065, 'Query was empty')
I made a small improvement based on your code:
import pandas as pd
from sqlalchemy import create_engine
user = 'root'
mysql_pass = 'your mysql passwd'
mysql_ip = 'localhost'
sql = """\
create database sample;
USE sample;
CREATE TEMPORARY TABLE tmp (id int primary key, txt varchar(20))
ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO tmp (id, txt) VALUES (1, 'foo'), (2, 'ΟΠΑ!');
SELECT id, txt FROM tmp;
SELECT txt FROM tmp;
"""
engine = create_engine("mysql+pymysql://{}:{}@{}:3306".format(user,mysql_pass,mysql_ip))
connection = engine.raw_connection()
splitstring = sql.split(";")
try:
    cursor = connection.cursor()
    for cmdoneonly in splitstring:
        if cmdoneonly.strip():
            cursor.execute(cmdoneonly)
            results = cursor.fetchall()
            if results:
                df = pd.DataFrame(results, columns=[x[0] for x in cursor.description])
                print(df)
    cursor.close()
finally:
    connection.close()
It is necessary to add the check if cmdoneonly.strip(): to avoid the 1065: Query was empty error.
The statement df = pd.DataFrame(results, columns=[x[0] for x in cursor.description]) is a neat trick I learned from @Gord Thompson.
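One caveat with sql.split(";") is that it also cuts at semicolons inside string literals, for example VALUES ('a;b'). The sketch below shows a small quote-aware splitter as one possible workaround. split_sql_statements is a hypothetical helper, and it makes simplifying assumptions: it only understands single-quoted, double-quoted and backtick-quoted text with doubled-quote or backslash escapes, and it ignores SQL comments and DELIMITER directives.

```python
def split_sql_statements(sql):
    """Split on ';' outside quotes and drop empty fragments."""
    statements = []
    current = []
    quote = None  # the quote character we are currently inside, if any
    i = 0
    while i < len(sql):
        ch = sql[i]
        if quote:
            current.append(ch)
            if ch == "\\" and quote != "`" and i + 1 < len(sql):
                # Backslash escape inside a string: keep the next char verbatim.
                current.append(sql[i + 1])
                i += 1
            elif ch == quote:
                if i + 1 < len(sql) and sql[i + 1] == quote:
                    # A doubled quote is an escaped quote, not the end.
                    current.append(quote)
                    i += 1
                else:
                    quote = None
        elif ch in ("'", '"', "`"):
            quote = ch
            current.append(ch)
        elif ch == ";":
            statement = "".join(current).strip()
            if statement:  # skip empty fragments (avoids 'Query was empty')
                statements.append(statement)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


sql = """\
USE sample;
INSERT INTO tmp (id, txt) VALUES (1, 'foo;bar');
SELECT id, txt FROM tmp;
"""
parts = split_sql_statements(sql)
print(parts)
```

For anything beyond simple scripts, a dedicated SQL parser is likely a safer choice than a hand-written splitter like this one.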
Upvotes: 0