Reputation: 1
I'm trying to collect all the data from a SQL Server database hosted in Cloud SQL, using a Cloud Function.
I use pytds and pandas to read the data from SQL Server, and pandas_gbq.to_gbq
to insert it into Google BigQuery.
I have an issue with some tables that contain string columns:
the error is raised when I call df = pd.read_sql(query, pool.connect()).
Here is my code:
from google.cloud.sql.connector import Connector
import pandas as pd
import pytds
import sqlalchemy

# initialize Connector object
connector = Connector()

def getconn() -> pytds.Connection:
    conn = connector.connect(
        instance_connection_name,
        "pytds",
        user=db_user,
        password=db_pass,
        db=db_name,
        bytes_to_unicode=True,
    )
    return conn

pool = sqlalchemy.create_engine(
    "mssql+pytds://",
    creator=getconn,
    pool_size=20,
    max_overflow=0,
    pool_timeout=600,
    pool_recycle=1800,
    connect_args={"encoding": "latin1"},
    # ...
)

query = "select * from article"
df = pd.read_sql(query, pool.connect())  # fails for some tables (like article in this example)
I have already tried specifying the database encoding in the create_engine call, but that is not sufficient.
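To illustrate the kind of mismatch that can cause this (the byte values below are only illustrative): bytes stored under latin1 decode cleanly with that codec, but the same bytes can raise UnicodeDecodeError when decoded as utf-8, which is the sort of failure that surfaces on tables with accented string data.

```python
# Illustrative only: an accented string as it would be stored in latin1.
raw = b"r\xe9f\xe9rence"  # "référence" encoded as latin1 bytes

# Decoding with latin1 succeeds.
assert raw.decode("latin1") == "référence"

# Decoding the same bytes as utf-8 fails: 0xE9 is not a valid
# standalone UTF-8 sequence.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 decode failed:", exc)
```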
Thanks
Upvotes: 0
Views: 154
Reputation: 1979
I was able to get it working with the following package versions:
pip install cloud-sql-python-connector==1.4.2 SQLAlchemy==2.0.22 sqlalchemy-pytds==1.0.0 pandas==2.1.1
I made a few slight changes to the code because the query has to be wrapped in sqlalchemy.text():
from google.cloud.sql.connector import Connector
import pandas as pd
import sqlalchemy

connector = Connector()

# function to return the database connection object
def getconn():
    conn = connector.connect(
        INSTANCE_CONNECTION_NAME,
        "pytds",
        user=DB_USER,
        password=DB_PASS,
        db=DB_NAME,
        # worked with and without bytes_to_unicode
    )
    return conn

# create connection pool with 'creator' argument pointing to our connection function
pool = sqlalchemy.create_engine(
    "mssql+pytds://",
    creator=getconn,
    # worked with or without encoding
)

query = sqlalchemy.text("SELECT * FROM article")

# connect via the connection pool
with pool.connect() as db_conn:
    df = pd.read_sql(query, db_conn)
    print(df.head())
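For the BigQuery load step the question mentions, the resulting DataFrame can be written with pandas_gbq.to_gbq. A minimal sketch, with the caveat that the dataset, table, and project names are placeholders and the call needs GCP credentials, so it is wrapped in a function rather than executed here:

```python
import pandas as pd

def load_to_bigquery(df: pd.DataFrame, table: str, project: str) -> None:
    """Write df to BigQuery; needs pandas_gbq installed and GCP credentials.

    `table` is a "dataset.table" string; the names in the example
    call below are placeholders.
    """
    import pandas_gbq
    pandas_gbq.to_gbq(
        df,
        destination_table=table,
        project_id=project,
        if_exists="append",  # or "replace" to drop and recreate the table
    )

# Example call (placeholder identifiers):
# load_to_bigquery(df, "my_dataset.article", "my-gcp-project")
```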
Upvotes: 0