Reputation: 51
[LATEST UPDATE] Thanks to Jack's enormous help!!! I managed to connect to the Cloud SQL postgres DB and read/write my dataframes to the database. However, I am still experiencing the same error that I experienced previously, which is...
struct.error: 'h' format requires -32768 <= number <= 32767
This error doesnt happen when the dataframes are small, compact and columns do not have too many NaN values in them. However, when there are many NaN values in the columns, the program throws the following error.
Separately I have tried using df = df.fillna(0)
to fill the NaN values with 0. But it did not work as well, and the same error surfaced. Please help!
Traceback (most recent call last):
File "...\falcon_vbackup\STEP5_SavetoDB_and_SendEmail.py", line 81, in <module>
main_SavetoDB_and_SendEmail(
File "...\falcon_vbackup\STEP5_SavetoDB_and_SendEmail.py", line 37, in main_SavetoDB_and_SendEmail
Write_Dataframe_to_SQLTable(
File "...\falcon_vbackup\APPENDIX_Database_ReadWrite_v2.py", line 143, in Write_Dataframe_to_SQLTable
df_Output.to_sql(sql_tablename, con=conn, schema='public', index=False, if_exists=if_exists, method='multi', chunksize=1000)
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pandas\core\generic.py", line 2963, in to_sql
return sql.to_sql(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pandas\io\sql.py", line 697, in to_sql
return pandas_sql.to_sql(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pandas\io\sql.py", line 1739, in to_sql
total_inserted = sql_engine.insert_records(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pandas\io\sql.py", line 1322, in insert_records
return table.insert(chunksize=chunksize, method=method)
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pandas\io\sql.py", line 950, in insert
num_inserted = exec_insert(conn, keys, chunk_iter)
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pandas\io\sql.py", line 873, in _execute_insert_multi
result = conn.execute(stmt)
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\engine\base.py", line 1289, in execute
return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\sql\elements.py", line 325, in _execute_on_connection
return connection._execute_clauseelement(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\engine\base.py", line 1481, in _execute_clauseelement
ret = self._execute_context(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\engine\base.py", line 1845, in _execute_context
self._handle_dbapi_exception(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\engine\base.py", line 2030, in _handle_dbapi_exception
util.raise_(exc_info[1], with_traceback=exc_info[2])
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\util\compat.py", line 207, in raise_
raise exception
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\engine\base.py", line 1802, in _execute_context
self.dialect.do_execute(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\sqlalchemy\engine\default.py", line 732, in do_execute
cursor.execute(statement, parameters)
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pg8000\dbapi.py", line 455, in execute
self._context = self._c.execute_unnamed(
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pg8000\core.py", line 627, in execute_unnamed
self.send_PARSE(NULL_BYTE, statement, oids)
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\pg8000\core.py", line 601, in send_PARSE
val.extend(h_pack(len(oids)))
struct.error: 'h' format requires -32768 <= number <= 32767
Exception ignored in: <function Connector.__del__ at 0x00000213190D8700>
Traceback (most recent call last):
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\site-packages\google\cloud\sql\connector\connector.py", line 167, in __del__
File "c:\Users\ng_yj\.conda\envs\venv_falcon\lib\concurrent\futures\_base.py", line 447, in result
concurrent.futures._base.TimeoutError:
I have setup a postgresql in GCP's Cloud SQL. I am trying to connect to it using google.cloud.sql.connector. I have created a Service Account from the GCP Console, and downloaded the json keys. I want to use a service account , credentials/ keys (in the format of reading a .json file placed in the same directory as my main.py code) to authenticate access to cloud_sql. I am trying to authenticate, but I keep getting an error that says that the service account json file was not found.
Can anyone help to figure out how to fix this error? Thank you!
import pandas as pd
from google.cloud.sql.connector import connector
import os
import pandas as pd
import pandas as pd
import sqlalchemy
import os
# configure Cloud SQL Python Connector properties
def getconn():
conn = connector.connect(
os.environ['LL_DB_INSTANCE_CONNECTION_NAME'],
"pg8000",
user=os.environ['LL_DB_USER'],
password=os.environ['LL_DB_PASSWORD'],
db=os.environ['LL_DB_NAME'])
return conn
# Show existing SQLTables within database
def Show_SQLTables_in_Database(conn):
if conn!=None:
# Show what tables remain in database
results = conn.execute("""SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public'""").fetchall()
for table in results:
print(table)
if __name__=="__main__":
# Set the Google Application Credentials as environment variable
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.join(os.getcwd(),"Google-Credentials-LL-tech2.json")
# create connection pool to re-use connections
pool = sqlalchemy.create_engine("postgresql+pg8000://", creator=getconn)
with pool.connect() as db_conn:
# Show what tables remain in database
results = db_conn.execute("""SELECT table_name FROM information_schema.tables
WHERE table_schema = 'public'""").fetchall()
for table in results:
print(table)
Upvotes: 3
Views: 1100
Reputation: 1979
Responding to your latest update of the error.
First, make sure that your service account has the Cloud SQL Client role applied to it.
Secondly, try executing the following basic script prior to your custom configuration, this will help isolate the error to the Python Connector or the service account/implementation.
The following should just connect to your database and print the time.
from google.cloud.sql.connector import connector
import sqlalchemy
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.join(os.path.dirname(os.path.abspath(__file__)),"GSheet-Credentials-LL-tech2.json")
# build connection for db using Python Connector
def getconn():
conn = connector.connect(
os.environ['LL_DB_INSTANCE_CONNECTION_NAME'],
"pg8000",
user=os.environ['LL_DB_USER'],
password=os.environ['LL_DB_PASSWORD'],
db=os.environ['LL_DB_NAME'],
)
return conn
# create connection pool
pool = sqlalchemy.create_engine("postgresql+pg8000://", creator=getconn)
def db_connect():
with pool.connect() as conn:
current_time = conn.execute(
"SELECT NOW()").fetchone()
print(f"Time: {str(current_time[0])}")
db_connect()
If that still gives the error, please provide the full stacktrace of the error so that I can try and debug it further with more info.
Upvotes: 1
Reputation: 1979
The error you are seeing means that the .json
file is not being found. This is most likely being caused by os.getcwd()
which gets the path of the current working directory from where main.py
is being called. This leads to errors if you are calling the file from anywhere other than the parent directory.
Working case: python main.py
Error case: python folder/main.py
Change the line where you set credentials to the following:
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.join(os.path.dirname(os.path.abspath(__file__)),"Google-Credentials-LL-tech2.json")
This will allow the credentials path to be properly set for all cases of where your main.py
is called from.
Upvotes: 1