Joseph Portello
Joseph Portello

Reputation: 97

SQLALCHEMY/PANDAS - SQLAlchemy reading column as CLOB for Pandas to_sql

I have written a dataset to a data-frame.

inv.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43839 entries, 0 to 43838
Data columns (total 16 columns):
MST_CO                 43839 non-null object
LOAD_DATE              43839 non-null object
WHSE_CODE              43839 non-null object
ITEM_NO                43839 non-null object
ITEM_ID                43839 non-null int64
LOCATION               43839 non-null object
LOT_NO                 43839 non-null object
LOT_STATUS             43833 non-null object
LOT_CREATED_DATE       43839 non-null datetime64[ns]
LOT_EXPIRATION_DATE    43839 non-null object
DATE_RECEIVED          43839 non-null datetime64[ns]
ONHAND_QTY1            43839 non-null float64
UOM1                   43839 non-null object
ONHAND_QTY2            43418 non-null float64
UOM2                   43408 non-null object
SOURCE                 43839 non-null object
dtypes: datetime64[ns](2), float64(2), int64(1), object(11)

When I attempt to write the dataframe to SQL, I get the errors below.

inv.to_sql('inventory', db2, 'MST', if_exists='append', index=False, chunksize=3000)
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine SELECT "SYSCAT"."TABLES"."TABNAME" 
FROM "SYSCAT"."TABLES" 
WHERE "SYSCAT"."TABLES"."TABSCHEMA" = ? AND "SYSCAT"."TABLES"."TABNAME" = ?
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine ('DWETL', 'INVENTORY')
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine 
CREATE TABLE inventory (
"MST_CO" CLOB, 
"LOAD_DATE" DATE, 
"WHSE_CODE" CLOB, 
"ITEM_NO" CLOB, 
"ITEM_ID" BIGINT, 
"LOCATION" CLOB, 
"LOT_NO" CLOB, 
"LOT_STATUS" CLOB, 
"LOT_CREATED_DATE" TIMESTAMP, 
"LOT_EXPIRATION_DATE" DATE, 
"DATE_RECEIVED" TIMESTAMP, 
"ONHAND_QTY1" FLOAT(53), 
"UOM1" CLOB, 
"ONHAND_QTY2" FLOAT(53), 
"UOM2" CLOB, 
"SOURCE" CLOB
)
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine ()
2015-03-30 09:33:10,755 INFO sqlalchemy.engine.base.Engine ROLLBACK
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 977, in to_sql
dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 538, in to_sql
chunksize=chunksize, dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 1176, in to_sql
table.create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 649, in create
self._execute_create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 634, in _execute_create
self.table.create()
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 707, in create
checkfirst=checkfirst)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1728, in _run_visitor
conn._run_visitor(visitorcallable, element, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1357, in _run_visitor
**kwargs).traverse_single(element)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 120, in traverse_single
return meth(obj, **kw)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 732, in visit_table
self.connection.execute(CreateTable(table))
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 841, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 69, in _execute_on_connection
return connection._execute_ddl(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 895, in _execute_ddl
compiled
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1070, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1271, in _handle_dbapi_exception
exc_info
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1063, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_sa/ibm_db.py", line 106, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1335, in execute
self._execute_helper(parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1247, in _execute_helper
raise self.messages[len(self.messages) - 1]
sqlalchemy.exc.ProgrammingError: (ProgrammingError) ibm_db_dbi::ProgrammingError: Statement Execute Failed: [IBM][CLI Driver][DB2/LINUXX8664] SQL1666N  The table definition statement failed because some functionality was specified in the table definition that is not supported with the table type.  Unsupported functionality: "CLOB".  SQLSTATE=42613 SQLCODE=-1666 '\nCREATE TABLE inventory (\n\t"MST_CO" CLOB, \n\t"LOAD_DATE" DATE, \n\t"WHSE_CODE" CLOB, \n\t"ITEM_NO" CLOB, \n\t"ITEM_ID" BIGINT, \n\t"LOCATION" CLOB, \n\t"LOT_NO" CLOB, \n\t"LOT_STATUS" CLOB, \n\t"LOT_CREATED_DATE" TIMESTAMP, \n\t"LOT_EXPIRATION_DATE" DATE, \n\t"DATE_RECEIVED" TIMESTAMP, \n\t"ONHAND_QTY1" FLOAT(53), \n\t"UOM1" CLOB, \n\t"ONHAND_QTY2" FLOAT(53), \n\t"UOM2" CLOB, \n\t"SOURCE" CLOB\n)\n\n' ()

How can I get SQLAlchemy and Pandas to play nicely? I just need to convert the CLOB to be read as STR. Thanks!

Upvotes: 5

Views: 8837

Answers (1)

joris
joris

Reputation: 139162

You can specify the SQL type to use for a certain column with the dtype keyword argument (see docs):

from sqlalchemy.types import String
inv.to_sql('inventory', db2, dtype={'col_name': String})

By default, pandas uses TEXT type for object/string columns, which sqlalchemy maps to CLOB or TEXT, but apparantly your database does not know this type. So with the above, you can manually specify a type that your database does know.

You seem to have multiple columns. So to make this mapping automatic for all object dtyped columns. You can select the columns with dtype object:

cols = df.dtypes[df.dtypes=='object'].index
type_mapping = {col : String for col in cols }
inv.to_sql('inventory', db2, dtype=type_mapping)

Upvotes: 5

Related Questions