Reputation: 97
I have written a dataset to a data-frame.
inv.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 43839 entries, 0 to 43838
Data columns (total 16 columns):
MST_CO 43839 non-null object
LOAD_DATE 43839 non-null object
WHSE_CODE 43839 non-null object
ITEM_NO 43839 non-null object
ITEM_ID 43839 non-null int64
LOCATION 43839 non-null object
LOT_NO 43839 non-null object
LOT_STATUS 43833 non-null object
LOT_CREATED_DATE 43839 non-null datetime64[ns]
LOT_EXPIRATION_DATE 43839 non-null object
DATE_RECEIVED 43839 non-null datetime64[ns]
ONHAND_QTY1 43839 non-null float64
UOM1 43839 non-null object
ONHAND_QTY2 43418 non-null float64
UOM2 43408 non-null object
SOURCE 43839 non-null object
dtypes: datetime64[ns](2), float64(2), int64(1), object(11)
When I attempt to write the dataframe to SQL, I get the errors below.
inv.to_sql('inventory', db2, 'MST', if_exists='append', index=False, chunksize=3000)
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine SELECT "SYSCAT"."TABLES"."TABNAME"
FROM "SYSCAT"."TABLES"
WHERE "SYSCAT"."TABLES"."TABSCHEMA" = ? AND "SYSCAT"."TABLES"."TABNAME" = ?
2015-03-30 09:33:10,656 INFO sqlalchemy.engine.base.Engine ('DWETL', 'INVENTORY')
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine
CREATE TABLE inventory (
"MST_CO" CLOB,
"LOAD_DATE" DATE,
"WHSE_CODE" CLOB,
"ITEM_NO" CLOB,
"ITEM_ID" BIGINT,
"LOCATION" CLOB,
"LOT_NO" CLOB,
"LOT_STATUS" CLOB,
"LOT_CREATED_DATE" TIMESTAMP,
"LOT_EXPIRATION_DATE" DATE,
"DATE_RECEIVED" TIMESTAMP,
"ONHAND_QTY1" FLOAT(53),
"UOM1" CLOB,
"ONHAND_QTY2" FLOAT(53),
"UOM2" CLOB,
"SOURCE" CLOB
)
2015-03-30 09:33:10,731 INFO sqlalchemy.engine.base.Engine ()
2015-03-30 09:33:10,755 INFO sqlalchemy.engine.base.Engine ROLLBACK
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 977, in to_sql
dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 538, in to_sql
chunksize=chunksize, dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 1176, in to_sql
table.create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 649, in create
self._execute_create()
File "/usr/local/lib/python2.7/dist-packages/pandas/io/sql.py", line 634, in _execute_create
self.table.create()
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/schema.py", line 707, in create
checkfirst=checkfirst)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1728, in _run_visitor
conn._run_visitor(visitorcallable, element, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1357, in _run_visitor
**kwargs).traverse_single(element)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 120, in traverse_single
return meth(obj, **kw)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 732, in visit_table
self.connection.execute(CreateTable(table))
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 841, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/ddl.py", line 69, in _execute_on_connection
return connection._execute_ddl(self, multiparams, params)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 895, in _execute_ddl
compiled
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1070, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1271, in _handle_dbapi_exception
exc_info
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1063, in _execute_context
context)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_sa/ibm_db.py", line 106, in do_execute
cursor.execute(statement, parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1335, in execute
self._execute_helper(parameters)
File "/usr/local/lib/python2.7/dist-packages/ibm_db_dbi.py", line 1247, in _execute_helper
raise self.messages[len(self.messages) - 1]
sqlalchemy.exc.ProgrammingError: (ProgrammingError) ibm_db_dbi::ProgrammingError: Statement Execute Failed: [IBM][CLI Driver][DB2/LINUXX8664] SQL1666N The table definition statement failed because some functionality was specified in the table definition that is not supported with the table type. Unsupported functionality: "CLOB". SQLSTATE=42613 SQLCODE=-1666 '\nCREATE TABLE inventory (\n\t"MST_CO" CLOB, \n\t"LOAD_DATE" DATE, \n\t"WHSE_CODE" CLOB, \n\t"ITEM_NO" CLOB, \n\t"ITEM_ID" BIGINT, \n\t"LOCATION" CLOB, \n\t"LOT_NO" CLOB, \n\t"LOT_STATUS" CLOB, \n\t"LOT_CREATED_DATE" TIMESTAMP, \n\t"LOT_EXPIRATION_DATE" DATE, \n\t"DATE_RECEIVED" TIMESTAMP, \n\t"ONHAND_QTY1" FLOAT(53), \n\t"UOM1" CLOB, \n\t"ONHAND_QTY2" FLOAT(53), \n\t"UOM2" CLOB, \n\t"SOURCE" CLOB\n)\n\n' ()
How can I get SQLAlchemy and Pandas to play nicely? I just need to convert the CLOB to be read as STR. Thanks!
Upvotes: 5
Views: 8837
Reputation: 139162
You can specify the SQL type to use for a certain column with the dtype
keyword argument (see docs):
from sqlalchemy.types import String
inv.to_sql('inventory', db2, dtype={'col_name': String})
By default, pandas uses TEXT
type for object/string columns, which sqlalchemy maps to CLOB
or TEXT
, but apparantly your database does not know this type. So with the above, you can manually specify a type that your database does know.
You seem to have multiple columns. So to make this mapping automatic for all object dtyped columns. You can select the columns with dtype object:
cols = df.dtypes[df.dtypes=='object'].index
type_mapping = {col : String for col in cols }
inv.to_sql('inventory', db2, dtype=type_mapping)
Upvotes: 5