Reputation: 7179
I am unable to assign an unsigned int
type when using .to_sql()
to write my dataframe to a MySQL database. I can use the other int
types, but just am unable to get unsigned
. A small representative sample of what I am trying looks like this:
import pandas as pd
from sqlalchemy import create_engine
import sqlalchemy.types as sql_types
db_engine = create_engine('mysql://db_user:db_pass@db_host:db_port/db_schema')
d = {'id': [100,101,102], 'items': [6,10,20000], 'problems': [50,72,2147483649]} # Representative sample dictionary
df = pd.DataFrame(d).set_index('id')
This gives:
>>> df
items problems
id
100 6 50
101 10 72
102 20000 2147483649
I write to the database as follows:
df.to_sql('my_table',
db_engine,
flavor='mysql',
if_exists='replace',
index_label=['id'],
dtype={'id': sql_types.SMALLINT,
'items': sql_types.INT,
'problems': sql_types.INT}
But what happens is the value of problems
in the last row (id==102
) gets truncated to 2147483647
(which is 2^31-1
) when written to the db.
There are no other issues in the connection or when writing other standard data types, including int
. I could get away by using the sql_types.BIGINT
option instead (making the maximum 2^63-1
), but that would really just be unnecessary as I know my values would fall below 4294967296
(2^32-1
), which is basically the unsigned int
maximum.
So question is: How can I assign an unsigned int
field using the .to_sql()
approach above?
I have used the sqlalchemy
types from here.
The MySQL types I see are here.
I have seen the question here which does get the unsigned int for MySQL, but it is not using the .to_sql()
approach I would like to use. If I can simply create the table from the single .to_sql()
statement, that would be ideal.
Upvotes: 4
Views: 2135
Reputation: 139162
To get an unsigned int, you can specify this in the sqlalchemy constructor of the INTEGER type for mysql (see the docs on the mysql types of sqlalchemy):
In [23]: from sqlalchemy.dialects import mysql
In [24]: mysql.INTEGER(unsigned=True)
Out[24]: INTEGER(unsigned=True)
So you can provide this to the dtype
argument in to_sql
instead of the more general sql_types.INT
:
dtype={'problems': mysql.INTEGER(unsigned=True), ...}
Note: you need at least pandas 0.16.0 to have this working.
Upvotes: 2