df.to_sql gives TypeError with dtype=sqlalchemy.timestamp(timezone=True)

Question

I'm trying to use DataFrame().to_sql to input a time aware dataframe series. Here is an example of my code.

times = ['201412120154', '201412110254']

df = pd.DataFrame()
df['time'] = pd.to_datetime(times, utc=True)

df.time.to_sql('test', engine, 
         dtype={'time': sqlalchemy.TIMESTAMP(timezone=True)})

The error I recieve is:

TypeError: issubclass() arg 1 must be a class

The following code works but obviously results in a postgresql column that is not timezone aware.

times = ['201412120154', '201412110254']

df = pd.DataFrame()
df['time'] = pd.to_datetime(times, utc=True)

df.time.to_sql('test', engine, 
               dtype={'time': sqlalchemy.TIMESTAMP})

I'm using python 2.7, pandas 0.15.2, postsgresql 9.3 and SQLAlchemy 0.9.7

joris · Accepted Answer

Update: this is fixed in 0.16

This is a bug in pandas 0.15.2 that hinders you from providing an instantiated sqlalchemy type with arguments (like TIMESTAMP(timezone=True) instead of TIMESTAMP). This will be fixed in a next version, but for now you can use the patch below.

I will post the workaround here as well. If you run this, you will be able to specify sqlalchemy types instantiated with arguments to the dtype keyword in to_sql:

from pandas.io.sql import SQLTable

def to_sql(self, frame, name, if_exists='fail', index=True,
           index_label=None, schema=None, chunksize=None, dtype=None):
    """
    patched version of https://github.com/pydata/pandas/blob/v0.15.2/pandas/io/sql.py#L1129
    """
    if dtype is not None:
        from sqlalchemy.types import to_instance, TypeEngine
        for col, my_type in dtype.items():
            if not isinstance(to_instance(my_type), TypeEngine):
                raise ValueError('The type of %s is not a SQLAlchemy '
                                 'type ' % col)

    table = SQLTable(name, self, frame=frame, index=index,
                     if_exists=if_exists, index_label=index_label,
                     schema=schema, dtype=dtype)
    table.create()
    table.insert(chunksize)

pd.io.sql.SQLDatabase.to_sql = to_sql

df.to_sql gives TypeError with dtype=sqlalchemy.timestamp(timezone=True)

Answers (1)

Related Questions