Reputation: 3801
I'm trying to import a database table for data analysis using pandas. I have a source table with multiple columns, like so:
ID float NOT NULL,
Name varchar(36) NOT NULL,
Address varchar(100)
When I pull it into a dataframe and run the following:
df.info()
I get:
ID float64
Name object
Address object
Is there a way to get it to read the exact data definition, i.e. "varchar(36)" instead of "object"?
This is reading from a Teradata table, if that makes a difference.
Thanks
Upvotes: 0
Views: 199
Reputation: 494
pandas relies on NumPy data types.
The relevant part of the pandas docs has more information, but here is the full type hierarchy copied from there:
numpy.generic
├── numpy.number
│   ├── numpy.integer
│   │   ├── numpy.signedinteger
│   │   │   ├── numpy.int8
│   │   │   ├── numpy.int16
│   │   │   ├── numpy.int32
│   │   │   ├── numpy.int64
│   │   │   ├── numpy.int64
│   │   │   └── numpy.timedelta64
│   │   └── numpy.unsignedinteger
│   │       ├── numpy.uint8
│   │       ├── numpy.uint16
│   │       ├── numpy.uint32
│   │       ├── numpy.uint64
│   │       └── numpy.uint64
│   └── numpy.inexact
│       ├── numpy.floating
│       │   ├── numpy.float16
│       │   ├── numpy.float32
│       │   ├── numpy.float64
│       │   └── numpy.float128
│       └── numpy.complexfloating
│           ├── numpy.complex64
│           ├── numpy.complex128
│           └── numpy.complex256
├── numpy.flexible
│   ├── numpy.character
│   │   ├── numpy.bytes_
│   │   └── numpy.str_
│   └── numpy.void
│       └── numpy.record
├── numpy.bool_
├── numpy.datetime64
└── numpy.object_
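If you want to see this hierarchy on your own installation (the exact entries vary slightly by NumPy version and platform), a small sketch that walks the subclasses of `numpy.generic`:

```python
import numpy as np

def print_tree(cls, depth=0):
    """Recursively print a class and its subclasses, indented by depth."""
    print("  " * depth + f"{cls.__module__}.{cls.__name__}")
    for sub in cls.__subclasses__():
        print_tree(sub, depth + 1)

# Walk the NumPy scalar-type hierarchy starting at its root
print_tree(np.generic)
```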
The bottom line is, I cannot see any dtype that would show something similar to varchar(#). Pandas assigns strings the dtype "object" by default.
In Python in general, there is no fixed- or semi-fixed-size string type as far as my knowledge goes (you can do fixed-width formatting for printing, though).
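To illustrate: a NumPy array of strings does carry a fixed width in its dtype, but pandas stores strings as generic Python objects, so any width information is gone once the data is in a DataFrame:

```python
import numpy as np
import pandas as pd

# A NumPy string array has a fixed-width dtype ('<U7' here,
# sized to the longest element, "Charlie")
arr = np.array(["Alice", "Bob", "Charlie"])
print(arr.dtype)  # <U7

# ...but pandas stores the same values as generic Python objects,
# so the width is not preserved
df = pd.DataFrame({"Name": arr})
print(df["Name"].dtype)  # object
```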
Upvotes: 1
Reputation: 11105
As far as I know, this is not possible. The varchar data type exists only in the Teradata database system, and is cast to a sensible pandas data type (str or unicode) once you pull it into a DataFrame.
An overview of data types in pandas, numpy, and python: http://pbpython.com/pandas_dtypes.html
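To illustrate the point: once the data is in a DataFrame, the string columns are just generic Python objects with no trace of the original varchar width. A toy frame mirroring the question's schema (no real Teradata connection, values made up):

```python
import pandas as pd

# Columns mirror the question: ID float, Name varchar(36), Address varchar(100)
df = pd.DataFrame({
    "ID": [1.0, 2.0],
    "Name": ["Ann", "Bo"],
    "Address": ["Main St", None],
})

print(df.dtypes)  # ID float64, Name object, Address object

# The "object" columns hold plain Python str values (or None);
# nothing records that Name was originally varchar(36)
print(df["Name"].map(type).unique())
```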
Upvotes: 1