uday

Reputation: 6713

pandas dtype conversion from object to string

I have a CSV file in which a few columns are numeric and a few are strings. When I check myDF.dtypes, all the string columns show up as object.

  1. A related question asked earlier covers why this is done. Is it possible to recast the dtype from object to string?

  2. Also, in general, is there any easy way to recast the dtype from int64 and float64 to int32 and float32 and save on the size of the data (in memory / on disk)?

Upvotes: 10

Views: 19695

Answers (2)

Anshul Bisht

Reputation: 1654

import pandas as pd

df = pd.DataFrame({'name': ['a', 'b']}, dtype=object)  # example data; substitute your own DataFrame
print('dtype in object form:')
print(repr(df.dtypes[df.columns[0]]))    # prints: dtype('O')
print('\ndtype in string form:')
print(str(df.dtypes[df.columns[0]]))    # prints: object

Upvotes: 0

Jeff

Reputation: 128918

pandas represents all strings as variable-length Python objects, which is what the object dtype holds. You can call series.astype('S32') if you want a fixed-length type, but it will be recast to object if you then store it in a DataFrame or do much with it. pandas does this for simplicity.
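As a rough illustration of variable-length versus fixed-length storage, here is a minimal sketch. The sample values and the names s and fixed are made up for the example, and it only shows the NumPy side of the conversion; how a given pandas version treats the fixed-width result afterwards is not covered here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; object dtype is requested explicitly.
s = pd.Series(['a', 'bb', 'ccc'], dtype=object)
print(s.dtype)             # the column holds references to Python objects
print(type(s.iloc[0]))     # each element is an ordinary Python str

# NumPy can store the same values as fixed-width byte strings,
# reserving 32 bytes per element regardless of the actual length.
fixed = s.to_numpy().astype('S32')
print(fixed.dtype, fixed.dtype.itemsize)

# Values longer than the fixed width are silently truncated.
short = s.to_numpy().astype('S2')
print(short)
```

The fixed-width form trades flexibility for a predictable per-element size, which is why it tends to matter mostly for on-disk formats.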

Certain serialization formats, e.g. HDFStore, do store strings as fixed-length strings on disk, though.

You can call series.astype('int32') (or series.astype(np.int32)) if you would like, and the data will be stored as the new type.
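For the second part of the question, a small sketch of downcasting numeric columns and checking the memory saved. The DataFrame contents and the column names count and value are invented for illustration; with real data you would first confirm that the values fit in the narrower types.

```python
import numpy as np
import pandas as pd

# Hypothetical frame standing in for the data loaded from the CSV.
df = pd.DataFrame({
    'count': np.arange(1000, dtype='int64'),
    'value': np.linspace(0.0, 1.0, 1000),   # float64 by default
})
before = df.memory_usage(index=False).sum()

# Downcast each column; astype accepts a column-to-dtype mapping.
small = df.astype({'count': 'int32', 'value': 'float32'})
after = small.memory_usage(index=False).sum()

print(small.dtypes)
print(before, after)   # bytes used by the column data before and after
```

Keep in mind that int32 cannot hold values outside roughly plus or minus 2.1 billion, and float32 carries only about seven significant decimal digits, so the saving comes at the cost of range and precision.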

Upvotes: 3
