Aditya Sharma

Reputation: 157

Does the get_dummies() function change the dtype of a column?

I have a DataFrame column, 'Cause', with dtype object that holds the following values:

Cause
Water
Fire
Earthquake
Flood

When I use the get_dummies() function on this column, I get 4 additional columns with binary values:

Water | Fire | Earthquake | Flood

My question is: all 4 of these additional columns have the dtype uint8. Do I need to convert them to int64?

Upvotes: 1

Views: 5480

Answers (4)

krishna parasad

Reputation: 21

You don't need to convert afterwards. When calling get_dummies() you can specify the dtype directly (pass the column itself, not a list containing its name):

pd.get_dummies(df['Column_name'], dtype=np.int64)
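As a minimal sketch of the call above, applied to the 'Cause' column from the question. The DataFrame here is rebuilt by hand from the four values in the question, so its exact contents are an assumption:

```python
import numpy as np
import pandas as pd

# Hypothetical frame rebuilt from the values listed in the question.
df = pd.DataFrame({"Cause": ["Water", "Fire", "Earthquake", "Flood"]})

# Ask for int64 up front instead of converting after the fact.
dummies = pd.get_dummies(df["Cause"], dtype=np.int64)

# Every dummy column now reports int64.
print(dummies.dtypes)
```

Passing dtype at creation time avoids a second pass over the data with astype().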

Upvotes: 2

Yash Nag

Reputation: 1255

uint8 is the default dtype that pandas gives the dummy columns (in pandas versions before 2.0; from 2.0 onward the default is bool).
You can always request a different dtype.

But remember that the dtype you pass is applied to all of the dummy columns. For example:

pd.get_dummies(df, columns=['col1'], dtype='str')

would create dummy columns that all have the dtype str.
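To illustrate that the dtype argument applies only to the generated dummy columns, here is a small sketch. The column names col1 and value and the choice of int8 are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical frame: one categorical column and one numeric column.
df = pd.DataFrame({"col1": ["a", "b", "a"], "value": [1.5, 2.5, 3.5]})

# Encode only col1; the requested dtype is used for every dummy column.
encoded = pd.get_dummies(df, columns=["col1"], dtype="int8")

# 'value' keeps its original float64 dtype, while col1_a and col1_b are int8.
print(encoded.dtypes)
```

Columns that are not encoded pass through with their existing dtype, so there is no way to give individual dummy columns different dtypes in a single call.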

Upvotes: 1

Biswanath

Reputation: 9185

Yes. If you don't specify dtype, the dummy columns are created as uint8 by default.

You can do something like this (with numpy imported as np):

pd.get_dummies(..., dtype=np.int64)

Upvotes: 5

U13-Forward

Reputation: 71600

Well, it's up to you; uint8 still behaves like an integer.

So you can use it like any other integer. You should also know that there is a str.get_dummies method, which returns int64 by default:

>>> df['Cause'].str.get_dummies()
   Earthquake  Fire  Flood  Water
0           0     0      0      1
1           0     1      0      0
2           1     0      0      0
3           0     0      1      0
>>> df['Cause'].str.get_dummies().dtypes
Earthquake    int64
Fire          int64
Flood         int64
Water         int64
dtype: object
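One case where the unsigned type can matter is subtraction, since uint8 cannot hold negative values. The sketch below requests uint8 explicitly so it does not depend on the installed pandas version's default; the DataFrame is rebuilt from the values in the question and is an assumption:

```python
import numpy as np
import pandas as pd

# Hypothetical frame rebuilt from the values listed in the question.
df = pd.DataFrame({"Cause": ["Water", "Fire", "Earthquake", "Flood"]})

# Request uint8 explicitly rather than relying on the default.
dummies = pd.get_dummies(df["Cause"], dtype=np.uint8)

# Summing works as with any integer column.
fire_count = dummies["Fire"].sum()

# Subtracting two uint8 columns wraps around instead of going negative.
diff_uint8 = dummies["Fire"] - dummies["Water"]

# Converting to a signed dtype first gives the expected negative value.
as_int64 = dummies.astype("int64")
diff_int64 = as_int64["Fire"] - as_int64["Water"]

print(diff_uint8.tolist())
print(diff_int64.tolist())
```

If the dummy columns are only used as 0/1 flags or model inputs, the conversion is usually unnecessary; it is mainly arithmetic that can go below zero that calls for a signed dtype.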

Upvotes: 1
