user6200992

Reputation: 311

How to create dummy variables on Ordinal columns in Python

I am new to Python. I have created dummy columns from a categorical column using pandas get_dummies. How do I create dummy columns from an ordinal column (say a column Rating with values 1, 2, 3, ..., 10)?

Upvotes: 1

Views: 2708

Answers (1)

piRSquared

Reputation: 294506

Consider the dataframe df:

import pandas as pd

df = pd.DataFrame(dict(Cats=list('abcdcba'), Ords=[3, 2, 1, 0, 1, 2, 3]))
df

  Cats  Ords
0    a     3
1    b     2
2    c     1
3    d     0
4    c     1
5    b     2
6    a     3

pd.get_dummies works the same way on either column when the column is passed on its own.

With df.Cats:

pd.get_dummies(df.Cats)

   a  b  c  d
0  1  0  0  0
1  0  1  0  0
2  0  0  1  0
3  0  0  0  1
4  0  0  1  0
5  0  1  0  0
6  1  0  0  0

With df.Ords:

pd.get_dummies(df.Ords)

   0  1  2  3
0  0  0  0  1
1  0  0  1  0
2  0  1  0  0
3  1  0  0  0
4  0  1  0  0
5  0  0  1  0
6  0  0  0  1
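
To keep these indicator columns alongside the original data, one option is to give them a prefix and join them back onto df. This is a minimal sketch, not the only way to do it: the names ord_dummies and result are made up for illustration, and dtype=int is passed so the output is 0/1 as in the tables here (newer pandas versions default to booleans).

```python
import pandas as pd

df = pd.DataFrame(dict(Cats=list('abcdcba'), Ords=[3, 2, 1, 0, 1, 2, 3]))

# Encode the numeric column on its own; prefix gives names like Ords_0, Ords_1, ...
ord_dummies = pd.get_dummies(df['Ords'], prefix='Ords', dtype=int)

# Attach the indicator columns to the original frame (aligned on the index)
result = df.join(ord_dummies)
print(result)
```

The same pattern should carry over to a column like Rating by swapping in that column name.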

With both columns, passing the whole dataframe:

pd.get_dummies(df)

   Ords  Cats_a  Cats_b  Cats_c  Cats_d
0     3       1       0       0       0
1     2       0       1       0       0
2     1       0       0       1       0
3     0       0       0       0       1
4     1       0       0       1       0
5     2       0       1       0       0
6     3       1       0       0       0

Notice that it split out Cats but left Ords as a single numeric column.

Let's expand on this by adding another Cats2 column and calling pd.get_dummies

pd.get_dummies(df.assign(Cats2=df.Cats))

   Ords  Cats_a  Cats_b  Cats_c  Cats_d  Cats2_a  Cats2_b  Cats2_c  Cats2_d
0     3       1       0       0       0        1        0        0        0
1     2       0       1       0       0        0        1        0        0
2     1       0       0       1       0        0        0        1        0
3     0       0       0       0       1        0        0        0        1
4     1       0       0       1       0        0        0        1        0
5     2       0       1       0       0        0        1        0        0
6     3       1       0       0       0        1        0        0        0

Interesting, it splits both object columns but not the numeric one.
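
If the numeric column should be split as well when passing the whole dataframe, two options that should work are naming it in the columns argument of pd.get_dummies, or converting it to the category dtype first. A hedged sketch follows: the variable names by_columns and by_category are illustrative, and dtype=int is only there to get 0/1 output, since newer pandas versions default to booleans.

```python
import pandas as pd

df = pd.DataFrame(dict(Cats=list('abcdcba'), Ords=[3, 2, 1, 0, 1, 2, 3]))

# Option 1: list the columns to encode explicitly, numeric ones included
by_columns = pd.get_dummies(df, columns=['Cats', 'Ords'], dtype=int)

# Option 2: mark the numeric column as categorical so it is picked up by default
by_category = pd.get_dummies(df.astype({'Ords': 'category'}), dtype=int)

print(by_columns.columns.tolist())
```

Both options produce one indicator column per distinct value, so the ordering of the ratings is not encoded; whether that matters depends on the model being fit.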

Upvotes: 2
