Reputation: 109
I'm doing this in Python, but I suspect there is a faster way.
After calling pd.get_dummies(dataset[a column name])
on an ordinal variable, I manually check the number of resulting columns and append 1, 2, 3, ... to each new column name.
Is there a more efficient way to have Python create the dummies for ordinal variables and rename the columns with sequential numbers? (For example, if I give g, the columns are renamed g1, g2, g3.)
dummie_g = pd.get_dummies(d["gen"])
dummie_g.describe()
dummie_g.columns = ['g1','g2','g3']
dummie_e=pd.get_dummies(d["educ"])
dummie_e.describe()
dummie_e.columns = ['e1','e2','e3','e4']
dummie_a=pd.get_dummies(d["type"])
dummie_a.describe()
dummie_a.columns=['a1','a2','a3','a4','a5','a6']
dummie_n=pd.get_dummies(d["name"])
dummie_n.describe()
dummie_n.columns=['n1','n2']
dummie_dpt=pd.get_dummies(d["dpt"])
dummie_dpt.describe()
dummie_dpt.columns=['h1','h2','h3','h4','h5','h6','h7','h8','h9','h10','h11','h12','h13','h14','h15']
Upvotes: 2
Views: 4935
Reputation: 30605
get_dummies has a parameter called prefix
that adds a prefix to the names of the dummy columns it creates. You can use it like this:
pd.get_dummies(d["gen"],prefix='g')
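To see what prefix actually produces, here is a minimal sketch. The DataFrame below is a made-up stand-in for the asker's d, and the category values are assumptions for illustration only. Note that the generated names have the form prefix, separator, category value, not a running number.

```python
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame `d`; the values are invented.
d = pd.DataFrame({"gen": ["low", "mid", "high", "mid"]})

# Each dummy column is named "<prefix>_<category value>".
dummies = pd.get_dummies(d["gen"], prefix="g")
print(list(dummies.columns))

# prefix_sep controls the separator between the prefix and the value.
dummies_nosep = pd.get_dummies(d["gen"], prefix="g", prefix_sep="")
print(list(dummies_nosep.columns))
```

If the category values are already 1, 2, 3, passing prefix_sep="" should give g1, g2, g3 directly.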
An improved version of your code might be:
dfs = {}
# Use a dict instead of creating n separate variables.
for i, j in zip(["gen", "educ", "type", "name", "dpt"], ["g", "e", "a", "n", "h"]):
    dfs['dummies_' + j] = pd.get_dummies(d[i], prefix=j)
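If you want the columns numbered exactly as in the question (g1, g2, g3, ...) regardless of the category values, one possible approach is to rename the columns after get_dummies. This is only a sketch: numbered_dummies is a hypothetical helper name, not a pandas function, and the sample data is invented.

```python
import pandas as pd

def numbered_dummies(series, prefix):
    """One-hot encode `series` and name the columns prefix1, prefix2, ..."""
    dummies = pd.get_dummies(series)
    # Number the columns in the order get_dummies returns them
    # (sorted category values for plain numeric or string columns).
    dummies.columns = [f"{prefix}{k}" for k in range(1, dummies.shape[1] + 1)]
    return dummies

# Hypothetical stand-in for the asker's DataFrame `d`.
d = pd.DataFrame({"gen": [1, 2, 3, 2], "educ": [10, 20, 30, 40]})

# Map each source column to the prefix wanted for its dummies.
prefixes = {"gen": "g", "educ": "e"}
dfs = {f"dummies_{p}": numbered_dummies(d[col], p) for col, p in prefixes.items()}

print(list(dfs["dummies_g"].columns))
print(list(dfs["dummies_e"].columns))
```

The numbering follows the sorted order of the category values, so it matches the ordinal order only if the values themselves sort that way.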
Upvotes: 5