Using pandas.get_dummies

Question

So essentially I have a data frame with a bunch of columns, some of which I want to keep (stored in to_keep) and some other columns that I want to create categorical variables for using pandas.get_dummies (these are stored in to_change).

However, I can't seem to get the syntax right, and the examples I have seen (e.g. here: http://blog.yhat.com/posts/logistic-regression-and-python.html) don't seem to help.

Here's what I have at present:

new_df = df.copy()
dummies = pd.get_dummies(new_df[to_change])
new_df = new_df[to_keep].join(dummies)
return new_df

Any help on where I am going wrong would be appreciated, as the problem I keep running into is that this only adds categorical variables for the first column in to_change.

Ami Tavory · Accepted Answer

I must say I didn't completely understand the problem.

However, say your DataFrame is df, and you have a list of columns to_make_categorical.

The DataFrame with the non-categorical columns is:

wo_categoricals = df[[c for c in df.columns if c not in to_make_categorical]]

The DataFrames of the categorical expansions are:

categoricals = [pd.get_dummies(df[c], prefix=c) for c in to_make_categorical]
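To see what a single element of that list looks like, here is a minimal sketch on a made-up Series (the name "color" and its values are only for illustration, not from the question). The prefix argument is what keeps the indicator columns of different source columns from colliding:

```python
import pandas as pd

# Hypothetical data, purely for illustration.
colors = pd.Series(["red", "blue", "red"], name="color")

# One indicator column per distinct value, each named "<prefix>_<value>".
dummies = pd.get_dummies(colors, prefix="color")

print(list(dummies.columns))  # ['color_blue', 'color_red']
```

Depending on the pandas version, the indicator values come back as booleans or as small integers; the column names are the same either way.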

Now you could just concat them horizontally:

pd.concat([wo_categoricals] + categoricals, axis=1)
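Putting the steps together, here is a small end-to-end sketch on invented data (the columns "age", "color" and "size" are assumptions, not from the question). It also shows an alternative: pd.get_dummies accepts a columns argument that encodes the listed columns in one call and leaves the rest untouched. Note that when get_dummies is given a whole DataFrame without columns, it only encodes object, string, or category dtype columns, so numeric columns pass through unchanged; that may be what the question ran into, though the post does not confirm it.

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
df = pd.DataFrame({
    "age": [23, 35, 41],
    "color": ["red", "blue", "red"],
    "size": ["S", "M", "S"],
})
to_make_categorical = ["color", "size"]

# Approach from the answer: split, expand each column, concat horizontally.
wo_categoricals = df[[c for c in df.columns if c not in to_make_categorical]]
categoricals = [pd.get_dummies(df[c], prefix=c) for c in to_make_categorical]
result = pd.concat([wo_categoricals] + categoricals, axis=1)

# Alternative: let get_dummies do the selection via its `columns` argument.
one_call = pd.get_dummies(df, columns=to_make_categorical)

print(list(result.columns))
print(list(one_call.columns))
```

Both approaches should yield the same columns here: the untouched "age" column followed by the indicator columns for "color" and "size".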
