Reputation: 349
I have created a very large DataFrame in pandas, similar to the following:
             0         1
user
0     product4  product0
1     product3  product1
I want to use something like pd.get_dummies() so that the final df looks like this:
      product0  product1  product2  product3  product4
user
0            1         0         0         0         1
1            0         1         0         1         0
instead of getting the following from pd.get_dummies():
      0_product3  0_product4  1_product0  1_product1
user
0              0           1           1           0
1              1           0           0           1
In summary, I do not want the original column labels (0, 1) to be combined into the names of the binary columns; I want one binary column per product. Thanks a lot!
Upvotes: 2
Views: 44
Reputation: 111
import pandas as pd
df = pd.get_dummies(df, prefix='', prefix_sep='')  # empty prefix and separator: no "0_" / "1_" in the dummy column names
df = df.sort_index(axis=1)  # sort the columns by name
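One caveat with dropping the prefix: if the same product shows up in both original columns (for different users, say), get_dummies will produce two columns with the same name. Below is a minimal sketch of one way to merge them, assuming a third user that is not in the question's data and was added only to trigger the duplicate. `dtype=int` is passed because recent pandas versions return booleans by default.

```python
import pandas as pd

# Frame from the question plus a hypothetical third user (assumption),
# whose products sit in the opposite columns from user 0.
df = pd.DataFrame(
    {0: ["product4", "product3", "product0"],
     1: ["product0", "product1", "product4"]},
    index=pd.Index([0, 1, 2], name="user"),
)

# Without a prefix, "product0" and "product4" each appear twice as column names.
dummies = pd.get_dummies(df, prefix="", prefix_sep="", dtype=int)

# Merge same-named columns: transpose, group by label, take the max, transpose back.
merged = dummies.T.groupby(level=0).max().T
print(merged)
```

The transposed groupby is used instead of grouping along axis=1 directly, since that form is deprecated in newer pandas releases. The result has one column per product that actually occurs, already sorted by name.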
Upvotes: 1
Reputation: 76917
Use reindex with get_dummies:
In [539]: dff = pd.get_dummies(df, prefix='', prefix_sep='')
In [540]: s = dff.columns.str.extract(r'(\d+)$', expand=False).astype(int)  # trailing number, also for product10 and up
In [541]: cols = 'product' + pd.RangeIndex(s.min(), s.max()+1).astype(str)
In [542]: dff.reindex(columns=cols, fill_value=0)
Out[542]:
      product0  product1  product2  product3  product4
user
0            1         0         0         0         1
1            0         1         0         1         0
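If the full list of products is known up front, an alternative that avoids parsing numbers out of the column names is to declare the products as categories, so that get_dummies emits a column for every one of them, including those that never occur (product2 here). This is only a sketch: `all_products` is a hypothetical catalogue, not something given in the question, and `dtype=int` is passed because recent pandas versions return booleans by default.

```python
import pandas as pd

# Frame from the question.
df = pd.DataFrame(
    {0: ["product4", "product3"], 1: ["product0", "product1"]},
    index=pd.Index([0, 1], name="user"),
)

# Assumption: the complete product catalogue is known in advance.
all_products = [f"product{i}" for i in range(5)]

# Long format: one row per (user, product) pair, keeping the user index.
long = df.melt(ignore_index=False)["value"]

# With a categorical dtype, get_dummies creates a column for every category,
# whether or not it occurs in the data.
long = long.astype(pd.CategoricalDtype(categories=all_products))

# One row per user: a product is flagged if it occurs in any original column.
result = pd.get_dummies(long, dtype=int).groupby(level=0).max()
result.columns = result.columns.astype(str)  # plain labels instead of categorical ones
print(result)
```

Because each user's rows are collapsed with groupby, this also copes with the same product appearing in more than one of the original columns.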
Upvotes: 2