Nischal Reddy

Reputation: 21

How do I turn the values spread across multiple columns of a pandas DataFrame into its columns?

I have the following dataframe loaded up in Pandas.

print(pandaDf)

id   col1  col2  col3
12a  a     b     d
22b  d     a     b
33c  c     a     b

I am trying to turn the values found across these columns into columns of their own, so the output would look like this:

Desired output:

id   a  b  c  d
12a  1  1  0  1
22b  1  1  0  0
33c  1  1  1  0

I have tried adding a value column where the value = 1 and using a pivot table:

pandaDf['value'] = 1
column = ['col1', 'col2', 'col3']
pandaDf.pivot_table(index='id', values='value', columns=column)

However, the resulting DataFrame has multi-level (MultiIndex) columns, and the pandaDf.pivot() method does not allow multiple column values.

Please advise how I could do this so that the output has a single-level index.

Thanks for taking the time to read this, and I apologize if I have made any formatting errors in posting the question. I am still learning the proper Stack Overflow syntax.

Upvotes: 2

Views: 73

Answers (2)

Scott Boston

Reputation: 153510

You can use one-hot encoding to solve this problem. Here is one way to do it, using pd.get_dummies, splitting the column names into a MultiIndex, and summing over its second level:

df1 = df.set_index('id')
df_out = pd.get_dummies(df1)
df_out.columns = df_out.columns.str.split('_', expand=True)
df_out = df_out.sum(level=1, axis=1).reset_index()
print(df_out)

Output:

    id  a  c  d  b
0  12a  1  0  1  1
1  22b  1  0  1  1
2  33c  1  1  0  1
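
A note for newer pandas: the `level` argument of `DataFrame.sum` was deprecated in pandas 1.3 and removed in 2.0, so the snippet above only runs on older versions. Below is a minimal sketch of the same idea that should work on current releases. The sample frame is rebuilt from the question, the name `dummies` is just illustrative, and `dtype=int` is my addition so the result holds 0/1 integers rather than booleans.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "id": ["12a", "22b", "33c"],
    "col1": ["a", "d", "c"],
    "col2": ["b", "a", "a"],
    "col3": ["d", "b", "b"],
})

# One-hot encode every value column; dtype=int gives 0/1 instead of True/False.
dummies = pd.get_dummies(df.set_index("id"), dtype=int)

# Split names such as "col1_a" into ("col1", "a"); the letter becomes level 1.
dummies.columns = dummies.columns.str.split("_", expand=True)

# Stand-in for sum(level=1, axis=1): transpose, group rows by level 1,
# sum each group, then transpose back.
df_out = dummies.T.groupby(level=1).sum().T.reset_index()
print(df_out)
```

Because `groupby` sorts its keys by default, the columns here come out as a, b, c, d rather than in the a, c, d, b order shown above. Passing `sort=False` to `groupby` should keep the order of first appearance instead.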

Upvotes: 3

BENY

Reputation: 323326

Using get_dummies

pd.get_dummies(df.set_index('id'),prefix='', prefix_sep='').sum(level=0,axis=1)
Out[81]: 
     a  c  d  b
id             
12a  1  0  1  1
22b  1  0  1  1
33c  1  1  0  1
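
On pandas 2.0 and later, `sum(level=0, axis=1)` is no longer available, so here is a hedged sketch of the same one-liner idea written with a transpose and `groupby`. The frame is rebuilt from the question, `wide` and `result` are illustrative names, and `dtype=int` is an assumption on my part to keep integer 0/1 output.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "id": ["12a", "22b", "33c"],
    "col1": ["a", "d", "c"],
    "col2": ["b", "a", "a"],
    "col3": ["d", "b", "b"],
})

# Empty prefix and separator leave bare value labels, so the same letter
# can appear as a column name more than once.
wide = pd.get_dummies(df.set_index("id"), prefix="", prefix_sep="", dtype=int)

# Collapse the duplicate labels: transpose, group by label, sum, transpose back.
result = wide.T.groupby(level=0).sum().T
print(result)
```

The result keeps `id` as the index and has one column per distinct value; add `.reset_index()` if `id` should be an ordinary column.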

Upvotes: 2
