bismo

Reputation: 1439

How to explode a list into new columns in pandas

Let's say I have the following df

  x
1 ['abc','bac','cab']
2 ['bac']
3 ['abc','cab']

And I would like each distinct list element to become its own column, with 1 where the row's list contains that element and 0 otherwise, like so:

   abc  bac  cab
1    1    1    1
2    0    1    0
3    1    0    1

I have looked at several related answers but can't seem to get this right.

Thanks!

Upvotes: 2

Views: 1256

Answers (2)

BENY

Reputation: 323236

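I would use MultiLabelBinarizer from scikit-learn:

import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()

# fit_transform returns a 0/1 array with one column per distinct label (mlb.classes_)
s = pd.DataFrame(mlb.fit_transform(df['x']), columns=mlb.classes_, index=df.index)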

Upvotes: 1

Henry Ecker

Reputation: 35636

One approach with str.join + str.get_dummies:

out = df['x'].str.join(',').str.get_dummies(',')

out:

   abc  bac  cab
0    1    1    1
1    0    1    0
2    1    0    1
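One caveat with the str.join approach: the separator must not occur inside any list element, otherwise that element gets split into several columns. A minimal sketch, assuming hypothetical data in which one label contains a comma (the data and the choice of '|' as separator are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical data: the label 'a,b' contains the separator used above
df = pd.DataFrame({'x': [['a,b', 'c'], ['c']]})

# Joining on ',' splits 'a,b' into two separate labels 'a' and 'b'
naive = df['x'].str.join(',').str.get_dummies(',')

# Joining on a character that never occurs in the labels keeps 'a,b' intact
safe = df['x'].str.join('|').str.get_dummies('|')

print(naive.columns.tolist())
print(safe.columns.tolist())
```

If no safe separator exists for your data, the explode-based approach avoids the problem, since it never round-trips through a joined string.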

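Or with explode + pd.get_dummies, then groupby max (on pandas 2.0 and later get_dummies returns booleans by default, so pass dtype=int to get 0/1):

out = pd.get_dummies(df['x'].explode(), dtype=int).groupby(level=0).max()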

out:

   abc  bac  cab
0    1    1    1
1    0    1    0
2    1    0    1
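If the indicator columns should sit next to the rest of the frame rather than stand alone, joining on the index should work, because groupby(level=0) restores one row per original index label. A rough sketch, assuming a hypothetical extra column 'id' that is not in the question:

```python
import pandas as pd

df = pd.DataFrame({
    'id': [10, 11, 12],  # hypothetical extra column, for illustration only
    'x': [['abc', 'bac', 'cab'], ['bac'], ['abc', 'cab']],
})

# One row per list element, one 0/1 column per distinct value,
# then collapse back to one row per original index label
dummies = pd.get_dummies(df['x'].explode(), dtype=int).groupby(level=0).max()

# Replace the list column with the indicator columns
result = df.drop(columns='x').join(dummies)
print(result)
```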

You can also use pd.crosstab after explode if you want counts instead of dummies:

s = df['x'].explode()
out = pd.crosstab(s.index, s)

out:

x      abc  bac  cab
row_0               
0        1    1    1
1        0    1    0
2        1    0    1

*Note: the output is the same here because no list repeats an element, but crosstab returns counts, so a value that appears twice in a row's list shows up as 2.
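To see the difference between counts and dummies, here is a small sketch on hypothetical data where the first list repeats 'abc' (the data is made up for illustration):

```python
import pandas as pd

# Hypothetical data: 'abc' appears twice in the first row's list
df = pd.DataFrame({'x': [['abc', 'abc', 'cab'], ['bac']]})
s = df['x'].explode()

# crosstab counts occurrences per (row, value) pair
counts = pd.crosstab(s.index, s)

# get_dummies + max only records presence or absence
dummies = pd.get_dummies(s, dtype=int).groupby(level=0).max()

print(counts)
print(dummies)
```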


Setup (DataFrame used above):

import pandas as pd

df = pd.DataFrame({
    'x': [['abc', 'bac', 'cab'], ['bac'], ['abc', 'cab']]
})

Upvotes: 3
