Borut Flis

Reputation: 16415

How to create binary representations of words in pandas column?

I have a column that contains lists of variable length. The lists hold short text values drawn from a small set, around 60 unique values altogether.

0    ["AC","BB"]
1    ["AD","CB", "FF"]
2    ["AA","CC"]
3    ["CA","BB"]
4    ["AA"]

I want to turn these values into columns in my DataFrame, where each column is 1 if the value appears in that row's list and 0 if not.

I know I could expand the lists, call unique, and set the results as new columns. But after that I don't know what to do.

Upvotes: 1

Views: 162

Answers (1)

Nk03

Reputation: 14949

Here's one way:

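df = pd.get_dummies(df.explode('val')).groupby(level=0).sum()

NOTE: This assumes the list column is named 'val'. explode('val') gives every list element its own row but keeps the original index, so groupby(level=0) groups the exploded rows by that index and sum() collapses them back into one row per original row. Older pandas versions also accepted .sum(level=0) as shorthand for this grouping; that form was removed in pandas 2.0.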
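For reference, here is a minimal runnable sketch of the same idea on the sample data from the question. The column name `val` and the variable names `exploded`, `dummies`, and `binary` are assumptions made for illustration. It explodes the column as a Series rather than the whole DataFrame, so the new columns are named after the values themselves instead of getting a `val_` prefix.

```python
import pandas as pd

# Sample data from the question; the column name "val" is an assumption.
df = pd.DataFrame({
    "val": [["AC", "BB"], ["AD", "CB", "FF"], ["AA", "CC"], ["CA", "BB"], ["AA"]]
})

# One row per list element; the original row index is repeated.
exploded = df["val"].explode()

# One indicator column per unique value (no prefix, since the input is a Series).
dummies = pd.get_dummies(exploded)

# Collapse back to one row per original index and normalize the dtype to int.
binary = dummies.groupby(level=0).sum().astype(int)

print(binary)
```

Because this sums the indicators, a value that appears twice in the same list would give 2 rather than 1. If that can happen in your data, adding `.clip(upper=1)` at the end keeps the result strictly binary.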

Upvotes: 1
