Reputation: 177
Suppose I have this df:
col1 col2 col3 col4
A B B A
B C C D
D null D null
And a list
list1 = ["A","B","C","D"]
How do I create a new df that has the values of the list as its first column and, for each column of the old df, a 1 if that value appears in the column and a 0 otherwise?
Expected output:
list1 col1 col2 col3 col4
A 1 0 0 1
B 1 1 1 0
C 0 1 1 0
D 1 0 1 1
Upvotes: 1
Views: 229
Reputation: 13387
Try:
res = pd.DataFrame(0, index=list1, columns=df.columns)
res.loc[:, :] = df.stack().reset_index().pivot_table(index=0, columns="level_1", aggfunc="count").notna().astype(int).droplevel(0, axis=1)
Outputs:
>>> res
col1 col2 col3 col4
A 1 0 0 1
B 1 1 1 0
C 0 1 1 0
D 1 0 1 1
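If some value in list1 might not occur anywhere in df, a per-column membership test may be easier to reason about than the pivot. The following is only a sketch under assumptions: the sample frame is rebuilt by hand from the question, with null taken to mean NaN.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; "null" is assumed to be NaN.
df = pd.DataFrame({
    "col1": ["A", "B", "D"],
    "col2": ["B", "C", np.nan],
    "col3": ["B", "C", "D"],
    "col4": ["A", "D", np.nan],
})
list1 = ["A", "B", "C", "D"]

# For each column, flag which entries of list1 occur in it,
# then convert the booleans to 0/1.
res = pd.DataFrame(
    {col: pd.Index(list1).isin(df[col]) for col in df.columns},
    index=list1,
).astype(int)

print(res)
```

Because the result is built directly on list1, a value that never appears in df simply gets a row of zeros.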
Upvotes: 1
Reputation: 150745
This is essentially a crosstab:
df.melt().groupby('value')['variable'].value_counts().unstack(fill_value=0)
Output:
variable col1 col2 col3 col4
value
A 1 0 0 1
B 1 1 1 0
C 0 1 1 0
D 1 0 1 1
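For comparison, here is a sketch of the same idea written with pd.crosstab directly. It assumes the sample frame from the question (null taken as NaN). The reindex and clip steps are additions not present in the one-liner above: they keep the rows in list1 order, fill in list values that never occur, and cap repeated values at 1.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; "null" is assumed to be NaN.
df = pd.DataFrame({
    "col1": ["A", "B", "D"],
    "col2": ["B", "C", np.nan],
    "col3": ["B", "C", "D"],
    "col4": ["A", "D", np.nan],
})
list1 = ["A", "B", "C", "D"]

# Long format: one row per (column name, cell value), missing cells dropped.
long = df.melt().dropna(subset=["value"])

# Count how often each value occurs in each original column.
out = pd.crosstab(long["value"], long["variable"])

# Align to list1 and the original column order, then cap counts at 1.
out = out.reindex(index=list1, columns=df.columns, fill_value=0).clip(upper=1)

print(out)
```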
Upvotes: 1