YOSUKE

Reputation: 351

Pandas: From Dictionary to DataFrame of Booleans

I am a new Python programmer, and I want to go from this dictionary,

dic = {"word1": ["a","b","c"], "word2": ["b", "d", "e"], "word3": ["a", "f", "c"]}

to this DataFrame object:

[Image: the target DataFrame]

I tried code like this

df = pd.DataFrame(index=["a","b","c","d","e","f"], columns=[])
for i in result:
    print("i",i)
    print("v", v)
    df2 = pd.DataFrame(i)
    df.append(df2)

Please help me work out how I should code this.

Upvotes: 0

Views: 198

Answers (2)

jezrael

Reputation: 863176

First convert the dict to a Series, then use MultiLabelBinarizer with the DataFrame constructor, and finally cast to boolean:

import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

d = {"word1": ["a","b","c"], "word2": ["b", "d", "e"], "word3": ["a", "f", "c"]}

# One list of labels per word
s = pd.Series(d)

# fit_transform returns a 0/1 matrix with one column per distinct label
mlb = MultiLabelBinarizer()

df = pd.DataFrame(mlb.fit_transform(s), columns=mlb.classes_, index=s.index).astype(bool)

Another solution uses str.join to join each list with |, which is the default separator in str.get_dummies:

df = s.str.join('|').str.get_dummies().astype(bool)

print (df)
           a      b      c      d      e      f
word1   True   True   True  False  False  False
word2  False   True  False   True   True  False
word3   True  False   True  False  False   True
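
To see why the str.join step is needed, here is a minimal sketch of the intermediate values. The variable names `joined` and `dummies` are my own and only for illustration; the calls themselves are the same ones used above.

```python
import pandas as pd

d = {"word1": ["a", "b", "c"], "word2": ["b", "d", "e"], "word3": ["a", "f", "c"]}
s = pd.Series(d)

# Each list is collapsed into a single string such as "a|b|c"
joined = s.str.join("|")

# str.get_dummies splits on "|" by default and builds one 0/1 column per token
dummies = joined.str.get_dummies()

# Cast the 0/1 indicators to booleans
df = dummies.astype(bool)
print(df)
```

Without the join, each cell holds a list rather than a string, so str.get_dummies has nothing to split on.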

Upvotes: 2

jpp

Reputation: 164773

Here is one way using pd.get_dummies:

import pandas as pd

d = {"word1": ["a","b","c"], "word2": ["b", "d", "e"], "word3": ["a", "f", "c"]}

df = pd.DataFrame.from_dict(d, orient='index')
df['values'] = df.values.tolist()

# Drop the original columns, then join the per-word indicator counts
df = df.drop(columns=df.columns)\
       .join(pd.get_dummies(df['values'].apply(pd.Series).stack()).groupby(level=0).sum())\
       .astype(bool)

Result

           a      b      c      d      e      f
word1   True   True   True  False  False  False
word2  False   True  False   True   True  False
word3   True  False   True  False  False   True

Explanation

  • Create a pd.Series of lists for each word.
  • Apply pd.get_dummies to this series with some manipulation.
  • Convert type from int to bool for display purposes.
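
The three steps above can be sketched one at a time as follows. This is not the exact chain from the answer: it assumes a pandas version that provides Series.explode and uses it in place of apply(pd.Series).stack(), which is my substitution. The names `lists`, `long_form` and `counts` are also mine.

```python
import pandas as pd

d = {"word1": ["a", "b", "c"], "word2": ["b", "d", "e"], "word3": ["a", "f", "c"]}

# Step 1: a Series holding one list of letters per word
lists = pd.Series(d)

# Step 2: one row per (word, letter) pair, then one indicator column per letter;
# grouping on the index collapses the rows back to one per word
long_form = lists.explode()
counts = pd.get_dummies(long_form).groupby(level=0).sum()

# Step 3: convert the counts to booleans
df = counts.astype(bool)
print(df)
```

Splitting the chain this way makes it easy to print `long_form` or `counts` and inspect each stage.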

Upvotes: 1
