Reputation: 13814
Suppose I have a list like the one below:
L = [
    [11, ['Blue', 'Green', 'Yellow'], 1],
    [21, ['White', 'Green', 'Brown'], 0],
    [31, ['Orange', 'Yellow'], 0],
    [41, ['White', 'Orange', 'Brown'], 1],
]
# Each row is [Id, Colors, vote]
How can I convert this list to a DataFrame in which each color is also a column?
   Id  Blue  Green  Yellow  White  Brown  Orange  vote
0  11     1      1       1      0      0       0     1
1  21     0      1       0      1      1       0     0
2  31     0      0       1      0      0       1     0
3  41     0      0       0      1      1       1     1
Here, the value in a color column is 1 if that color is present for the row's Id and 0 otherwise.
I think I can do this iteratively. Is there a simpler way to do it?
Upvotes: 0
Views: 87
Reputation: 13910
Here's one (iterative) way to do it; I'm not sure how to do it vectorized.
from itertools import chain
import pandas as pd

L = [
    [11, ['Blue', 'Green', 'Yellow'], 1],
    [21, ['White', 'Green', 'Brown'], 0],
    [31, ['Orange', 'Yellow'], 0],
    [41, ['White', 'Orange', 'Brown'], 1],
]

# Distinct colors in first-seen order (a plain set has no stable order)
colors = list(dict.fromkeys(chain.from_iterable(row[1] for row in L)))

def row2obj(row):
    # One record per row: Id, a 0/1 flag for each color, then vote
    obj = {'Id': row[0]}
    obj.update({c: int(c in row[1]) for c in colors})
    obj['vote'] = row[2]
    return obj

df = pd.DataFrame.from_records([row2obj(row) for row in L])
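For the vectorized route, one possible sketch (not tested beyond this sample data) is to let pandas explode the color lists and one-hot encode them. It assumes a pandas version that provides Series.explode (0.25 or later); the names exploded, indicators, and result are just illustrative. The reindex step is only there to keep the colors in first-seen order, since get_dummies sorts the columns alphabetically.

```python
import pandas as pd

L = [
    [11, ['Blue', 'Green', 'Yellow'], 1],
    [21, ['White', 'Green', 'Brown'], 0],
    [31, ['Orange', 'Yellow'], 0],
    [41, ['White', 'Orange', 'Brown'], 1],
]

df = pd.DataFrame(L, columns=["Id", "Colors", "vote"])

# One entry per (original row, color); the original index label is repeated
exploded = df["Colors"].explode()

# One-hot encode each color, then collapse back to one row per original index
indicators = (
    pd.get_dummies(exploded, dtype=int)
    .groupby(level=0)
    .max()
    .reindex(columns=exploded.unique())  # restore first-seen color order
)

# Put Id first and vote last, aligned on the shared row index
result = pd.concat([df[["Id"]], indicators, df[["vote"]]], axis=1)
print(result)
```

This assumes every row has at least one color; an empty list would explode to NaN and would probably need to be handled separately.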
Upvotes: 1