Reputation: 9338
The raw data is as follows:
all_names = ['Darren','John','Kate','Mike','Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']
What I want to achieve is a DataFrame with one row per list and one column per name, where each cell indicates whether that name appears in the list (1 if present, 0 if absent).
I tried looping over each list and building a new list that holds 1 for each name that is present and 0 for each one that is missing.
This is clumsy and tedious, especially as the number of lists grows.
new_list_0 = []
for _ in all_names:
    if _ not in list_0:
        new_list_0.append(0)
    else:
        new_list_0.append(1)
new_list_1 = []
for _ in all_names:
    if _ not in list_1:
        new_list_1.append(0)
    else:
        new_list_1.append(1)
import pandas as pd
data = [all_names, new_list_0,new_list_1]
column_names = data.pop(0)
df = pd.DataFrame(data, columns=column_names)
Output:
   Darren  John  Kate  Mike  Nancy
0       0     1     0     1      0
1       0     0     1     0      1
Is there a smarter way to do this?
Upvotes: 1
Views: 47
Reputation: 523
Using normal pandas operations and list comprehensions.
import pandas as pd
all_names = ['Darren','John','Kate','Mike','Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']
lists = [list_0, list_1]
# DataFrame.append was removed in pandas 2.0, so build every row first
# and construct the DataFrame in a single call.
rows = [[int(name in item) for name in all_names] for item in lists]
df = pd.DataFrame(rows, columns=all_names)
print(df)
Output
   Darren  John  Kate  Mike  Nancy
0       0     1     0     1      0
1       0     0     1     0      1
Upvotes: 1
Reputation: 8302
Using dict.fromkeys() + fillna:
import pandas as pd
all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']
df = (
    pd.DataFrame([dict.fromkeys(x, 1) for x in [list_0, list_1]],
                 columns=all_names)
).fillna(0)
   Darren  John  Kate  Mike  Nancy
0     0.0   1.0   0.0   1.0    0.0
1     0.0   0.0   1.0   0.0    1.0
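The missing keys become NaN before fillna runs, which is why the result above is float rather than the integer 0/1 the question asks for. A minimal sketch of one way to get integers, assuming a trailing astype(int) is acceptable (that cast is an addition here, not part of the original answer):

```python
import pandas as pd

all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']

df = (
    pd.DataFrame([dict.fromkeys(x, 1) for x in [list_0, list_1]],
                 columns=all_names)
    .fillna(0)     # names absent from a list come through as NaN
    .astype(int)   # added step: cast the float columns back to integers
)
print(df)
```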
Upvotes: 1
Reputation: 4618
You can use a pandas Series:
x = pd.Series(all_names)
pd.concat([x.isin(list_0), x.isin(list_1)], axis=1).astype(int).T
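As written, the snippet labels the resulting columns 0 to 4, because the Series carries the default integer index. A possible variant that keeps the names as column labels and handles any number of lists is sketched below; using the names as the Series index and collecting the inputs in a `lists` variable are assumptions added for illustration.

```python
import pandas as pd

all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']
lists = [list_0, list_1]  # assumed container; add further lists here

# Index the Series by the names themselves so they survive as column labels.
x = pd.Series(all_names, index=all_names)

# One boolean membership column per list, cast to 0/1, then transposed
# so that each list becomes a row.
df = pd.concat([x.isin(lst) for lst in lists], axis=1).astype(int).T
print(df)
```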
Upvotes: 1
Reputation: 323226
Let us try str.get_dummies and reindex:
df = pd.Series([list_0, list_1]).str.join(',').str.get_dummies(',').reindex(columns=all_names, fill_value=0)
Out[160]:
   Darren  John  Kate  Mike  Nancy
0       0     1     0     1      0
1       0     0     1     0      1
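One caveat with joining on ',' is that a name containing a comma would be split into two columns. A rough sketch of the same idea using '|' instead, which is the default separator of str.get_dummies; the sample names below are made up purely to show the case:

```python
import pandas as pd

# Hypothetical names that contain commas (not from the question).
names = ['Doe, Jane', 'Roe, Rick', 'Poe, Ann']
lists = [['Doe, Jane'], ['Poe, Ann', 'Roe, Rick']]

df = (
    pd.Series(lists)
    .str.join('|')       # join each list with a character absent from the names
    .str.get_dummies()   # splits on '|' by default
    .reindex(columns=names, fill_value=0)
)
print(df)
```

This still assumes that '|' never occurs inside a name; if it can, pick a different separator and pass it to both calls.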
Upvotes: 2