Reputation: 21
I would like to create dummies based on column values...
This is what the df looks like
I want to create this
This is my approach so far:
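import pandas as pd

df = pd.read_csv('test.csv')
v = df.Values

# Collect every distinct value that appears in the comma-separated column
v_set = set()
for line in v:
    line = line.split(',')
    for x in line:
        if x != "":
            v_set.add(x)
        else:
            continue

# Add one empty column per distinct value
for val in v_set:
    df[val] = ''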
With the above code I am able to create columns in my df like this
How do I go about updating the row values to create dummies? This is where I am having problems.
Thanks in advance.
Upvotes: 1
Views: 513
Reputation: 7994
You could use pandas.Series.str.get_dummies. This will allow you to split the column directly on a delimiter.
df = pd.concat([df.ID, df.Values.str.get_dummies(sep=",")], axis=1)
   ID  1  2  3  4
0   1  1  1  0  0
1   2  0  0  1  1
df.Values.str.get_dummies(sep=",") will generate
   1  2  3  4
0  1  1  0  0
1  0  0  1  1
Then we use pd.concat with axis=1 to glue the ID column and the dummy columns together side by side.
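For anyone who wants to try this without the CSV, here is a minimal self-contained sketch. The sample data is an assumption, reconstructed from the output shown above (the original file is not included in the question), and the names `dummies` and `result` are just illustrative.

```python
import pandas as pd

# Assumed sample data, inferred from the output above; the real test.csv may differ
df = pd.DataFrame({"ID": [1, 2], "Values": ["1,2", "3,4"]})

# Split each string on "," and build one 0/1 indicator column per distinct value
dummies = df["Values"].str.get_dummies(sep=",")

# Place the ID column and the indicator columns side by side
result = pd.concat([df["ID"], dummies], axis=1)
print(result)
```

Note that the new column labels are strings ("1", "2", ...), since they come from splitting text, so select them with `result["1"]` rather than `result[1]`.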
Upvotes: 1