Reputation:
I would like to know how I can get something like this:
Net 123 21 41 42 12 21
123 1 0 1 0 0 0
21 0 0 0 0 0 1
41 0 0 1 1 0 0
42 0 0 1 1 0 0
12 0 0 0 0 1 0
21 0 1 0 0 0 0
from the original dataset:
Net L
123 [123,41]
21 [21]
41 [41,42]
42 [42,41]
12 [12]
21 [21]
I thought of explode, but it works only on rows, not on columns.
Upvotes: 1
Views: 71
Reputation: 323276
We can use dot on the indicator columns to build the list column:
s = df.drop(columns='Net')  # the positional axis argument (df.drop('Net', 1)) was removed in pandas 2.0
df['New']=s.dot(s.columns+',').str[:-1].str.split(',')
df
Out[283]:
Net 123 21 41 42 12 21 New
0 123 1 0 1 0 0 0 [123, 41]
1 21 0 0 0 0 0 1 [21.1]
2 41 0 0 1 1 0 0 [41, 42]
3 42 0 0 1 1 0 0 [41, 42]
4 12 0 0 0 0 1 0 [12]
5 21 0 1 0 0 0 0 [21]
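For anyone who wants to try the dot approach in isolation, here is a minimal sketch on a small hypothetical frame (the data and the name `labels` are made up for illustration, and the column labels are assumed to be unique strings, which avoids the mangled `21.1` label seen above):

```python
import pandas as pd

# Hypothetical indicator frame; column labels must be strings so that
# "label," can be built by concatenation.
df = pd.DataFrame(
    {
        "Net": [123, 41, 12],
        "123": [1, 0, 0],
        "41": [1, 1, 0],
        "42": [0, 1, 0],
        "12": [0, 0, 1],
    }
)

s = df.drop(columns="Net")
# 1 * "41," keeps the label and 0 * "41," gives "", so the dot product
# concatenates the labels of the columns that hold a 1 in each row.
labels = s.dot(s.columns + ",").str[:-1].str.split(",")
df["New"] = labels
```

The trailing comma is stripped with `.str[:-1]` before splitting, so each row ends up with a plain list of column labels.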
Upvotes: 1
Reputation: 455
I assume the values in your column 'L' are str (not list), and each value is separated by a comma. If so, you can:
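# create a set of column names
columns = set()
for cols in df.L.unique():
    for col in cols.split(','):
        columns.add(col)
# generate columns; compare whole tokens, since str.contains would also match '12' inside '123'
tokens = df.L.str.split(',')
for col in columns:
    df[col] = tokens.apply(lambda xs: col in xs)
# change False/True to 0/1 (index with a list, because a set is not a valid indexer)
cols = sorted(columns)
df[cols] = df[cols].astype(int)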
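If 'L' holds real Python lists instead of strings, one possible sketch is to explode the lists into rows and then cross-tabulate them back into columns. The sample data below is an assumption based on the question (with the duplicate `21` row dropped so that every `Net` value is unique), and `exploded` and `indicator` are illustrative names:

```python
import pandas as pd

# Assumed sample data: 'L' holds lists of ints, 'Net' values are unique.
df = pd.DataFrame(
    {
        "Net": [123, 21, 41, 42, 12],
        "L": [[123, 41], [21], [41, 42], [42, 41], [12]],
    }
)

# One row per (Net, list element) pair.
exploded = df.explode("L")
# Count each pair; with unique Net values every count is 0 or 1.
indicator = pd.crosstab(exploded["Net"], exploded["L"].astype(int))
```

Note that `crosstab` sorts both the row and column labels, so the result is ordered by value rather than by first appearance. If `Net` values repeat, the counts can exceed 1 and would need to be clipped.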
Upvotes: 0