Reputation: 3208
Assuming I have the following pandas.DataFrame:
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'val': [5, 5, 10],
                   'trig_aaa': [1, 0, 1], 'trig_bbb': [0, 1, 1], 'trig_ccc': [0, 0, 1]})
print(df)
   id  val  trig_aaa  trig_bbb  trig_ccc
0   1    5         1         0         0
1   2    5         0         1         0
2   3   10         1         1         1
I'd like to turn it into the following df:
   id  val             trig
0   1    5            [aaa]
1   2    5            [bbb]
2   3   10  [aaa, bbb, ccc]
Is there an elegant (hopefully pre-built) way to do this in Pandas/Python/NumPy?
After looking at jpps' comment, a better way to process the DataFrame would look like this:
   id  val trig
0   1    5  aaa
1   2    5  bbb
2   3   10  aaa
3   3   10  bbb
4   3   10  ccc
Upvotes: 1
Views: 260
Reputation: 164843
You can use pd.melt:
# rename columns (this modifies df in place, stripping the 'trig_' prefix) and melt dataframe
df.columns = [i if '_' not in i else i.split('_')[1] for i in df]
res = pd.melt(df, id_vars=['id', 'val'], var_name='trig')
# keep rows where the flag is 1, drop the helper 'value' column, and sort
res = (res[res['value'].eq(1)]
       .drop(columns='value')
       .sort_values(['id', 'trig'])
       .reset_index(drop=True))
print(res)
   id  val trig
0   1    5  aaa
1   2    5  bbb
2   3   10  aaa
3   3   10  bbb
4   3   10  ccc
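If you would rather not overwrite df.columns, or you still want the list-per-row form from the first part of the question, something along these lines should work. This is only a sketch: the names trig_cols, long_df and list_df are my own, and it assumes every flag column starts with the literal prefix 'trig_'.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'val': [5, 5, 10],
                   'trig_aaa': [1, 0, 1], 'trig_bbb': [0, 1, 1], 'trig_ccc': [0, 0, 1]})

# pick the flag columns explicitly so df itself is left untouched
# (assumption: all flag columns share the 'trig_' prefix)
trig_cols = [c for c in df.columns if c.startswith('trig_')]
long_df = df.melt(id_vars=['id', 'val'], value_vars=trig_cols, var_name='trig')

# keep rows where the flag is set, drop the helper column, strip the prefix
long_df = long_df[long_df['value'].eq(1)].drop(columns='value')
long_df['trig'] = long_df['trig'].str.replace('trig_', '', regex=False)
long_df = long_df.sort_values(['id', 'trig']).reset_index(drop=True)

# optional: collapse to one row per (id, val) holding a list of triggers
list_df = long_df.groupby(['id', 'val'])['trig'].agg(list).reset_index()

print(long_df)
print(list_df)
```

The long format is usually easier to filter and join on; the list column is mostly useful for display, since most vectorised pandas operations do not work on lists stored in cells.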
Upvotes: 2