Reputation: 281
I have a pandas DataFrame like the one below:
import numpy as np
import pandas as pd
data = np.random.rand(18).reshape(-1, 6)
data = pd.DataFrame(data, columns = ['var1_x10', 'var2_x10', 'var3_x10', 'var1_x20', 'var2_x20', 'var3_x20'])
   var1_x10  var2_x10  var3_x10  var1_x20  var2_x20  var3_x20
0  0.171464  0.441099  0.936246  0.532478  0.128823  0.211489
1  0.917217  0.544899  0.589996  0.362159  0.774122  0.439542
2  0.094015  0.582171  0.573968  0.200833  0.257705  0.057575
As you can see, the columns are in fact two transformations of each original column: var1, var2 and var3. Now I'd like to create a mapping in the form of a dictionary, with the original column names as keys and lists of the transformed column names as values:
my_dict = {'var1': ['var1_x10', 'var1_x20'],
'var2': ['var2_x10', 'var2_x20'],
'var3': ['var3_x10', 'var3_x20']}
How can I do this?
Upvotes: 0
Views: 34
Reputation: 862661
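Convert the columns to a Series, use Series.groupby with a function that takes the part of each name before the underscore (via split), then convert each group to a list and the result to a dictionary: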
d = data.columns.to_series().groupby(lambda x: x.split('_')[0]).apply(list).to_dict()
print (d)
{'var1': ['var1_x10', 'var1_x20'],
'var2': ['var2_x10', 'var2_x20'],
'var3': ['var3_x10', 'var3_x20']}
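For readers who find the one-liner dense, here is the same chain unpacked into separate steps. This is only a sketch: the empty frame and the intermediate names (cols, groups, lists) are made up for illustration, and only the column names matter.

```python
import pandas as pd

# Hypothetical empty frame that only reproduces the column names from the question.
data = pd.DataFrame(columns=['var1_x10', 'var2_x10', 'var3_x10',
                             'var1_x20', 'var2_x20', 'var3_x20'])

# Step 1: a Series whose index and values are both the column names.
cols = data.columns.to_series()

# Step 2: a function passed to groupby is called on each index label,
# so every column name is grouped by the text before the first underscore.
groups = cols.groupby(lambda name: name.split('_')[0])

# Step 3: collect the column names of each group into a list.
lists = groups.apply(list)

# Step 4: turn the Series (prefix -> list) into a plain dictionary.
d = lists.to_dict()
print(d)
```

Passing a function to groupby works here because to_series() uses the column names as the index as well as the values.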
Another solution:
from collections import defaultdict
d = defaultdict(list)
for x in data.columns:
    d[x.split('_')[0]].append(x)
print(dict(d))
{'var1': ['var1_x10', 'var1_x20'],
'var2': ['var2_x10', 'var2_x20'],
'var3': ['var3_x10', 'var3_x20']}
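If the original names may themselves contain underscores (an assumption, since the question only shows names like var1), splitting on the last underscore instead of the first could be safer. A minimal sketch of the loop above wrapped in a helper; the function name group_by_prefix and its sep parameter are hypothetical:

```python
from collections import defaultdict

def group_by_prefix(columns, sep='_'):
    """Group column names by everything before the LAST occurrence of sep."""
    groups = defaultdict(list)
    for name in columns:
        # rsplit with maxsplit=1 cuts off only the final suffix;
        # a name without sep is used as its own key.
        prefix = name.rsplit(sep, 1)[0]
        groups[prefix].append(name)
    return dict(groups)

columns = ['var1_x10', 'var2_x10', 'var3_x10',
           'var1_x20', 'var2_x20', 'var3_x20']
my_dict = group_by_prefix(columns)
print(my_dict)
```

For the columns in the question this gives the same mapping as the two solutions above. It needs only the column names, so it can be called as group_by_prefix(data.columns).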
Upvotes: 3