Xaume

Reputation: 281

How to create a mapping dict from groups of columns

I have a Pandas DataFrame like below:

import numpy as np
import pandas as pd

data = np.random.rand(18).reshape(-1, 6)
data = pd.DataFrame(data, columns = ['var1_x10', 'var2_x10', 'var3_x10', 'var1_x20', 'var2_x20', 'var3_x20'])

    var1_x10    var2_x10    var3_x10    var1_x20    var2_x20    var3_x20
0   0.171464    0.441099    0.936246    0.532478    0.128823    0.211489
1   0.917217    0.544899    0.589996    0.362159    0.774122    0.439542
2   0.094015    0.582171    0.573968    0.200833    0.257705    0.057575

As you can see, the columns are in fact two transformations of each of the original columns var1, var2 and var3. Now I'd like to create a mapping in the form of a dictionary, with the original column names as keys and lists of the transformed column names as values:

my_dict = {'var1': ['var1_x10', 'var1_x20'], 
           'var2': ['var2_x10', 'var2_x20'], 
           'var3': ['var3_x10', 'var3_x20']}

How can I do this?

Upvotes: 0

Views: 34

Answers (1)

jezrael

Reputation: 862661

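Convert the columns to a Series, use Series.groupby with a function that splits each name on '_' and keeps the first part, then convert each group to a list and the result to a dict: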

d = data.columns.to_series().groupby(lambda x: x.split('_')[0]).apply(list).to_dict()
print(d)
{'var1': ['var1_x10', 'var1_x20'], 
 'var2': ['var2_x10', 'var2_x20'], 
 'var3': ['var3_x10', 'var3_x20']}

Another solution:

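from collections import defaultdict

# Missing keys start out as an empty list
d = defaultdict(list)

for x in data.columns:
    # The key is the part of the column name before the first underscore
    d[x.split('_')[0]].append(x)

print(dict(d))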
{'var1': ['var1_x10', 'var1_x20'], 
 'var2': ['var2_x10', 'var2_x20'], 
 'var3': ['var3_x10', 'var3_x20']}
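
Both solutions take the part before the first underscore as the key. If an original column name could itself contain an underscore, splitting from the right is one possible adjustment. A minimal sketch, assuming the transformation suffix is always the last underscore-separated part; the column names below are hypothetical and not from the question:

```python
from collections import defaultdict

# Hypothetical column names: the base name 'my_var' itself contains an underscore
columns = ['my_var_x10', 'other_x10', 'my_var_x20', 'other_x20']

d = defaultdict(list)
for col in columns:
    # rsplit with maxsplit=1 cuts only at the last underscore,
    # so 'my_var_x10' gives the key 'my_var' rather than 'my'
    base = col.rsplit('_', 1)[0]
    d[base].append(col)

mapping = dict(d)
print(mapping)
```

With the column names from the question this gives the same result as split('_')[0], since each name contains only one underscore.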

Upvotes: 3
