quant
quant

Reputation: 4482

How to take the combinations of values in python?

I have the following pandas dataframes:

   import pandas as pd
   import numpy as np
   df = pd.DataFrame({'par': ['par_scl', 'par_scl', 'par_scl', 'par_loc','par_pot'],
                       'var': ['all', 'all', 'all', 'all','var1'],
                       'terc': [1, 2, 3, np.nan,np.nan],
                       'value': [1,2,3,4,5]})

       par   var  terc  value
0  par_scl   all   1.0      1
1  par_scl   all   2.0      2
2  par_scl   all   3.0      3
3  par_loc   all   NaN      4
4  par_pot  var1   NaN      5

    vars_df = pd.DataFrame({'var': ['var1', 'var2']})

    var
0  var1
1  var2

I would like to end up with

final_df = pd.DataFrame({'par':['par_scl','par_scl','par_scl','par_scl','par_scl','par_scl','par_loc','par_loc','par_pot'],
                             'var':['var1','var2','var1','var2','var1','var2','var1','var2','var1'],
                             'terc':[1,1,2,2,3,3,np.nan,np.nan,np.nan],
                             'value':[1,1,2,2,3,3,4,4,5]})

        par   var  terc  value
0  par_scl  var1   1.0      1
1  par_scl  var2   1.0      1
2  par_scl  var1   2.0      2
3  par_scl  var2   2.0      2
4  par_scl  var1   3.0      3
5  par_scl  var2   3.0      3
6  par_loc  var1   NaN      4
7  par_loc  var2   NaN      4
8  par_pot  var1   NaN      5

In words: If the var is all then I would like the combinations of var and terc for every par otherwise just the outer merge.

Is there an efficient way of doing this ?

Upvotes: 1

Views: 47

Answers (1)

jezrael
jezrael

Reputation: 862661

One solution is create cross join by filtered rows, remove helper columns and add rows with no all values:

df1 = df[df['var'] == 'all'].assign(a=1).merge(vars_df.assign(a=1), on='a', suffixes=('_',''))

df2 = (df1.drop(['var_','a'], axis=1)
          .append(df[df['var'] != 'all'], sort=False, ignore_index=True))
print (df2)
       par  terc  value   var
0  par_scl   1.0      1  var1
1  par_scl   1.0      1  var2
2  par_scl   2.0      2  var1
3  par_scl   2.0      2  var2
4  par_scl   3.0      3  var1
5  par_scl   3.0      3  var2
6  par_loc   NaN      4  var1
7  par_loc   NaN      4  var2
8  par_pot   NaN      5  var1

Upvotes: 2

Related Questions