Reputation: 4482
I have the following pandas
dataframes:
import pandas as pd
import numpy as np
df = pd.DataFrame({'par': ['par_scl', 'par_scl', 'par_scl', 'par_loc','par_pot'],
'var': ['all', 'all', 'all', 'all','var1'],
'terc': [1, 2, 3, np.nan,np.nan],
'value': [1,2,3,4,5]})
par var terc value
0 par_scl all 1.0 1
1 par_scl all 2.0 2
2 par_scl all 3.0 3
3 par_loc all NaN 4
4 par_pot var1 NaN 5
vars_df = pd.DataFrame({'var': ['var1', 'var2']})
var
0 var1
1 var2
I would like to end up with
final_df = pd.DataFrame({'par':['par_scl','par_scl','par_scl','par_scl','par_scl','par_scl','par_loc','par_loc','par_pot'],
'var':['var1','var2','var1','var2','var1','var2','var1','var2','var1'],
'terc':[1,1,2,2,3,3,np.nan,np.nan,np.nan],
'value':[1,1,2,2,3,3,4,4,5]})
par var terc value
0 par_scl var1 1.0 1
1 par_scl var2 1.0 1
2 par_scl var1 2.0 2
3 par_scl var2 2.0 2
4 par_scl var1 3.0 3
5 par_scl var2 3.0 3
6 par_loc var1 NaN 4
7 par_loc var2 NaN 4
8 par_pot var1 NaN 5
In words: If the var
is all
then I would like the combinations of var
and terc
for every par
otherwise just the outer merge.
Is there an efficient way of doing this ?
Upvotes: 1
Views: 47
Reputation: 862661
One solution is create cross join by filtered rows, remove helper columns and add rows with no all
values:
df1 = df[df['var'] == 'all'].assign(a=1).merge(vars_df.assign(a=1), on='a', suffixes=('_',''))
df2 = (df1.drop(['var_','a'], axis=1)
.append(df[df['var'] != 'all'], sort=False, ignore_index=True))
print (df2)
par terc value var
0 par_scl 1.0 1 var1
1 par_scl 1.0 1 var2
2 par_scl 2.0 2 var1
3 par_scl 2.0 2 var2
4 par_scl 3.0 3 var1
5 par_scl 3.0 3 var2
6 par_loc NaN 4 var1
7 par_loc NaN 4 var2
8 par_pot NaN 5 var1
Upvotes: 2