Reputation: 4434
Basically, I would like to create a new data frame from some existing data frames by generating all possible combinations of their column values. This is quite easy in SAS (or with the expand.grid function in R):
create table combine_var as
select *
from var_1, var_2;
But I am not sure what the equivalent is in Python. For instance, my data frames look like:
var_1 = pd.DataFrame({'val_1': [0.00789, 0.01448, 0.03157]})
var_2 = pd.DataFrame({'val_2': [0.5, 1.0]})
And I expect the output to be:
val_1 val_2
0.00789 0.5
0.00789 1.0
0.01448 0.5
0.01448 1.0
0.03157 0.5
0.03157 1.0
Upvotes: 3
Views: 1378
Reputation: 31672
You could use expand_grid, which is mentioned in the pandas docs cookbook:
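import itertools
import pandas as pd

def expand_grid(data_dict):
    # One row per combination of the value lists, in dictionary key order
    rows = itertools.product(*data_dict.values())
    return pd.DataFrame.from_records(rows, columns=data_dict.keys())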
expand_grid({'val_1': [0.00789, 0.01448, 0.03157], 'val_2' : [0.5, 1.0]})
In [107]: expand_grid({'val_1': [0.00789, 0.01448, 0.03157], 'val_2' : [0.5, 1.0]})
Out[107]:
val_1 val_2
0 0.00789 0.5
1 0.00789 1.0
2 0.01448 0.5
3 0.01448 1.0
4 0.03157 0.5
5 0.03157 1.0
EDIT
For existing dataframes, you will first need to build a single dictionary from them. You could merge the per-dataframe dictionaries into one using one of the answers to that question. Example for your case:
expand_grid(dict(var_1.to_dict('list'), **var_2.to_dict('list')))
In [122]: expand_grid(dict(var_1.to_dict('list'), **var_2.to_dict('list')))
Out[122]:
val_1 val_2
0 0.00789 0.5
1 0.00789 1.0
2 0.01448 0.5
3 0.01448 1.0
4 0.03157 0.5
5 0.03157 1.0
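Note that expand_grid combines the columns of the merged dictionary independently, so it matches the SAS query only when each dataframe has a single column. If the inputs have several columns and you want every row of one paired with every row of the other, a cross join may be closer to what the SAS query does. A minimal sketch, assuming pandas 1.2 or newer (where merge accepts how="cross"); the name combine_var is simply borrowed from the SAS query in the question:

```python
import pandas as pd

var_1 = pd.DataFrame({"val_1": [0.00789, 0.01448, 0.03157]})
var_2 = pd.DataFrame({"val_2": [0.5, 1.0]})

# Row-wise Cartesian product: every row of var_1 paired with every row of var_2.
# how="cross" is an assumption about the installed version (pandas >= 1.2).
combine_var = var_1.merge(var_2, how="cross")
print(combine_var)
```

The result has len(var_1) * len(var_2) rows, with the rows of var_1 repeated in their original order.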
Upvotes: 4