Python Create Combinations from Multiple Data Frames

Question

Basically, I would like to create a new data frame from some existing data frames by generating all possible combinations of their column values (a Cartesian product). This is quite easy in SAS (or with the expand.grid function in R):

create table combine_var as
select *
from var_1, var_2;

But I am not sure what the equivalent is in Python. For instance, my data frames look like this:

import pandas as pd

var_1 = pd.DataFrame({'val_1': [0.00789, 0.01448, 0.03157]})
var_2 = pd.DataFrame({'val_2': [0.5, 1.0]})

The expected output is:

val_1   val_2
0.00789 0.5
0.00789 1.0
0.01448 0.5
0.01448 1.0
0.03157 0.5
0.03157 1.0

Anton Protopopov · Accepted Answer

You could use expand_grid, which is mentioned in the pandas docs cookbook:

import itertools
import pandas as pd

def expand_grid(data_dict):
    rows = itertools.product(*data_dict.values())  # Cartesian product of all value lists
    return pd.DataFrame.from_records(rows, columns=data_dict.keys())

expand_grid({'val_1': [0.00789, 0.01448, 0.03157], 'val_2' : [0.5, 1.0]})

In [107]: expand_grid({'val_1': [0.00789, 0.01448, 0.03157], 'val_2' : [0.5, 1.0]})
Out[107]:
     val_1  val_2
0  0.00789    0.5
1  0.00789    1.0
2  0.01448    0.5
3  0.01448    1.0
4  0.03157    0.5
5  0.03157    1.0
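
To see why this produces the rows in that order, here is a minimal sketch of what itertools.product yields for the two value lists on their own. The variable names val_1, val_2 and rows are only for illustration.

```python
import itertools

val_1 = [0.00789, 0.01448, 0.03157]
val_2 = [0.5, 1.0]

# product() cycles through the rightmost input fastest, so every val_1
# entry is paired with each val_2 entry before moving to the next val_1.
rows = list(itertools.product(val_1, val_2))

print(len(rows))  # 6 (3 values * 2 values)
print(rows[:2])   # [(0.00789, 0.5), (0.00789, 1.0)]
```

Each tuple becomes one row of the resulting DataFrame, and the dictionary keys become the column names.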

EDIT

For existing dataframes, you first need to build a single dictionary from them. Convert each dataframe with to_dict('list') and merge the resulting dictionaries into one, for example with dict(a, **b). Example for your case:

expand_grid(dict(var_1.to_dict('list'), **var_2.to_dict('list')))

In [122]: expand_grid(dict(var_1.to_dict('list'), **var_2.to_dict('list')))
Out[122]:
     val_1  val_2
0  0.00789    0.5
1  0.00789    1.0
2  0.01448    0.5
3  0.01448    1.0
4  0.03157    0.5
5  0.03157    1.0
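
Putting the pieces together, the following is a self-contained sketch of the approach for existing dataframes. It assumes the column names are unique across the dataframes (a repeated name would be silently overwritten when the dictionaries are merged). As a comparison it also uses merge with how='cross', which is a different technique from the cookbook recipe and assumes pandas 1.2 or newer.

```python
import itertools
import pandas as pd


def expand_grid(data_dict):
    """Build a DataFrame holding every combination of the listed values."""
    columns = list(data_dict.keys())
    rows = list(itertools.product(*data_dict.values()))
    return pd.DataFrame.from_records(rows, columns=columns)


var_1 = pd.DataFrame({'val_1': [0.00789, 0.01448, 0.03157]})
var_2 = pd.DataFrame({'val_2': [0.5, 1.0]})

# Merge the column -> values dictionaries of both frames.
# Assumption: no column name appears in more than one frame.
merged = {**var_1.to_dict('list'), **var_2.to_dict('list')}
combined = expand_grid(merged)

# Alternative (not the cookbook recipe): a cross join, pandas >= 1.2 assumed.
crossed = var_1.merge(var_2, how='cross')

print(combined)
print(crossed)
```

Both results should contain the same six rows. The cross join works directly on the dataframes, while expand_grid is convenient when the values are already plain lists.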
