Reputation: 96300
Consider a dictionary holding iterables of different length:
{'column_1': range(10),
'column_2': range(3),
'column_3': ['foo']}
I would like to create a dataframe that includes the full cartesian product of these entries. That is:
column_1, column_2, column_3
0 0 'foo'
0 1 'foo'
0 2 'foo'
1 0 'foo'
1 1 'foo'
1 2 'foo'
...
9 2 'foo'
How can I do this in pandas? Perhaps using collections?
Upvotes: 3
Views: 2179
Reputation: 5433
This is "a bit" late, but here is a full pandas solution.
First construct a MultiIndex from the cartesian product of the dictionary values, using pandas.MultiIndex.from_product. The dictionary keys are used to name the index levels.
Then convert each index level to a DataFrame column using pandas.MultiIndex.to_frame:
import pandas as pd
d = {
'column_1': range(10),
'column_2': range(3),
'column_3': ['foo']
}
df = pd.MultiIndex.from_product(d.values(), names=d.keys()).to_frame(index=False)
Output
>>> df
column_1 column_2 column_3
0 0 0 foo
1 0 1 foo
2 0 2 foo
3 1 0 foo
4 1 1 foo
5 1 2 foo
6 2 0 foo
7 2 1 foo
8 2 2 foo
9 3 0 foo
10 3 1 foo
11 3 2 foo
12 4 0 foo
13 4 1 foo
14 4 2 foo
15 5 0 foo
16 5 1 foo
17 5 2 foo
18 6 0 foo
19 6 1 foo
20 6 2 foo
21 7 0 foo
22 7 1 foo
23 7 2 foo
24 8 0 foo
25 8 1 foo
26 8 2 foo
27 9 0 foo
28 9 1 foo
29 9 2 foo
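For readers who want to see the two steps separately, here is a minimal sketch of the same approach wrapped in a function. The name cartesian_frame is hypothetical (it is not part of pandas), and the values and keys are converted to lists only to be conservative across pandas versions.

```python
import pandas as pd


def cartesian_frame(columns):
    """Hypothetical helper: one row per combination of the dict's values."""
    # Step 1: build a MultiIndex holding every combination;
    # the dictionary keys become the names of the index levels.
    index = pd.MultiIndex.from_product(
        list(columns.values()), names=list(columns.keys())
    )
    # Step 2: turn each level into an ordinary column.
    # index=False gives the result a plain 0..n-1 row index.
    return index.to_frame(index=False)


d = {
    'column_1': range(10),
    'column_2': range(3),
    'column_3': ['foo'],
}
df = cartesian_frame(d)
print(df.shape)
```

The number of rows is the product of the input lengths (here 10 * 3 * 1), so this can grow quickly when many long iterables are combined.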
Upvotes: 1
Reputation: 180441
Not overly familiar with pandas but this may work:
import pandas as pd
from collections import OrderedDict
from itertools import product

d = {'column_1': range(10),
     'column_2': range(3),
     'column_3': ['foo']}

# Sort by key so the column order is deterministic
od = OrderedDict(sorted(d.items()))
# One tuple per combination of the values
cart = list(product(*od.values()))
df = pd.DataFrame(cart, columns=list(od.keys()))
print(df)
column_1 column_2 column_3
0 0 0 foo
1 0 1 foo
2 0 2 foo
3 1 0 foo
4 1 1 foo
5 1 2 foo
6 2 0 foo
7 2 1 foo
8 2 2 foo
9 3 0 foo
10 3 1 foo
11 3 2 foo
12 4 0 foo
13 4 1 foo
14 4 2 foo
15 5 0 foo
16 5 1 foo
17 5 2 foo
18 6 0 foo
19 6 1 foo
20 6 2 foo
21 7 0 foo
22 7 1 foo
23 7 2 foo
24 8 0 foo
25 8 1 foo
26 8 2 foo
27 9 0 foo
28 9 1 foo
29 9 2 foo
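Note that sorting the items orders the columns alphabetically by key. If you would rather keep the order in which the keys were written, a variant without OrderedDict might look like the sketch below. It assumes Python 3.7 or later, where a plain dict preserves insertion order.

```python
from itertools import product

import pandas as pd

d = {'column_1': range(10),
     'column_2': range(3),
     'column_3': ['foo']}

# Assumes Python 3.7+: d.values() and list(d) follow insertion order,
# so each tuple lines up with the column names.
# product() varies the last iterable fastest.
cart = list(product(*d.values()))
df = pd.DataFrame(cart, columns=list(d))
print(len(df))
```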
Upvotes: 3