Reputation: 73
I have a DataFrame like this:
| date | dimension A| dimension B| dimension C| dimension D| counts |
+----------+------------+------------+------------+------------+------------+
| 1-2-2001 | a1 | b1 | c1 | d1 | 52 |
| 1-1-2001 | a2 | b2 | c2 | d2 | 33 |
| 1-2-2001 | a3 | b3 | c3 | d3 | 41 |
| 1-1-2001 | a4 | b4 | c4 | d4 | 19 |
What I want is for Python to run df.groupby automatically for each combination of two dimensions and create a new DataFrame from every result, i.e. the following:
df1 = df.groupby(['date', 'dimension A']).sum()
df2 = df.groupby(['date', 'dimension B']).sum()
...
df5 = df.groupby(['dimension A', 'dimension B']).sum()
...
df10 = df.groupby(['dimension C', 'dimension D']).sum()
What should I do?
Upvotes: 0
Views: 46
Reputation: 17824
You can use the function combinations from itertools to generate the different column combinations. Then you can add the GroupBy objects or DataFrames to a list (or dictionary):
from itertools import combinations
dims = df.columns.drop('counts')  # pair up only the dimension columns, not the value column
dfs = []
for i, j in combinations(dims, 2):
    dfs.append(df.groupby([i, j]))  # or df.groupby([i, j])['counts'].sum()
You can also use a list (or dict) comprehension:
[df.groupby([i, j]) for i, j in combinations(df.columns.drop('counts'), 2)]
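For the dictionary variant, here is a minimal sketch that keys each result by its column pair, so it can be looked up by name instead of by position. It assumes the sample data from the question, and the names dims and results are only illustrative. It also assumes that counts is the only value column and should be left out of the grouping keys.

```python
from itertools import combinations

import pandas as pd

# Sample data mirroring the table in the question.
df = pd.DataFrame({
    "date": ["1-2-2001", "1-1-2001", "1-2-2001", "1-1-2001"],
    "dimension A": ["a1", "a2", "a3", "a4"],
    "dimension B": ["b1", "b2", "b3", "b4"],
    "dimension C": ["c1", "c2", "c3", "c4"],
    "dimension D": ["d1", "d2", "d3", "d4"],
    "counts": [52, 33, 41, 19],
})

# Assumption: "counts" is the value being summed, so it is not a grouping key.
dims = [col for col in df.columns if col != "counts"]

# One summed DataFrame per pair of columns, keyed by the pair itself.
results = {
    (i, j): df.groupby([i, j], as_index=False)["counts"].sum()
    for i, j in combinations(dims, 2)
}

print(len(results))  # 5 grouping columns give 10 pairs
print(results[("date", "dimension A")])
```

Selecting ["counts"] before calling sum() keeps the remaining string columns out of the aggregation; each result then holds just the two grouping columns and the summed counts.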
Upvotes: 4