Reputation: 5594
I have a DataFrame with two indexes; it looks like this:
>>> by_hour
pr da delta delta_sq
node timestamputc
A 1 20.540423 21.093659 0.553237 9.869976
B 1 17.675580 18.183104 0.507524 11.474762
C 1 16.257307 16.961944 0.704638 68.023460
... ... ... ... ...
X 24 20.649155 20.805145 0.155990 43.176084
Y 24 20.677271 21.183925 0.506655 47.746125
Z 24 21.455556 21.725556 0.270000 39.393092
[60312 rows x 4 columns]
I have another DataFrame with a single index, identical to the level 0 index of by_hour
:
>>> nodes
type
node
A type 1
B type 1
C type 2
... ...
X type 3
Y type 1
Z type 2
[2513 rows x 1 columns]
I would like to group the first DataFrame by the "type" column of the second DataFrame, while preserving the level 1 index, to get an output like this:
pr da delta delta_sq
type timestamputc
type 1 1
2
...
type 2 1
2
...
type n 1
...
24
How can I do this? Is it possible without creating an intermediate DataFrame?
Upvotes: 1
Views: 537
Reputation:
This aggregates the DataFrame by type and timestamp:
node_type = nodes.loc[by_hour.index.get_level_values('node'), 'type'].values
timestamp = by_hour.index.get_level_values('timestamputc')
by_hour.groupby([node_type, timestamp]).sum()
pr da delta delta_sq
timestamputc
type 1 1 38.216003 39.276763 1.060761 21.344738
24 20.677271 21.183925 0.506655 47.746125
type 2 1 16.257307 16.961944 0.704638 68.023460
24 21.455556 21.725556 0.270000 39.393092
type 3 24 20.649155 20.805145 0.155990 43.176084
Upvotes: 5