Reputation: 4842
I have a df
:
df = pd.DataFrame.from_dict({('group', ''): {0: 'A',
1: 'A',
2: 'A',
3: 'A',
4: 'A',
5: 'A',
6: 'A',
7: 'A',
8: 'A',
9: 'B',
10: 'B',
11: 'B',
12: 'B',
13: 'B',
14: 'B',
15: 'B',
16: 'B',
17: 'B',
18: 'all',
19: 'all'},
('category', ''): {0: 'Amazon',
1: 'Apple',
2: 'Facebook',
3: 'Google',
4: 'Netflix',
5: 'Tesla',
6: 'Total',
7: 'Uber',
8: 'total',
9: 'Amazon',
10: 'Apple',
11: 'Facebook',
12: 'Google',
13: 'Netflix',
14: 'Tesla',
15: 'Total',
16: 'Uber',
17: 'total',
18: 'Total',
19: 'total'},
(pd.Timestamp('2020-06-29 00:00:00'), 'last_sales'): {0: 195.0,
1: 61.0,
2: 106.0,
3: 61.0,
4: 37.0,
5: 13.0,
6: 954.0,
7: 4.0,
8: 477.0,
9: 50.0,
10: 50.0,
11: 75.0,
12: 43.0,
13: 17.0,
14: 14.0,
15: 504.0,
16: 3.0,
17: 252.0,
18: 2916.0,
19: 2916.0},
(pd.Timestamp('2020-06-29 00:00:00'), 'sales'): {0: 1268.85,
1: 18274.385000000002,
2: 19722.65,
3: 55547.255,
4: 15323.800000000001,
5: 1688.6749999999997,
6: 227463.23,
7: 1906.0,
8: 113731.615,
9: 3219.6499999999996,
10: 15852.060000000001,
11: 17743.7,
12: 37795.15,
13: 5918.5,
14: 1708.75,
15: 166349.64,
16: 937.01,
17: 83174.82,
18: 787625.7400000001,
19: 787625.7400000001},
(pd.Timestamp('2020-06-29 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2020-07-06 00:00:00'), 'last_sales'): {0: 26.0,
1: 39.0,
2: 79.0,
3: 49.0,
4: 10.0,
5: 10.0,
6: 436.0,
7: 5.0,
8: 218.0,
9: 89.0,
10: 34.0,
11: 133.0,
12: 66.0,
13: 21.0,
14: 20.0,
15: 732.0,
16: 3.0,
17: 366.0,
18: 2336.0,
19: 2336.0},
(pd.Timestamp('2020-07-06 00:00:00'), 'sales'): {0: 3978.15,
1: 12138.96,
2: 19084.175,
3: 40033.46000000001,
4: 4280.15,
5: 1495.1,
6: 165548.29,
7: 1764.15,
8: 82774.145,
9: 8314.92,
10: 12776.649999999996,
11: 28048.075,
12: 55104.21000000002,
13: 6962.844999999999,
14: 3053.2000000000003,
15: 231049.11000000002,
16: 1264.655,
17: 115524.55500000001,
18: 793194.8000000002,
19: 793194.8000000002},
(pd.Timestamp('2020-07-06 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2021-06-28 00:00:00'), 'last_sales'): {0: 96.0,
1: 56.0,
2: 106.0,
3: 44.0,
4: 34.0,
5: 13.0,
6: 716.0,
7: 9.0,
8: 358.0,
9: 101.0,
10: 22.0,
11: 120.0,
12: 40.0,
13: 13.0,
14: 8.0,
15: 610.0,
16: 1.0,
17: 305.0,
18: 2652.0,
19: 2652.0},
(pd.Timestamp('2021-06-28 00:00:00'), 'sales'): {0: 5194.95,
1: 19102.219999999994,
2: 22796.420000000002,
3: 30853.115,
4: 11461.25,
5: 992.6,
6: 188143.41,
7: 3671.15,
8: 94071.705,
9: 6022.299999999998,
10: 7373.6,
11: 33514.0,
12: 35943.45,
13: 4749.000000000001,
14: 902.01,
15: 177707.32,
16: 349.3,
17: 88853.66,
18: 731701.46,
19: 731701.46},
(pd.Timestamp('2021-06-28 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0},
(pd.Timestamp('2021-07-07 00:00:00'), 'last_sales'): {0: 45.0,
1: 47.0,
2: 87.0,
3: 45.0,
4: 13.0,
5: 8.0,
6: 494.0,
7: 2.0,
8: 247.0,
9: 81.0,
10: 36.0,
11: 143.0,
12: 56.0,
13: 9.0,
14: 9.0,
15: 670.0,
16: 1.0,
17: 335.0,
18: 2328.0,
19: 2328.0},
(pd.Timestamp('2021-07-07 00:00:00'), 'sales'): {0: 7556.414999999998,
1: 14985.05,
2: 16790.899999999998,
3: 36202.729999999996,
4: 4024.97,
5: 1034.45,
6: 163960.32999999996,
7: 1385.65,
8: 81980.16499999998,
9: 5600.544999999999,
10: 11209.92,
11: 32832.61,
12: 42137.44500000001,
13: 3885.1499999999996,
14: 1191.5,
15: 194912.34000000003,
16: 599.0,
17: 97456.17000000001,
18: 717745.3400000001,
19: 717745.3400000001},
(pd.Timestamp('2021-07-07 00:00:00'), 'difference'): {0: 0.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0,
10: 0.0,
11: 0.0,
12: 0.0,
13: 0.0,
14: 0.0,
15: 0.0,
16: 0.0,
17: 0.0,
18: 0.0,
19: 0.0}}).set_index(['group','category'])
I am trying to create a level 2 index
called combined
which would be the sum of sales & last_sales
of all categories
except for Facebook
and total
/ Total
.
So that the df
would look like this:
I tried doing it with .loc
but with no success:
s = df_out.stack(0)
s['combined'] = 0
s.loc[(slice(None),[x for x in s.loc[(slice(None),:) if x != 'Facebook']].sum()
Upvotes: 5
Views: 130
Reputation: 260835
You can group on level 0, and concat
the sum per group if the name
is not "all", while adding indexes. Then combine this output with the original data and sort_index
by level=0
:
pd.concat([df,
pd.concat({(name, 'combined'): d.drop(['Facebook', 'total', 'Total'], level=1).sum()
for name,d in df.groupby(level=0) if name != 'all'}, axis=1).T
]).sort_index(level=0)
output:
2020-06-29 00:00:00 2020-07-06 00:00:00 2021-06-28 00:00:00 2021-07-07 00:00:00
last_sales sales difference last_sales sales difference last_sales sales difference last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0 26.0 3978.150 0.0 96.0 5194.950 0.0 45.0 7556.415 0.0
Apple 61.0 18274.385 0.0 39.0 12138.960 0.0 56.0 19102.220 0.0 47.0 14985.050 0.0
Facebook 106.0 19722.650 0.0 79.0 19084.175 0.0 106.0 22796.420 0.0 87.0 16790.900 0.0
Google 61.0 55547.255 0.0 49.0 40033.460 0.0 44.0 30853.115 0.0 45.0 36202.730 0.0
Netflix 37.0 15323.800 0.0 10.0 4280.150 0.0 34.0 11461.250 0.0 13.0 4024.970 0.0
Tesla 13.0 1688.675 0.0 10.0 1495.100 0.0 13.0 992.600 0.0 8.0 1034.450 0.0
Total 954.0 227463.230 0.0 436.0 165548.290 0.0 716.0 188143.410 0.0 494.0 163960.330 0.0
Uber 4.0 1906.000 0.0 5.0 1764.150 0.0 9.0 3671.150 0.0 2.0 1385.650 0.0
combined 371.0 94008.965 0.0 139.0 63689.970 0.0 252.0 71275.285 0.0 160.0 65189.265 0.0
total 477.0 113731.615 0.0 218.0 82774.145 0.0 358.0 94071.705 0.0 247.0 81980.165 0.0
B Amazon 50.0 3219.650 0.0 89.0 8314.920 0.0 101.0 6022.300 0.0 81.0 5600.545 0.0
Apple 50.0 15852.060 0.0 34.0 12776.650 0.0 22.0 7373.600 0.0 36.0 11209.920 0.0
Facebook 75.0 17743.700 0.0 133.0 28048.075 0.0 120.0 33514.000 0.0 143.0 32832.610 0.0
Google 43.0 37795.150 0.0 66.0 55104.210 0.0 40.0 35943.450 0.0 56.0 42137.445 0.0
Netflix 17.0 5918.500 0.0 21.0 6962.845 0.0 13.0 4749.000 0.0 9.0 3885.150 0.0
Tesla 14.0 1708.750 0.0 20.0 3053.200 0.0 8.0 902.010 0.0 9.0 1191.500 0.0
Total 504.0 166349.640 0.0 732.0 231049.110 0.0 610.0 177707.320 0.0 670.0 194912.340 0.0
Uber 3.0 937.010 0.0 3.0 1264.655 0.0 1.0 349.300 0.0 1.0 599.000 0.0
combined 177.0 65431.120 0.0 233.0 87476.480 0.0 185.0 55339.660 0.0 192.0 64623.560 0.0
total 252.0 83174.820 0.0 366.0 115524.555 0.0 305.0 88853.660 0.0 335.0 97456.170 0.0
all Total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
Upvotes: 0
Reputation: 71689
all
in level=0
, similarly drop
the other unwanted level values in level=1
sum
on level=0
to aggregate the frameMultindex
to add the additional level combined
in aggregated frames = df.drop('all').drop(['Facebook', 'total', 'Total'], level=1).sum(level=0)
s.index = pd.MultiIndex.from_product([s.index, ['combined']])
df_out = df.append(s).sort_index()
2020-06-29 00:00:00 2020-07-06 00:00:00 2021-06-28 00:00:00 2021-07-07 00:00:00
last_sales sales difference last_sales sales difference last_sales sales difference last_sales sales difference
group category
A Amazon 195.0 1268.850 0.0 26.0 3978.150 0.0 96.0 5194.950 0.0 45.0 7556.415 0.0
Apple 61.0 18274.385 0.0 39.0 12138.960 0.0 56.0 19102.220 0.0 47.0 14985.050 0.0
Facebook 106.0 19722.650 0.0 79.0 19084.175 0.0 106.0 22796.420 0.0 87.0 16790.900 0.0
Google 61.0 55547.255 0.0 49.0 40033.460 0.0 44.0 30853.115 0.0 45.0 36202.730 0.0
Netflix 37.0 15323.800 0.0 10.0 4280.150 0.0 34.0 11461.250 0.0 13.0 4024.970 0.0
Tesla 13.0 1688.675 0.0 10.0 1495.100 0.0 13.0 992.600 0.0 8.0 1034.450 0.0
Total 954.0 227463.230 0.0 436.0 165548.290 0.0 716.0 188143.410 0.0 494.0 163960.330 0.0
Uber 4.0 1906.000 0.0 5.0 1764.150 0.0 9.0 3671.150 0.0 2.0 1385.650 0.0
combined 371.0 94008.965 0.0 139.0 63689.970 0.0 252.0 71275.285 0.0 160.0 65189.265 0.0
total 477.0 113731.615 0.0 218.0 82774.145 0.0 358.0 94071.705 0.0 247.0 81980.165 0.0
B Amazon 50.0 3219.650 0.0 89.0 8314.920 0.0 101.0 6022.300 0.0 81.0 5600.545 0.0
Apple 50.0 15852.060 0.0 34.0 12776.650 0.0 22.0 7373.600 0.0 36.0 11209.920 0.0
Facebook 75.0 17743.700 0.0 133.0 28048.075 0.0 120.0 33514.000 0.0 143.0 32832.610 0.0
Google 43.0 37795.150 0.0 66.0 55104.210 0.0 40.0 35943.450 0.0 56.0 42137.445 0.0
Netflix 17.0 5918.500 0.0 21.0 6962.845 0.0 13.0 4749.000 0.0 9.0 3885.150 0.0
Tesla 14.0 1708.750 0.0 20.0 3053.200 0.0 8.0 902.010 0.0 9.0 1191.500 0.0
Total 504.0 166349.640 0.0 732.0 231049.110 0.0 610.0 177707.320 0.0 670.0 194912.340 0.0
Uber 3.0 937.010 0.0 3.0 1264.655 0.0 1.0 349.300 0.0 1.0 599.000 0.0
combined 177.0 65431.120 0.0 233.0 87476.480 0.0 185.0 55339.660 0.0 192.0 64623.560 0.0
total 252.0 83174.820 0.0 366.0 115524.555 0.0 305.0 88853.660 0.0 335.0 97456.170 0.0
all Total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
total 2916.0 787625.740 0.0 2336.0 793194.800 0.0 2652.0 731701.460 0.0 2328.0 717745.340 0.0
Upvotes: 4