OliverGras
OliverGras

Reputation: 48

Pandas - MultiIndex resample - I don´t want to lose informations from other indexes

I´ve got the following DataFrame:

                        value
A   B
111 2024-03-22 00:00:00 1
111 2024-03-22 01:00:00 2
111 2024-03-22 02:00:00 3
222 2024-03-22 00:00:00 4
222 2024-03-22 01:00:00 5
222 2024-03-22 02:00:00 6

Now I want to resample and sum index B to days and would expect the following result:

                        value
A   B
111 2024-03-22 00:00:00 6
222 2024-03-22 00:00:00 15

How can I achieve something like that?

Another Example would be the following:

                        value
A   B
111 2024-03-22 00:00:00 1
111 2024-03-22 01:00:00 2
111 2024-03-22 02:00:00 3
222 2024-03-22 00:00:00 4
222 2024-03-22 01:00:00 5
222 2024-03-22 02:00:00 6
333 2024-03-22 05:00:00 7

Of which I want the following result with resampling by 1h:

                        value
A   B
111 2024-03-22 00:00:00 1
111 2024-03-22 01:00:00 2
111 2024-03-22 02:00:00 3
111 2024-03-22 03:00:00 0
111 2024-03-22 04:00:00 0
111 2024-03-22 05:00:00 0
222 2024-03-22 00:00:00 4
222 2024-03-22 01:00:00 5
222 2024-03-22 02:00:00 6
222 2024-03-22 03:00:00 0
222 2024-03-22 04:00:00 0
222 2024-03-22 05:00:00 0
333 2024-03-22 00:00:00 0
333 2024-03-22 01:00:00 0
333 2024-03-22 02:00:00 0
333 2024-03-22 03:00:00 0
333 2024-03-22 04:00:00 0
333 2024-03-22 05:00:00 7

Pandas Version: 2.0.1

I tried using level on resample but that way I lose Index A.

I have the same issue when I have two timestamps in the index and want one to be resampled to days and the other to hours.

I´ve looked at other answers of related questions here but couldn´t find a way to get them working for me.

I´m confusing myself already and it could be that I just can´t see the correct way to achieve this. :)

If you need anymore information please let me know.

Thanks in advance.

Regards Oliver

Upvotes: 1

Views: 60

Answers (1)

James
James

Reputation: 36781

You need to groupby before you resample to preserve the A index.

import pandas as pd

df = pd.DataFrame.from_dict({'value': 
 {(111, pd.Timestamp('2024-03-22 00:00:00')): 1,
  (111, pd.Timestamp('2024-03-22 01:00:00')): 2,
  (111, pd.Timestamp('2024-03-22 02:00:00')): 3,
  (222, pd.Timestamp('2024-03-22 00:00:00')): 4,
  (222, pd.Timestamp('2024-03-22 01:00:00')): 5,
  (222, pd.Timestamp('2024-03-22 02:00:00')): 6}}
)

df.groupby(level=0).resample('d', level=1).sum()
# returns:
                value
A   B
111 2024-03-22      6
222 2024-03-22     15

Upvotes: 3

Related Questions