maynull
maynull

Reputation: 2046

How to extract rows between 2 times with Pandas?

I want to make sub dataframes out of one dataframe, using its datetime index. For example, if I want to extract rows between 07:00~06:00 and make new dataframes:

import pandas as pd

int_rows  = 24
str_freq  = '180min'

i = pd.date_range('2018-04-09', periods=int_rows, freq=str_freq)

df = pd.DataFrame({'A': [i for i in range(int_rows)]}, index=i)

>>> df
                      A
2018-04-09 00:00:00   0
2018-04-09 03:00:00   1
2018-04-09 06:00:00   2
2018-04-09 09:00:00   3
2018-04-09 12:00:00   4
2018-04-09 15:00:00   5
2018-04-09 18:00:00   6
2018-04-09 21:00:00   7
2018-04-10 00:00:00   8
2018-04-10 03:00:00   9
2018-04-10 06:00:00  10
2018-04-10 09:00:00  11
2018-04-10 12:00:00  12
2018-04-10 15:00:00  13
2018-04-10 18:00:00  14
2018-04-10 21:00:00  15
2018-04-11 00:00:00  16
2018-04-11 03:00:00  17
2018-04-11 06:00:00  18
2018-04-11 09:00:00  19
2018-04-11 12:00:00  20
2018-04-11 15:00:00  21
2018-04-11 18:00:00  22
2018-04-11 21:00:00  23
# new dataframes that I want

                      A
2018-04-09 00:00:00   0
2018-04-09 03:00:00   1

                      A
2018-04-09 06:00:00   2
2018-04-09 09:00:00   3
2018-04-09 12:00:00   4
2018-04-09 15:00:00   5
2018-04-09 18:00:00   6
2018-04-09 21:00:00   7
2018-04-10 00:00:00   8
2018-04-10 03:00:00   9

                      A
2018-04-10 06:00:00  10
2018-04-10 09:00:00  11
2018-04-10 12:00:00  12
2018-04-10 15:00:00  13
2018-04-10 18:00:00  14
2018-04-10 21:00:00  15
2018-04-11 00:00:00  16
2018-04-11 03:00:00  17

                      A
2018-04-11 06:00:00  18
2018-04-11 09:00:00  19
2018-04-11 12:00:00  20
2018-04-11 15:00:00  21
2018-04-11 18:00:00  22
2018-04-11 21:00:00  23

I found between_time method, but it doesn't care about dates. I could iterate over the original dataframe and check each date and time, but I think it's going to be inefficient. Are there any simple ways to do this?

Upvotes: 1

Views: 70

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150805

You can 'shift' the timestamp by 6 hours and group by day:

for k, d in df.groupby((df.index - pd.to_timedelta('6:00:00')).normalize()):
    print(d); print()

Output:

                     A
2018-04-09 00:00:00  0
2018-04-09 03:00:00  1

                     A
2018-04-09 06:00:00  2
2018-04-09 09:00:00  3
2018-04-09 12:00:00  4
2018-04-09 15:00:00  5
2018-04-09 18:00:00  6
2018-04-09 21:00:00  7
2018-04-10 00:00:00  8
2018-04-10 03:00:00  9

                      A
2018-04-10 06:00:00  10
2018-04-10 09:00:00  11
2018-04-10 12:00:00  12
2018-04-10 15:00:00  13
2018-04-10 18:00:00  14
2018-04-10 21:00:00  15
2018-04-11 00:00:00  16
2018-04-11 03:00:00  17

                      A
2018-04-11 06:00:00  18
2018-04-11 09:00:00  19
2018-04-11 12:00:00  20
2018-04-11 15:00:00  21
2018-04-11 18:00:00  22
2018-04-11 21:00:00  23

Upvotes: 1

Related Questions