GAO

Reputation: 1

How to split a pandas DataFrame into one DataFrame per day

I have a large DataFrame with a datetime index (for example, 2022-09-30 15:45:00). There is one row of data per minute:

row1  2022-09-10 10:05      data1 data2 data3
row2  2022-09-10 10:06      data4 data5 data6    etc...

The index runs from “2022-09-01” to “2022-10-31”, for example; it could span several weeks, several months, or even a few years (2 or 3).

I want to split this DataFrame into several smaller DataFrames, such that:

- each one contains only data from a single day;
- the name of each new DataFrame includes the date, for example XXX-2022-09-15;
- each one holds only part of that day's data, for example from 9:45 to 10:30, with the same time range for all the new DataFrames.

I tried using DatetimeIndex, but I had trouble understanding how it works.

Upvotes: 0

Views: 56

Answers (1)

Pierre D

Reputation: 26321

Edit: the result is now a dict, and each day is truncated to the requested time window.

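import pandas as pd

# offsets from midnight bounding the window to keep for each day
t0, t1 = pd.Timedelta('09:45:00'), pd.Timedelta('10:30:00')

# one entry per calendar day, keyed by 'YYYY-MM-DD';
# truncate keeps rows from day+t0 to day+t1 inclusive and needs a sorted index
byday = {
    f'{day:%Y-%m-%d}': d.truncate(before=day+t0, after=day+t1)
    for day, d in df.groupby(pd.Grouper(freq='D'))
}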

Example

import numpy as np
import pandas as pd

ix = pd.date_range(
    # note: partial day at the beginning
    '2022-09-10 01:44:00', '2023-01-01', freq='min', inclusive='left')
# three columns of random values, one row per minute
df = pd.DataFrame(
    np.random.uniform(0, 1, (len(ix), 3)), columns=list('abc'),
    index=ix)

# code above

>>> list(byday)[0]
'2022-09-10'

>>> list(byday)[-1]
'2022-12-31'

>>> byday['2022-09-10']
                            a         b         c
2022-09-10 09:45:00  0.247076  0.687310  0.597638
2022-09-10 09:46:00  0.307722  0.753229  0.329068
2022-09-10 09:47:00  0.865848  0.075505  0.268435
...                       ...       ...       ...
2022-09-10 10:28:00  0.383779  0.523062  0.622288
2022-09-10 10:29:00  0.633321  0.105336  0.570100
2022-09-10 10:30:00  0.123475  0.044391  0.802064

>>> byday['2022-12-31']
                            a         b         c
2022-12-31 09:45:00  0.189360  0.812205  0.466228
2022-12-31 09:46:00  0.471459  0.490481  0.903464
2022-12-31 09:47:00  0.279801  0.885283  0.275511
...                       ...       ...       ...
2022-12-31 10:28:00  0.558043  0.692632  0.122300
2022-12-31 10:29:00  0.034136  0.037672  0.020361
2022-12-31 10:30:00  0.205017  0.721944  0.030551
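
As a possible alternative, here is a minimal sketch that selects the time window first with DataFrame.between_time and then groups by calendar day. The helper name split_by_day and its prefix parameter are made up for illustration (the prefix stands in for the "XXX-" part of the names in the question); it assumes the index is a DatetimeIndex and a pandas version recent enough for inclusive='left' in date_range.

```python
import numpy as np
import pandas as pd


def split_by_day(df, start="09:45", end="10:30", prefix=""):
    """Return {prefix + 'YYYY-MM-DD': sub-frame} limited to the daily window.

    Hypothetical helper, not part of pandas.
    """
    # between_time keeps rows whose time of day lies in [start, end],
    # both ends included by default
    window = df.between_time(start, end)
    # normalize() floors each timestamp to midnight, giving one key per day;
    # only days that have rows inside the window produce an entry
    return {
        f"{prefix}{day:%Y-%m-%d}": part
        for day, part in window.groupby(window.index.normalize())
    }


# Synthetic demo data: one row per minute over three full days
ix = pd.date_range("2022-09-10", "2022-09-13", freq="min", inclusive="left")
df = pd.DataFrame(
    np.random.uniform(0, 1, (len(ix), 3)), columns=list("abc"), index=ix)

byday = split_by_day(df, prefix="XXX-")
print(list(byday))
```

With this variant the keys look like 'XXX-2022-09-10', and a day with no rows between 9:45 and 10:30 is simply absent from the dict. Which approach is more convenient probably depends on whether you want such days to show up at all.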

Upvotes: 1
