Tim Robinson

Reputation: 54724

pandas.interval_range for partial interval

I'm using pd.interval_range to generate hourly intervals within a pair of timestamps:

In [1]: list(pd.interval_range(pd.Timestamp('2019-02-06 07:00:00'), 
                               pd.Timestamp('2019-02-06 08:00:00'), freq='h'))
Out[1]: [Interval('2019-02-06 07:00:00', '2019-02-06 08:00:00', closed='right')]

Is it possible to generate an interval shorter than 1 hour when the end time does not fall on an hour boundary?

In other words, when I move the end time forward by 1 minute, the partial hour is dropped and I get this:

In [2]: list(pd.interval_range(pd.Timestamp('2019-02-06 07:00:00'), 
                               pd.Timestamp('2019-02-06 08:01:00'), freq='h'))
Out[2]: [Interval('2019-02-06 07:00:00', '2019-02-06 08:00:00', closed='right')]

I'd like to get this instead:

In [2]: list(pd.interval_range(pd.Timestamp('2019-02-06 07:00:00'), 
                               pd.Timestamp('2019-02-06 08:01:00'), freq='h'))
Out[2]: [Interval('2019-02-06 07:00:00', '2019-02-06 08:00:00', closed='right'),
         Interval('2019-02-06 08:00:00', '2019-02-06 08:01:00', closed='right')]

Upvotes: 2

Views: 1388

Answers (3)

Tim Robinson

Reputation: 54724

Based on Scott's suggestion, here is my solution, which puts short (partial) stubs at the start and end of the schedule when the timestamps do not fall on a frequency boundary:

import pandas as pd

def interval_range_with_partial_hour(start_time, end_time, freq, closed='right'):
    # An empty span has no intervals.
    if start_time == end_time:
        return pd.IntervalIndex.from_arrays(left=[], right=[], closed=closed)

    # Build full intervals on the freq grid that covers [start_time, end_time].
    index = pd.interval_range(start_time.floor(freq), end_time.ceil(freq), freq=freq, closed=closed)
    assert len(index) > 0

    left, right = index.left.to_series().tolist(), index.right.to_series().tolist()
    assert left[0] <= start_time
    assert right[-1] >= end_time

    # Trim the first and last intervals back to the requested bounds.
    left[0] = start_time
    right[-1] = end_time
    return pd.IntervalIndex.from_arrays(left=left, right=right, closed=index.closed)
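
To illustrate the floor/ceil step this function relies on, here is a small sketch with timestamps chosen for illustration (they are not from the question). Snapping both ends outward to the hour grid gives full hourly intervals that cover the requested span, and the function then trims the first and last of them back to the original bounds:

```python
import pandas as pd

# Illustrative timestamps; neither falls on an hour boundary.
start = pd.Timestamp("2019-02-06 07:10:00")
end = pd.Timestamp("2019-02-06 09:01:00")

# floor() snaps down to the hour, ceil() snaps up to the next hour.
grid = pd.interval_range(start.floor("h"), end.ceil("h"), freq="h")

# The grid starts at 07:00 and ends at 10:00, so it fully covers [start, end].
print(grid.left[0], grid.right[-1], len(grid))
```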

Upvotes: 3

Scott Boston

Reputation: 153460

Try:

start = pd.Timestamp('2019-02-06 07:00:00')
end = pd.Timestamp('2019-02-06 09:01:00')

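# Full hourly intervals only; the trailing partial hour is dropped here.
interval_1 = pd.interval_range(start, end, freq='h')

# Append one extra interval running from the last full boundary to `end`.
interval_out = pd.IntervalIndex.from_arrays(
    interval_1.left.to_series().tolist() + [interval_1.right[-1]],
    interval_1.right.to_series().tolist() + [end])
interval_out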

Output:

IntervalIndex([(2019-02-06 07:00:00, 2019-02-06 08:00:00], (2019-02-06 08:00:00, 2019-02-06 09:00:00], (2019-02-06 09:00:00, 2019-02-06 09:01:00]]
              closed='right',
              dtype='interval[datetime64[ns]]')
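
Note that the snippet in this answer appends the stub unconditionally, so an end time that falls exactly on an hour boundary would produce a zero-width final interval. A possible variant, sketched here under the assumption that intervals should stay anchored at the start time, builds the break points with pd.date_range and adds the end time only when it is not already a break. The function name hourly_intervals_with_stub is made up for this example:

```python
import pandas as pd

def hourly_intervals_with_stub(start, end, freq="h", closed="right"):
    """Hypothetical helper: regular intervals plus a final partial one."""
    # Regular break points: start, start + freq, ... (never past `end`).
    breaks = pd.date_range(start, end, freq=freq)
    # Add `end` as the last break only if it is not one already,
    # which avoids creating a zero-width interval.
    if len(breaks) == 0 or breaks[-1] != end:
        breaks = breaks.append(pd.DatetimeIndex([end]))
    return pd.IntervalIndex.from_breaks(breaks, closed=closed)

idx = hourly_intervals_with_stub(pd.Timestamp("2019-02-06 07:00:00"),
                                 pd.Timestamp("2019-02-06 08:01:00"))
print(list(idx))
```

With the timestamps from the question this should give one full hour followed by a one-minute interval, while an end time of exactly 08:00 gives a single interval.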

Upvotes: 2

RubenStrenzke

Reputation: 85

You could work out beforehand how much time is left over after the last full interval. If you are working with hourly intervals and want the leftover in seconds, for example, you could compute:

delta = pd.Timestamp('2019-02-06 08:03:00') - pd.Timestamp('2019-02-06 07:00:00')
delta.seconds % 3600

In this case the result is 180, so you know that 180 seconds remain after the last full hour, and you can handle that remainder explicitly, for example by appending one additional, shorter interval to your list.
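
As a rough sketch of that idea (the variable names are illustrative, and this is only one way to do it), the span can be split into whole hours and a leftover with Timedelta floor division and modulo, and the leftover appended as a final shorter interval:

```python
import pandas as pd

start = pd.Timestamp("2019-02-06 07:00:00")
end = pd.Timestamp("2019-02-06 08:03:00")
step = pd.Timedelta(hours=1)

# Number of whole steps in the span, and what is left after them.
n_full = int((end - start) // step)
leftover = (end - start) % step

# Full-length intervals first, counted from `start`.
intervals = list(pd.interval_range(start, periods=n_full, freq=step))

# Append the shorter trailing interval only if something is left over.
if leftover > pd.Timedelta(0):
    last_full_end = start + n_full * step
    intervals.append(pd.Interval(last_full_end, end, closed="right"))

print(leftover.total_seconds(), intervals)
```

Using the full Timedelta rather than delta.seconds keeps the arithmetic valid for step sizes that do not divide a day evenly.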

Upvotes: 0
