rhaskett

Reputation: 1962

Pandas Approximate Frequency Per Year of a DateTimeIndex

I have multiple time series in different files, and I know that Pandas can infer the frequency of the DatetimeIndex for each:

pd.infer_freq(data.index)

Is there a programmatic way to get the approximate number of periods per year from an arbitrary frequency? For instance:

'M' -> 12
'BM' -> 12
'B' -> 252
'D' -> 365

Upvotes: 2

Views: 396

Answers (1)

ALollz

Reputation: 59549

Here's one alternative. We create a date_range with the given frequency, then group by year and take the most common number of periods that fall within a year. The periods argument should be large enough that, at the given frequency, the date range spans many years. You really shouldn't need to change it unless you want ns or something similarly small (but for those it is more efficient to just calculate the value manually).

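import pandas as pd

def infer_periods_in_year(freq, periods=10**4):
    """
    freq : str, pandas frequency alias.
    periods : int, should be large enough that freq spans many years.
    """

    while True:
        try:
            s = pd.Series(data=pd.date_range('1970-01-01', freq=freq, periods=periods))
            break
        # If periods is too large, shrink it and retry
        except (pd.errors.OutOfBoundsDatetime, OverflowError, ValueError):
            if periods < 10:
                raise  # not a size problem (e.g. an invalid alias), so stop retrying
            periods //= 10  # integer division keeps periods an int

    # Count rows per calendar year, then return the most common count
    return s.groupby(s.dt.year).size().value_counts().index[0]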

infer_periods_in_year('D')
#365
infer_periods_in_year('BM')
#12
infer_periods_in_year('M')
#12
infer_periods_in_year('B')
#261 (weekdays only; holidays are not excluded, so this is not the 252 trading days from the question)
infer_periods_in_year('W')
#52
infer_periods_in_year('min', periods=10**7)
#525600
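
For the very small, fixed-length frequencies mentioned above, the manual calculation could look something like the sketch below. Note that periods_in_year_fixed is a hypothetical helper name, not part of pandas, and it assumes a plain 365-day year and a fixed-length alias such as 'min' or 's'. Calendar-based aliases like 'M' or 'B' have no fixed length, so they still need the date_range approach.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Assumption: a plain 365-day year (leap years are ignored)
NANOS_PER_YEAR = 365 * 24 * 60 * 60 * 10**9

def periods_in_year_fixed(freq):
    """Hypothetical helper: periods per 365-day year for a fixed-length alias."""
    offset = to_offset(freq)
    # .nanos is the length of one period in nanoseconds; it is only
    # meaningful for fixed-length offsets, so 'M' or 'B' will not work here.
    return NANOS_PER_YEAR // offset.nanos

print(periods_in_year_fixed('min'))
# 525600
print(periods_in_year_fixed('s'))
# 31536000
```

This avoids building a large date range at all, which is why it scales down to very fine frequencies.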

Upvotes: 1
