Reputation: 1962
I have multiple time series in different files, and I know that Pandas can infer the frequency of the DatetimeIndex for each:
pd.infer_freq(data.index)
Is there a programmatic way to get the approximate number of periods per year for whatever frequency a given file happens to have? For instance:
'M' -> 12
'BM' -> 12
'B' -> 252
'D' -> 365
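Something like the sketch below works for the handful of aliases I hard-code, but I'd like it to handle any frequency alias (the mapping and helper name here are just for illustration):
import pandas as pd

# Hypothetical hard-coded mapping -- only covers the aliases I know about,
# which is why I'm looking for a general, programmatic solution.
PERIODS_PER_YEAR = {'M': 12, 'BM': 12, 'B': 252, 'D': 365}

def periods_per_year(index):
    freq = pd.infer_freq(index)      # e.g. 'M', 'BM', 'B', 'D', ...
    return PERIODS_PER_YEAR[freq]    # fails for any alias not listed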
Upvotes: 2
Views: 396
Reputation: 59549
Here's one alternative. We'll create a date_range using the provided frequency and then group by year to figure out the most common number of periods that fit into a year. The periods
argument should be large enough that, given the frequency, the date range spans many years of data. You really shouldn't need to change it unless you want 'ns'
or something insanely small (but for those it would be more efficient to just calculate the number manually).
import pandas as pd

def infer_periods_in_year(freq, periods=10**4):
    """
    freq : str, pandas frequency alias.
    periods : int, should be large enough that `freq` spans many years.
    """
    while True:
        try:
            s = pd.Series(data=pd.date_range('1970-01-01', freq=freq, periods=periods))
            break
        # If periods is too large for the given frequency, shrink it and retry
        except (pd.errors.OutOfBoundsDatetime, OverflowError, ValueError):
            periods = periods // 10
    # Count observations per calendar year and return the most common count
    return s.groupby(s.dt.year).size().value_counts().index[0]
infer_periods_in_year('D')
#365
infer_periods_in_year('BM')
#12
infer_periods_in_year('M')
#12
infer_periods_in_year('B')
#261
infer_periods_in_year('W')
#52
infer_periods_in_year('min', periods=10**7)
#525600
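To tie this back to the question, you could feed it the alias that pd.infer_freq works out from your index (a quick sketch; it assumes infer_freq actually finds a frequency and doesn't return None):
freq = pd.infer_freq(data.index)              # e.g. 'D', 'B', 'BM', ...
approx_per_year = infer_periods_in_year(freq)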
Upvotes: 1