Reputation: 285
I have a time series of monthly values and I would like to calculate the number of days in each month (so that I can divide each value by it to get a daily average for that month).
I have used calendar.monthrange() to calculate this by looping through the values, but I was looking at the pandas.DataFrame.apply method (https://medium.com/@rtjeannier/pandas-101-cont-9d061cb73bfc) and wondering whether it is possible to use that instead of a loop.
The code below gives me the output I would like, but for efficiency (and learning) purposes I'd like to understand the better way of doing this by using the apply method rather than a loop.
import pandas as pd
import calendar
df = pd.DataFrame()
df['temp'] = pd.date_range(start='01-Jan-2000', end='31-Dec-2018', freq='MS')
df['value'] = 5
df.set_index('temp', inplace=True)
days_list = []
for val in df.index:
    days_list.append(calendar.monthrange(val.year, val.month)[1])
df['days_in_month'] = days_list
I can find the number of days for one row of the index nice and easily by using this:
calendar.monthrange(df.index[0].year, df.index[0].month)[1]
But if I try to do it for all of the index values at once (see below), it throws an error. I am missing how to get from the single-value case to the whole-index case.
calendar.monthrange(df.index.year, df.index.month)[1]
The end goal is to create a column (like the loop does) but more efficiently, without needlessly creating a list, looping through the index, and then adding the list to the dataframe.
Upvotes: 1
Views: 1168
Reputation: 862511
Use map with df.index:
df['days_in_month'] = df.index.map(lambda val: calendar.monthrange(val.year, val.month)[1])
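As a rough check, the sketch below rebuilds the frame from the question and compares the map result against the original loop. calendar.monthrange takes a single year and a single month, which is likely why passing df.index.year and df.index.month in one call fails; map applies the function to one timestamp at a time. The names loop_days and mapped_days are only illustrative.

```python
import calendar

import pandas as pd

# Same monthly index as in the question (first day of each month).
idx = pd.date_range(start="2000-01-01", end="2018-12-31", freq="MS")
df = pd.DataFrame({"value": [5] * len(idx)}, index=idx)

# Original approach: an explicit loop over the index.
loop_days = [calendar.monthrange(ts.year, ts.month)[1] for ts in df.index]

# map calls the lambda once per timestamp, so monthrange only ever
# sees scalar year and month values.
mapped_days = df.index.map(lambda ts: calendar.monthrange(ts.year, ts.month)[1])

df["days_in_month"] = mapped_days
print(df.head())
```

Note that map still calls a Python function for every row, so it mostly removes the list bookkeeping rather than the per-row work.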
Upvotes: 3
Reputation: 6639
How about using the daysinmonth attribute of the datetime index and assigning the result to a regular column:
df['days_in_month'] = df.index.daysinmonth
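To connect this back to the daily average mentioned in the question, here is a minimal sketch that assumes the same monthly frame as above. The daily_avg column name is made up for illustration. daysinmonth is computed for the whole DatetimeIndex at once, so no per-row Python function is needed.

```python
import pandas as pd

# Same monthly index as in the question (first day of each month).
idx = pd.date_range(start="2000-01-01", end="2018-12-31", freq="MS")
df = pd.DataFrame({"value": [5.0] * len(idx)}, index=idx)

# Vectorised attribute on the DatetimeIndex: number of days in each month.
df["days_in_month"] = df.index.daysinmonth

# Hypothetical follow-up column: monthly value spread evenly over its days.
df["daily_avg"] = df["value"] / df["days_in_month"]

print(df.head())
```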
Upvotes: 2