A2N15

Reputation: 605

Get Number of Holidays per month in Python

Hello world,

For each month, I would like to retrieve the number of public holidays.

Here is my dataset:

City       date        value   End_date
BE         01/01/16    41       31/01/16
NW         01/10/16    74       31/10/16
BY         01/05/16    97       31/05/16

With the following code, I can check manually whether a given day is a public holiday:

from datetime import date
import holidays

# valid values for prov: BW, BY, BE, BB, HB, HH, HE, MV, NI, NW, RP, SL, SN, ST, SH, TH

de_holidays = holidays.CountryHoliday('DE', prov='NW', state=None)

date(2020, 5, 21) in de_holidays

out:
False

My questions: How can I count the number of True values for each month? And how can I store that count in the DataFrame?

Expected output

City       date        value   End_date    Nb_pub_holiday
BE         01/01/16    41       31/01/16        2
NW         01/10/16    74       31/10/16        0
BY         01/05/16    97       31/05/16        4

Upvotes: 3

Views: 1248

Answers (1)

jezrael

Reputation: 862511

Not sure why, but I get different output from your expected one. Use a custom function that creates all days of each row with date_range and counts the matching values with sum over a generator:

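import pandas as pd
import holidays

# sample data from the question
df = pd.DataFrame({'City': ['BE', 'NW', 'BY'],
                   'date': ['01/01/16', '01/10/16', '01/05/16'],
                   'value': [41, 74, 97],
                   'End_date': ['31/01/16', '31/10/16', '31/05/16']})

# convert columns to datetimes
df['date'] = pd.to_datetime(df['date'], format='%d/%m/%y')
df['End_date'] = pd.to_datetime(df['End_date'], format='%d/%m/%y')

def f1(x):
    # holidays of the state stored in the City column of this row
    h = holidays.CountryHoliday('DE', prov=x['City'], state=None)
    # every day from date to End_date, both inclusive
    d = pd.date_range(x['date'], x['End_date'])
    # True counts as 1, so sum gives the number of holidays
    return sum(y in h for y in d)

df['Nb_pub_holiday'] = df.apply(f1, axis=1)
print(df)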
  City       date  value   End_date  Nb_pub_holiday
0   BE 2016-01-01     41 2016-01-31               1
1   NW 2016-10-01     74 2016-10-31               1
2   BY 2016-05-01     97 2016-05-31               4
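
The count works because each membership test yields True or False, and sum adds them up as 1 and 0. Here is a minimal sketch of that idiom using only the standard library. The set by_may_2016 is a hand-written stand-in for the holidays object (an assumption for illustration, holding only the four BY dates from May 2016), and count_holidays is a hypothetical helper name:

```python
from datetime import date, timedelta

# Stand-in for the holidays object (assumption): the four Bavarian (BY)
# public holidays in May 2016, not a complete calendar.
by_may_2016 = {
    date(2016, 5, 1),
    date(2016, 5, 5),
    date(2016, 5, 16),
    date(2016, 5, 26),
}

def count_holidays(start, end, holiday_dates):
    """Count the days from start to end (both inclusive) found in holiday_dates."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    # Each membership test is True or False; sum() adds them as 1 and 0.
    return sum(day in holiday_dates for day in days)

print(count_holidays(date(2016, 5, 1), date(2016, 5, 31), by_may_2016))
```

With the real library, the set would simply be replaced by the object returned for the row's state.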

To get the list of holiday dates instead of the count, it is possible to use:

def f2(x):
    h = holidays.CountryHoliday('DE', prov=x['City'], state=None)
    d = pd.date_range(x['date'], x['End_date'])
    # keep only the days that are holidays, as plain date objects
    return [y.date() for y in d if y in h]

df['Lst_pub_holiday'] = df.apply(f2, axis=1)
print(df)
  City       date  value   End_date  \
0   BE 2016-01-01     41 2016-01-31   
1   NW 2016-10-01     74 2016-10-31   
2   BY 2016-05-01     97 2016-05-31   

                                    Lst_pub_holiday  
0                                      [2016-01-01]  
1                                      [2016-10-03]  
2  [2016-05-01, 2016-05-05, 2016-05-16, 2016-05-26]
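
Since the count is just the length of this list, one helper can serve both columns. The sketch below shows the idea with the standard library only, and it filters the known holiday dates by range instead of testing every single day. The dictionary known_holidays is an assumption for illustration: it holds only the dates printed above, and holidays_between is a hypothetical helper name.

```python
from datetime import date

# Stand-in for the holidays lookup, keyed by state (assumption): only the
# dates printed in Lst_pub_holiday above, not a complete calendar.
known_holidays = {
    "BE": {date(2016, 1, 1)},
    "NW": {date(2016, 10, 3)},
    "BY": {date(2016, 5, 1), date(2016, 5, 5), date(2016, 5, 16), date(2016, 5, 26)},
}

# Rows of the question's dataset as (City, date, End_date).
rows = [
    ("BE", date(2016, 1, 1), date(2016, 1, 31)),
    ("NW", date(2016, 10, 1), date(2016, 10, 31)),
    ("BY", date(2016, 5, 1), date(2016, 5, 31)),
]

def holidays_between(city, start, end):
    """Return the sorted holiday dates of city from start to end, inclusive."""
    return sorted(d for d in known_holidays[city] if start <= d <= end)

for city, start, end in rows:
    found = holidays_between(city, start, end)
    # len(found) is the count, found itself is the list
    print(city, len(found), [d.isoformat() for d in found])
```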

Upvotes: 5
