Fedeco

Reputation: 876

How to get all the seasons from a list of months in pandas dataframe?

I'm analyzing a dataset that contains information about medical bootcamps. From the start date and end date (day/month/year) of each bootcamp, I've created a function that lists all the months between the start date and the end date:

import datetime
from dateutil.rrule import rrule, MONTHLY

def list_months_in_date(start_date: str, end_date: str) -> list:
    # Both dates arrive as strings in "%d-%b-%y" format
    strt_dt = datetime.datetime.strptime(start_date, "%d-%b-%y")
    end_dt = datetime.datetime.strptime(end_date, "%d-%b-%y")
    # One datetime per month, stepping from the start date up to the end date
    dates = rrule(MONTHLY, dtstart=strt_dt, until=end_dt)
    # dict.fromkeys drops repeated month names and keeps first-seen order
    distinct_months = list(dict.fromkeys(date.strftime("%B") for date in dates))

    return distinct_months

and I have a second function that maps a list of months to seasons:

def list_months_to_season(distinct_months : list) -> list:
    season = []
    autumn = ["September","October","November"]
    winter = ["December","January","February"]
    summer = ["June","July","August"]
    spring = ["March","April","May"]
    for month in distinct_months:
        if month in autumn :
            season.append("autumn")
        if month in winter :
            season.append("winter")
        if month in spring :
            season.append("spring")
        if month in summer :
            season.append("summer")
    
    return season

What I need is to obtain the seasons between the start date and the end date (summer, spring, winter, autumn), so that I end up with:

|id_medicalcamp|start_date|end_date  |seasons      |
|    0010      |01/06/2019|01/09/2020|summer,autumn|  

I'm running the following code:

df_med_camps['season_label'] = df_med_camps.apply(lambda data : list_months_in_date(data["Camp_Start_Date"],data["Camp_End_Date"]))

which raises KeyError: 'Camp_Start_Date'

Upvotes: 1

Views: 1033

Answers (1)

Henry Yik

Reputation: 22503

First convert your date columns to datetime, then create a dict of seasons keyed by month number and map it over the month of each date:

import pandas as pd

df["start_date"] = pd.to_datetime(df["start_date"], format="%d/%m/%Y")
df["end_date"] = pd.to_datetime(df["end_date"], format="%d/%m/%Y")

s = {6:"Summer", 7:"Summer", 8:"Summer", 9:"Autumn", 10: "Autumn"} #...

df["label"] = df.filter(like="date").apply(lambda d: d.dt.month.map(s)).agg(", ".join, axis=1)

print(df)

   id_medicalcamp start_date   end_date           label
0              10 2019-06-01 2020-09-01  Summer, Autumn
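
Two points may be worth adding. The KeyError in the question comes from calling DataFrame.apply without axis=1: by default apply passes each column to the lambda, so data["Camp_Start_Date"] is looked up in a column's index rather than in a row. Also, the mapping above labels only the start month and the end month, so seasons that fall strictly between them are not listed. Below is a minimal sketch of one way to cover every month in the range, assuming the season lists from the question. The names SEASON_BY_MONTH and seasons_between are made up for illustration, and the second sample row is hypothetical data.

```python
import pandas as pd

# Month number -> season, following the month lists given in the question
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

def seasons_between(start, end):
    """Return the distinct seasons touched by [start, end], in first-seen order."""
    seasons = []
    # One monthly period for every calendar month from start to end, inclusive
    for period in pd.period_range(start=start, end=end, freq="M"):
        season = SEASON_BY_MONTH[period.month]
        if season not in seasons:
            seasons.append(season)
    return ", ".join(seasons)

# First row is the example from the question; second row is hypothetical
df = pd.DataFrame({
    "id_medicalcamp": ["0010", "0011"],
    "start_date": ["01/06/2019", "15/01/2020"],
    "end_date": ["01/09/2020", "10/04/2020"],
})
df["start_date"] = pd.to_datetime(df["start_date"], format="%d/%m/%Y")
df["end_date"] = pd.to_datetime(df["end_date"], format="%d/%m/%Y")

# axis=1 hands each row to the lambda, so column labels can be used as keys
df["seasons"] = df.apply(
    lambda row: seasons_between(row["start_date"], row["end_date"]), axis=1
)
print(df)
```

Because the first row runs from June 2019 to September 2020, it touches all four seasons, not only summer and autumn. If only the seasons of the two endpoints are wanted, the dict-and-map approach above is the shorter route.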

Upvotes: 2
