Jespar

Reputation: 1026

Python: Datetime to season

I want to convert a date time series to season, for example for months 3, 4, 5 I want to replace them with 2 (spring); for months 6, 7, 8 I want to replace them with 3 (summer) etc.

So, I have this series

id
1       2011-08-20
2       2011-08-23
3       2011-08-27
4       2011-09-01
5       2011-09-05
6       2011-09-06
7       2011-09-08
8       2011-09-09
Name: timestamp, dtype: datetime64[ns]

and this is the code I have been trying to use, but to no avail.

# Get seasons
spring = range(3, 5)
summer = range(6, 8)
fall = range(9, 11)
# winter = everything else

month = temp2.dt.month
season=[]

for _ in range(len(month)):
    if any(x == spring for x in month):
       season.append(2) # spring 
    elif any(x == summer for x in month):
        season.append(3) # summer
    elif any(x == fall for x in month):
        season.append(4) # fall
    else:
        season.append(1) # winter

and

for _ in range(len(month)):
    if month[_] == 3 or month[_] == 4 or month[_] == 5:
        season.append(2) # spring 
    elif month[_] == 6 or month[_] == 7 or month[_] == 8:
        season.append(3) # summer
    elif month[_] == 9 or month[_] == 10 or month[_] == 11:
        season.append(4) # fall
    else:
        season.append(1) # winter

Neither solution works. The first implementation raises an error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The second produces a large list with errors. Any ideas, please? Thanks.

Upvotes: 24

Views: 37631

Answers (7)

kho

Reputation: 1291

A bit more generic method which works for ndarray, list or pandas series.

df_date = pd.DataFrame(pd.date_range('2024-01-01', '2025-01-01', freq='1D', normalize=True), columns=['date'])

day_shift = 14
df_date['season_int'] = get_season(df_date['date'], day_shift=day_shift)
df_date['season'] = get_season(df_date['date'], day_shift=day_shift, season_names=True)
df_date[df_date['season_int'].diff().abs() > 0]
# date       season_int season
# 2024-03-15          1 spring
# 2024-06-15          2 summer
# 2024-09-15          3 fall
# 2024-12-15          0 winter

# or a list
get_season(['2024-01-01 09:04:00', '2024-03-01 09:04:00'], season_names= True)
# array(['winter', 'spring'], dtype='<U6')

# and with mapped season names
get_season(['2024-01-01', '2024-03-01', '2024-06-01', '2024-09-01'], season_names= {0: 'cold', 1: 'flowering', 2: 'hot', 3: 'harvest'} )
# array(['cold', 'flowering', 'hot', 'harvest'], dtype='<U9')

Function definition

import numpy as np
import pandas as pd

def get_season(dates, month_shift = 0, day_shift = 0, season_names = None):
    """
    Get the season of a given date.
    
    Parameters
    ----------
    dates : array-like of datetime64
        The dates for which to calculate the season.
    month_shift : int, default 0
        Number of months to shift the calculation. 
        - for winter [Dec, Jan, Feb] -> month_shift=0 (default) 
        - for winter [Jan, Feb, Mar] -> month_shift=1
    day_shift : int, default 0
        Number of days to shift the calculation. The season starts:
         - at the first day of the month -> day_shift=0 (default) 
         - at the second day of the month -> day_shift=1
    season_names : dict or None or bool, default None
        Optionally map the season integers to names.
        None: no mapping
        True: default mapping with dict {0: 'winter', 1: 'spring', 2: 'summer', 3: 'fall'} 
        dict: any other mapping. Keys have to be in range(4).

    Returns
    -------
    array of int or str
        The season of each input date.

    Examples
    --------
    day_shift = 14
    df_date = pd.DataFrame(pd.date_range('2024-01-01', '2025-01-01', freq='1D', normalize=True), columns=['date'])
    df_date['season_int'] = get_season(df_date['date'], day_shift=day_shift)
    df_date['season'] = get_season(df_date['date'], day_shift=day_shift, season_names=True)
    df_date[df_date['season_int'].diff().abs() > 0]
    # date       season_int season
    # 2024-03-15          1 spring
    # 2024-06-15          2 summer
    # 2024-09-15          3 fall
    # 2024-12-15          0 winter
    """
    if isinstance(dates, pd.Series):
        dates = dates.values
    if isinstance(dates, list):
        dates = np.array(dates).astype('datetime64')
    dates = dates + np.timedelta64(-day_shift, 'D')
    season = (dates.astype('datetime64[M]').astype(int) - month_shift) % 12 / 3
    season = np.round(season, decimals=0).astype(int) % 4
    # season = season.astype(int)

    if season_names is not None:
        if season_names is True:
            season_names = {0: 'winter', 1: 'spring', 2: 'summer', 3: 'fall'}
        return np.vectorize(season_names.get)(season)
    
    return season

Upvotes: 0

VoV

Reputation: 1

Here is my solution (not ideal for leap years) for converting a date to a season when you want to take both the month and the day of the month into account. I took an arbitrary non-leap year:

import pandas as pd
df = pd.DataFrame({'Date': pd.date_range('2022-01-01', '2023-01-01', periods=12)})

winter_start = pd.to_datetime("2022-12-21", format = "%Y-%m-%d").dayofyear
spring_start = pd.to_datetime("2022-3-21", format = "%Y-%m-%d").dayofyear
summer_start = pd.to_datetime("2022-6-21", format = "%Y-%m-%d").dayofyear
autumn_start = pd.to_datetime("2022-9-23", format = "%Y-%m-%d").dayofyear

for index, date in df["Date"].items():
    if (date.dayofyear >= winter_start) or (date.dayofyear < spring_start):
        df.at[index, "Season"] = "Winter"
    elif (date.dayofyear >= spring_start) and (date.dayofyear < summer_start):
        df.at[index, "Season"] = "Spring"
    elif (date.dayofyear >= summer_start) and (date.dayofyear < autumn_start):
        df.at[index, "Season"] = "Summer"
    else:
        df.at[index, "Season"] = "Autumn"

    Out:
    Date                            Season
0   2022-01-01 00:00:00.000000000   Winter
1   2022-02-03 04:21:49.090909091   Winter
2   2022-03-08 08:43:38.181818182   Winter
3   2022-04-10 13:05:27.272727273   Spring
4   2022-05-13 17:27:16.363636364   Spring
5   2022-06-15 21:49:05.454545456   Spring
6   2022-07-19 02:10:54.545454546   Summer
7   2022-08-21 06:32:43.636363636   Summer
8   2022-09-23 10:54:32.727272728   Autumn
9   2022-10-26 15:16:21.818181820   Autumn
10  2022-11-28 19:38:10.909090912   Autumn
11  2023-01-01 00:00:00.000000000   Winter
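
The row-by-row loop above can also be written as a vectorized comparison on `dt.dayofyear`. This is only a sketch, assuming the same fixed 2022 boundaries as the loop and using `numpy.select` to pick a label per row; it shares the loop's leap-year caveat.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Date": pd.date_range("2022-01-01", "2023-01-01", periods=12)})

# Season boundaries as day-of-year numbers for the non-leap year 2022.
spring_start = pd.Timestamp("2022-03-21").dayofyear
summer_start = pd.Timestamp("2022-06-21").dayofyear
autumn_start = pd.Timestamp("2022-09-23").dayofyear
winter_start = pd.Timestamp("2022-12-21").dayofyear

doy = df["Date"].dt.dayofyear

# One boolean mask per season; anything matching none of them is winter.
conditions = [
    (doy >= spring_start) & (doy < summer_start),
    (doy >= summer_start) & (doy < autumn_start),
    (doy >= autumn_start) & (doy < winter_start),
]
df["Season"] = np.select(conditions, ["Spring", "Summer", "Autumn"], default="Winter")
print(df)
```

The masks are evaluated for the whole column at once, so no explicit Python loop is needed.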

Upvotes: 0

KuLeMi

Reputation: 406

import pandas as pd
import datetime as dt

df = pd.DataFrame({'date': pd.date_range('2000-01-01', '2001-01-01', periods=12)})
seasons = {(1, 12, 2): 1, (3, 4, 5): 2, (6, 7, 8): 3, (9, 10, 11): 4}
df['m'] = df.date.dt.month

def season(month):
    # Return the season code whose key tuple contains this month.
    for months, code in seasons.items():
        if month in months:
            return code

df['s'] = df.m.apply(season)
Out[25]: 
                            date   m  s
0  2000-01-01 00:00:00.000000000   1  1
1  2000-02-03 06:32:43.636363636   2  1
2  2000-03-07 13:05:27.272727273   3  2
3  2000-04-09 19:38:10.909090910   4  2
4  2000-05-13 02:10:54.545454546   5  2
5  2000-06-15 08:43:38.181818182   6  3
6  2000-07-18 15:16:21.818181820   7  3
7  2000-08-20 21:49:05.454545456   8  3
8  2000-09-23 04:21:49.090909092   9  4
9  2000-10-26 10:54:32.727272728  10  4
10 2000-11-28 17:27:16.363636364  11  4
11 2001-01-01 00:00:00.000000000   1  1

Upvotes: 2

Lucas Machado Moschen

Reputation: 59

I think a more precise solution may be useful. If we have a month (1, ..., 12), we can convert it to a season by subtracting one and integer-dividing by 3:

df = pd.Series(["2011-06-07", 
                "2011-08-23", 
                "2011-08-27", 
                "2011-09-01", 
                "2011-09-05", 
                "2011-09-06", 
                "2011-09-08", 
                "2011-12-25"])
df = pd.to_datetime(df)

season = (df.dt.month - 1) // 3

Therefore we will be mapping 1,2,3 to 0 (winter), 4,5,6 to 1 (spring), 7,8,9 to 2 (summer), and 10,11,12 to 3 (fall). However, we know the months 3,6,9, and 12 divide two seasons each. I propose the following approach:

If the month is 3 and the day is greater than or equal to 20, the season is spring, and we need to add 1. If the month is 6 and the day is greater than or equal to 21, the season is summer, and we need to add 1. If the month is 9 and the day is greater than or equal to 23, the season is fall, and we need to add 1. If the month is 12 and the day is greater than or equal to 21, the season is winter, and we need to subtract 3 (equivalently, add 1 modulo 4). Then we have

season += (df.dt.month == 3)&(df.dt.day>=20)
season += (df.dt.month == 6)&(df.dt.day>=21)
season += (df.dt.month == 9)&(df.dt.day>=23)
season -= 3*((df.dt.month == 12)&(df.dt.day>=21)).astype(int)

The solution for this series will be [1,2,2,2,2,2,2,0].
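
The steps above can be collected into a single function. This is a sketch, not a definitive implementation: the name `astronomical_season` is made up for illustration, and the boundary days (20 March, 21 June, 23 September, 21 December) are the fixed approximations used in this answer, not exact equinox or solstice dates for every year. It uses the "add 1 modulo 4" form for December.

```python
import pandas as pd

def astronomical_season(dates: pd.Series) -> pd.Series:
    """Map datetimes to 0=winter, 1=spring, 2=summer, 3=fall (hypothetical helper)."""
    month = dates.dt.month
    day = dates.dt.day
    # Base season from the month alone: 1-3 -> 0, 4-6 -> 1, 7-9 -> 2, 10-12 -> 3.
    season = (month - 1) // 3
    # In each boundary month, days on or after the boundary belong to the next season.
    for boundary_month, boundary_day in [(3, 20), (6, 21), (9, 23), (12, 21)]:
        season = season + ((month == boundary_month) & (day >= boundary_day)).astype(int)
    # Late December wraps from 4 back to 0 (winter).
    return season % 4

dates = pd.to_datetime(pd.Series([
    "2011-06-07", "2011-08-23", "2011-08-27", "2011-09-01",
    "2011-09-05", "2011-09-06", "2011-09-08", "2011-12-25",
]))
result = astronomical_season(dates)
print(result.tolist())
```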

Upvotes: 4

AChampion

Reputation: 30268

You can use a simple mathematical formula to compress a month to a season, e.g.:

>>> [month%12 // 3 + 1 for month in range(1, 13)]
[1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 1]

So for your use-case using vector operations (credit @DSM):

>>> temp2.dt.month%12 // 3 + 1
1    3
2    3
3    3
4    4
5    4
6    4
7    4
8    4
Name: id, dtype: int64
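
For anyone wondering why the formula works: `month % 12` sends December to 0, so December, January and February become 0, 1, 2 and fall into the same block of three. Below is a self-contained sketch that rebuilds the question's series by hand; the variable name `timestamps` is an assumption, since the question only shows the data.

```python
import pandas as pd

# Rebuild the series from the question (index 1..8, name "timestamp").
timestamps = pd.Series(
    pd.to_datetime([
        "2011-08-20", "2011-08-23", "2011-08-27", "2011-09-01",
        "2011-09-05", "2011-09-06", "2011-09-08", "2011-09-09",
    ]),
    index=range(1, 9),
    name="timestamp",
)

# % 12 moves December to 0 so Dec/Jan/Feb are 0, 1, 2;
# // 3 groups the shifted months into blocks of three;
# + 1 yields 1=winter, 2=spring, 3=summer, 4=fall.
season = timestamps.dt.month % 12 // 3 + 1
print(season.tolist())
```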

Upvotes: 51

Mohamed Ali JAMAOUI

Reputation: 14689

It's also possible to use a dictionary mapping.

  1. Create a dictionary that maps a month to a season:

    In [27]: seasons = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 1]
    
    In [28]: month_to_season = dict(zip(range(1,13), seasons))
    
    In [29]: month_to_season 
    Out[29]: {1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 3, 7: 3, 8: 3, 9: 4, 10: 4, 11: 4, 12: 1}
    
  2. Use it to convert the months to seasons

    In [30]: df.id.dt.month.map(month_to_season) 
    Out[30]: 
    1    3
    2    3
    3    3
    4    4
    5    4
    6    4
    7    4
    8    4
    Name: id, dtype: int64
    

Performance: This is fairly fast

In [35]: %timeit df.id.dt.month.map(month_to_season) 
1000 loops, best of 3: 422 µs per loop
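
If season names are preferred over integers, the same `map` call works with a month-to-name dictionary. A small sketch; `dates`, `season_names` and `month_to_name` are illustrative names, not part of the original session.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime([
    "2011-01-15", "2011-04-15", "2011-08-20", "2011-11-30", "2011-12-25",
]))

season_names = ["winter", "spring", "summer", "fall"]
# Months 12, 1, 2 -> winter; 3-5 -> spring; 6-8 -> summer; 9-11 -> fall.
month_to_name = {m: season_names[m % 12 // 3] for m in range(1, 13)}

named = dates.dt.month.map(month_to_name)
print(named.tolist())
```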

Upvotes: 8

John Hao

Reputation: 39

I think this would work.

while True:
    date = int(input("Date?"))  # expects a month number, 1-12
    season = ""
    if date < 4:
        season = 1
    elif date < 7:
        season = 2
    elif date < 10:
        season = 3
    elif date < 13:
        season = 4
    else:
        print("This would not work.")
    print(season)

Upvotes: 2
