Gambit1337

Reputation: 5

Is there a way to iterate through a pandas datetime series performing a function?

I have created a custom function that describes seasonality and want to add a new column to a dataframe by applying that function to a series of datetime objects in a pandas dataframe. I'm attempting to create a list that contains the values of the date_season function applied to the dates in the dataframe.

All the variables in the date_season function below are of the type datetime.date, except for 'dif' which is a datetime.timedelta.

Here is the function:

import datetime as dt
import numpy as np
import pandas as pd

def date_season(date):
    year = date.year
    min_season = dt.date(year,1,1)
    max_season = dt.date(year,6,30)
    dif = abs(max_season - date)
    dif_days = dif.days
    x = (((max_season - min_season).days) - dif.days * 2) / (max_season - min_season).days
    seasonality = np.sin(x * (np.pi) / 2)
    return seasonality

And here is how the pandas dataframe is created:

start = dt.date(2017,1,1)
end = dt.date(2019,12,31)
df = pd.DataFrame({'Date': pd.date_range(start, end, freq="D")})

Attempting to create a new list with the seasonality parameter:

z = []
for index, row in df.iterrows():
    z.append(date_season(row.Date))

This returns the error message:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-105-63e9cb35ed55> in <module>()
      1 z = []
      2 for index, row in df.iterrows():
----> 3     z.append(date_season(row.Date))

<ipython-input-71-5e2b35e24e38> in date_season(date)
      3     min_season = dt.date(year,1,1)
      4     max_season = dt.date(year,6,30)
----> 5     dif = abs(max_season - date)
      6     dif_days = dif.days
      7     x = (((max_season - min_season).days) - dif.days * 2) / (max_season - min_season).days

pandas\_libs\tslibs\timestamps.pyx in 
pandas._libs.tslibs.timestamps._Timestamp.__sub__()

TypeError: descriptor '__sub__' requires a 'datetime.datetime' object but received a 'datetime.date'

Attempting:

new_df = df.apply(lambda x: date_season(x))

returns

AttributeError: ("'Series' object has no attribute 'year'", 'occurred at index Date')

Not sure why it requires a datetime.datetime object, because the function works with single inputs in the datetime.date format. Is there a simpler way to iterate through the dates and create a new column with the results of this function?

Upvotes: 0

Views: 985

Answers (1)

iamchoosinganame

Reputation: 1120

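The values in your Date column are pandas Timestamp objects, which are a subclass of datetime.datetime, while min_season and max_season are plain datetime.date objects. Python does not allow subtraction between a datetime.date and a datetime.datetime, so the two types are not interchangeable here. Define min_season and max_season as pandas Timestamps instead:

import numpy as np
import pandas as pd

def date_season(date):
    year = date.year
    # Use pd.Timestamp so the subtraction below is Timestamp - Timestamp.
    # (pd.datetime was only an alias for datetime.datetime; it was deprecated
    # in pandas 1.0 and removed in 2.0.)
    min_season = pd.Timestamp(year, 1, 1)
    max_season = pd.Timestamp(year, 6, 30)
    dif = abs(max_season - date)
    season_days = (max_season - min_season).days
    x = (season_days - dif.days * 2) / season_days
    seasonality = np.sin(x * np.pi / 2)
    return seasonality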
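
To make the type mismatch concrete, here is a small check you can run yourself. It is only a sketch built on the dataframe from the question; the variable names are mine. It also shows an alternative I would expect to work: keep the original datetime.date version of the function and convert each Timestamp with .date() before passing it in.

```python
import datetime as dt
import pandas as pd

df = pd.DataFrame({'Date': pd.date_range('2017-01-01', '2019-12-31', freq='D')})

# Each element of the column is a pandas Timestamp, a datetime.datetime subclass.
ts = df['Date'].iloc[0]
is_datetime = isinstance(ts, dt.datetime)

# A plain datetime.date is NOT a datetime.datetime, hence the mismatch.
plain_date = dt.date(ts.year, 6, 30)
date_is_datetime = isinstance(plain_date, dt.datetime)

# Option 1: do the arithmetic with Timestamps on both sides.
days_via_timestamp = (pd.Timestamp(ts.year, 6, 30) - ts).days

# Option 2: convert the Timestamp to a datetime.date first, so the original
# date-based function could be called as date_season(ts.date()).
days_via_date = (plain_date - ts.date()).days

print(is_datetime, date_is_datetime, days_via_timestamp, days_via_date)
```

Both options give the same day count, so which one you pick is mostly a matter of whether you want the rest of your code to work with Timestamps or with datetime.date objects.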

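With the function fixed, apply it element-wise: either applymap for your whole dataframe (renamed to DataFrame.map in pandas 2.1) or apply on a single column. Your df.apply(lambda x: date_season(x)) call failed because DataFrame.apply passes each whole column to the function as a Series, and a Series has no .year attribute.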

new_df = df.applymap(date_season)

or

df['Date'].apply(date_season)
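
If the dataframe gets large, you may not need to iterate at all. Below is a rough vectorized sketch of the same formula using the .dt accessor. The function name date_season_vectorized and the column name 'Seasonality' are my own placeholders, and I have assumed the Date column has a datetime64 dtype, as it does when built with pd.date_range.

```python
import numpy as np
import pandas as pd

def date_season_vectorized(dates: pd.Series) -> pd.Series:
    """Same formula as date_season, computed for a whole datetime64 Series."""
    year = dates.dt.year
    # Build Jan 1 and Jun 30 of each row's year as datetime64 Series.
    min_season = pd.to_datetime(pd.DataFrame({'year': year, 'month': 1, 'day': 1}))
    max_season = pd.to_datetime(pd.DataFrame({'year': year, 'month': 6, 'day': 30}))
    season_days = (max_season - min_season).dt.days
    dif_days = (max_season - dates).abs().dt.days
    x = (season_days - dif_days * 2) / season_days
    return np.sin(x * np.pi / 2)

df = pd.DataFrame({'Date': pd.date_range('2017-01-01', '2019-12-31', freq='D')})
df['Seasonality'] = date_season_vectorized(df['Date'])
print(df.head())
```

This should give the same values as df['Date'].apply(date_season), just computed column-wise instead of one row at a time.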

Upvotes: 1
