Pam Koertshuis

Reputation: 61

Calculate day of year in dataframe from a datetime column to another column in Python

I have a dataframe with a datetime column in it, like 2014-01-01, 2016-06-05, etc. Now I want to add a column in the dataframe calculating the day of year (for that given year).

I found some hints on this forum, but I'm struggling with the types and the dataframe handling. This works fine for a single date:

from datetime import datetime

day_to_calc = datetime.today()

day_of_year = day_to_calc.timetuple().tm_yday

day_of_year

But my day_to_calc is not today, but df['Date']. However, if I try this

df['DOY'] = df['Date'].timetuple().tm_yday

I get

AttributeError: 'Series' object has no attribute 'timetuple'

Ok, so I guess I need a map function perhaps? So I'm trying something like ..

df['DOY'] = map (datetime.timetuple().tm_yday,df['Date'])

And surely you guys see how stupid that is ;-) (but I'm still learning Python)

TypeError: descriptor 'timetuple' of 'datetime.datetime' object needs an argument

So that makes sense sort of because I need to pass the date as parameter, sooo .. trying

df['DOY'] = datetime.timetuple(df['Date']).tm_yday 

TypeError: descriptor 'timetuple' requires a 'datetime.datetime' object but received a 'Series'

There must be a simple way, but I just can't figure out the syntax :-(

Upvotes: 6

Views: 4970

Answers (2)

Hosea

Reputation: 655

I noticed the other answer does not go into great detail, so I've provided a more explanatory answer below.

Try the following:

import pandas as pd

# Create a pandas datetime range for the year 2022
passed_2022 = pd.date_range('2022-01-01', '2022-12-31')

# Convert the datetime range to a list of strings in the format 'YYYY-MM-DD'
passed_2022_list = [i.strftime('%Y-%m-%d') for i in passed_2022]

# Create a DataFrame
data = pd.DataFrame({'datetime': passed_2022_list})

# Filter the data DataFrame to only include dates in the passed_2022 list
data = data[data['datetime'].isin(passed_2022_list)]

# Count the number of rows in the filtered DataFrame
num_days_passed = len(data)

# Create a new DataFrame with 'datetime' and 'DAYS OF YEAR' columns
result = pd.DataFrame({'datetime': passed_2022_list,
                       'DAYS OF YEAR': range(1, num_days_passed+1)})

# Print the result of the DataFrame
print(result)

Output:

      datetime        DAYS OF YEAR
0    2022-01-01            1
1    2022-01-02            2
2    2022-01-03            3
3    2022-01-04            4
4    2022-01-05            5
..      ...               ...
360  2022-12-27           361
361  2022-12-28           362
362  2022-12-29           363
363  2022-12-30           364
364  2022-12-31           365

[365 rows x 2 columns]
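
Note that the table above numbers the rows by position, which matches the day of year only because the range is complete and starts on January 1. As a rough sketch of how the same column could be derived from the dates themselves (the sample dates here are made up for illustration, and the column names are reused from the code above), one option is to parse the strings and read the day of year from each value:

```python
import pandas as pd

# Hypothetical sample: date strings with gaps, including a leap year (2024)
data = pd.DataFrame(
    {'datetime': ['2022-01-01', '2022-03-01', '2024-03-01', '2024-12-31']}
)

# Parse the strings into datetimes, then read the day of year from each date
parsed = pd.to_datetime(data['datetime'], format='%Y-%m-%d')
data['DAYS OF YEAR'] = parsed.dt.dayofyear

print(data)
```

Because the value comes from each date rather than from the row position, missing days, unsorted rows, and leap years should not affect the result.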


Upvotes: 0

Serenity

Reputation: 36635

Use the dt.dayofyear accessor (it is a property of datetime columns, not a function call):

import pandas as pd
# first convert date string to datetime with a proper format string
df = pd.DataFrame({'Date':pd.to_datetime(['2014-01-01', '2016-06-05'], format='%Y-%m-%d')})
# calculate day of year
df['DOY'] = df['Date'].dt.dayofyear
print(df)

Output:

        Date  DOY
0 2014-01-01    1
1 2016-06-05  157
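
For comparison, here is a sketch of the row-by-row approach the question was reaching for with map. It assumes the Date column already holds datetime values; the column name DOY_rowwise is only illustrative. Each element of such a column is a pandas Timestamp, which supports timetuple(), so the function has to be applied per element rather than to the whole Series:

```python
import pandas as pd

df = pd.DataFrame({'Date': pd.to_datetime(['2014-01-01', '2016-06-05'])})

# Row-wise: call timetuple() on each Timestamp individually
df['DOY_rowwise'] = df['Date'].map(lambda d: d.timetuple().tm_yday)

# Vectorized equivalent using the .dt accessor
df['DOY'] = df['Date'].dt.dayofyear

print(df)
```

Both columns should hold the same values; the .dt.dayofyear form is shorter and avoids a Python-level call per row.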

Upvotes: 7
