Reputation: 61
I have a dataframe with a datetime column in it, like 2014-01-01, 2016-06-05, etc. Now I want to add a column in the dataframe calculating the day of year (for that given year).
On this forum I did find some hints for sure, but I'm struggling with the types and dataframe stuff.
So this works fine
from datetime import datetime
day_to_calc = datetime.today()
day_of_year = day_to_calc.timetuple().tm_yday
day_of_year
But my day_to_calc is not today, but df['Date']. However, if I try this
df['DOY'] = df['Date'].timetuple().tm_yday
I get
AttributeError: 'Series' object has no attribute 'timetuple'
Ok, so I guess I need a map function perhaps? So I'm trying something like ..
df['DOY'] = map (datetime.timetuple().tm_yday,df['Date'])
And surely you guys see how stupid that is ;-) (but I'm still learning Python)
TypeError: descriptor 'timetuple' of 'datetime.datetime' object needs an argument
So that makes sense sort of because I need to pass the date as parameter, sooo .. trying
df['DOY'] = datetime.timetuple(df['Date']).tm_yday
TypeError: descriptor 'timetuple' requires a 'datetime.datetime' object but received a 'Series'
There must be a simple way, but I just can't figure out the syntax :-(
Upvotes: 6
Views: 4970
Reputation: 655
I noticed the other answer does not go into great detail, so I've provided a more explanatory answer below.
Try the following:
import pandas as pd
# Create a pandas datetime range for the year 2022
passed_2022 = pd.date_range('2022-01-01', '2022-12-31')
# Convert the datetime range to a list of strings in the format 'YYYY-MM-DD'
passed_2022_list = [i.strftime('%Y-%m-%d') for i in passed_2022]
# Create a DataFrame
data = pd.DataFrame({'datetime': passed_2022_list})
# Keep only rows whose date is in passed_2022_list (every row here, so nothing is dropped)
data = data[data['datetime'].isin(passed_2022_list)]
# Count the number of rows in the filtered DataFrame
num_days_passed = len(data)
# Create a new DataFrame with 'datetime' and 'DAYS OF YEAR' columns
result = pd.DataFrame({'datetime': passed_2022_list,
                       'DAYS OF YEAR': range(1, num_days_passed + 1)})
# Print the result of the DataFrame
print(result)
Output:
       datetime  DAYS OF YEAR
0    2022-01-01             1
1    2022-01-02             2
2    2022-01-03             3
3    2022-01-04             4
4    2022-01-05             5
..          ...           ...
360  2022-12-27           361
361  2022-12-28           362
362  2022-12-29           363
363  2022-12-30           364
364  2022-12-31           365

[365 rows x 2 columns]
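Note that counting rows like this only gives the day of year when the frame holds every day of a single year, starting on 1 January. A minimal sketch, assuming arbitrary (hypothetical) dates with gaps and a leap year, that reads the value from each date instead of from its row position:

```python
import pandas as pd

# Hypothetical data: non-contiguous dates, one of them in a leap year
data = pd.DataFrame({'datetime': ['2022-03-01', '2024-03-01', '2022-12-31']})

# Parse the strings into datetimes, then read the day of year from each date
parsed = pd.to_datetime(data['datetime'], format='%Y-%m-%d')
data['DAYS OF YEAR'] = parsed.dt.dayofyear

print(data)
```

Here 1 March lands on day 60 in 2022 but on day 61 in 2024, which a running row count could not capture.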
Upvotes: 0
Reputation: 36635
Use the dayofyear attribute of the .dt accessor:
import pandas as pd
# first convert date string to datetime with a proper format string
df = pd.DataFrame({'Date':pd.to_datetime(['2014-01-01', '2016-06-05'], format='%Y-%m-%d')})
# calculate day of year
df['DOY'] = df['Date'].dt.dayofyear
print(df)
Output:
        Date  DOY
0 2014-01-01    1
1 2016-06-05  157
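If the frame already exists and its Date column still holds strings, the same idea should apply once the column is converted. A rough sketch, assuming a column named Date as in the question; the sample rows and the DOY_map column are only illustrative, the latter showing how the map attempt from the question can be made to work:

```python
import pandas as pd

# Assumed sample data: dates stored as plain strings
df = pd.DataFrame({'Date': ['2014-01-01', '2016-06-05', '2016-12-31']})

# Convert the existing column; the .dt accessor needs a datetime-typed Series
df['Date'] = pd.to_datetime(df['Date'], format='%Y-%m-%d')

# Vectorized day of year
df['DOY'] = df['Date'].dt.dayofyear

# Element-wise variant: map passes each Timestamp to the function in turn
df['DOY_map'] = df['Date'].map(lambda d: d.timetuple().tm_yday)

print(df)
```

Both columns hold the same values; the .dt version is usually the better choice on large frames because it avoids calling a Python function per row.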
Upvotes: 7