Reputation: 475
I have an example data as :
datetime. column1
2021-04-01 01:00. 11
2021-04-05 02:00. 10
2021-04-12 03:00. 1
2021-04-11 04:00. 5
2021-04-07 05:00. 20
I would like to create a new column called season which outputs cooling season if the datetime values is between the month of May and September, while heating season if the datetime values are between october and April. The example output should look like this:
datetime. column1. seasons
2021-01-01 01:00. 11. Heating season
2021-05-01 02:00. 10. Cooling season
2021-12-01 03:00. 1. Heating season
2021-11-01 04:00. 5. Heating season
2021-07-01 05:00. 20. Cooling season
Upvotes: 1
Views: 699
Reputation: 5918
There are many ways,
df['datetime'] = pd.to_datetime(df.datetime)
With np.select with specific conditions for months
conditions = [
df.datetime.dt.month.isin(np.arange(5,10)),
(df.datetime.dt.month.isin(np.arange(1,5))) | (df.datetime.dt.month.isin(np.arange(10,13))),
]
choices = ['Cooling season','Heating season']
df['seasons'] = np.select(conditions, choices)
df
With np.select with a default
conditions = [
df.datetime.dt.month.isin(np.arange(5,10)),
]
choices = ['Cooling season']
df['seasons'] = np.select(conditions, choices, default='Heating season')
df
With np.where
df['seasons'] = np.where(df.datetime.dt.month.isin(np.arange(5,10)), 'Cooling season','Heating season')
df
Output
datetime column1 seasons
0 2021-01-01 01:00:00 11 Heating season
1 2021-05-01 02:00:00 10 Cooling season
2 2021-12-01 03:00:00 1 Heating season
3 2021-11-01 04:00:00 5 Heating season
4 2021-07-01 05:00:00 20 Cooling season
Upvotes: 3
Reputation: 120439
df["seasons"] = np.where((df["datetime"].dt.month >= 5)
& (df["datetime"].dt.month <= 9),
"Cooling season", "Heating season")
>>> df
datetime column1 seasons
0 2021-01-01 01:00:00 11 Heating season
1 2021-05-05 02:00:00 10 Cooling season
2 2021-12-12 03:00:00 1 Heating season
3 2021-11-11 04:00:00 5 Heating season
4 2021-07-07 05:00:00 20 Cooling season
Upvotes: 1