some_programmer

Reputation: 3528

How to loop through dates column and assign values according to a certain condition?

I have a DataFrame df as follows:

dates               winter  summer  rest   Final
2020-01-01 00:15:00 65.5    71.5    73.0   NaN
2020-01-01 00:30:00 62.6    69.0    70.1   NaN
2020-01-01 00:45:00 59.6    66.3    67.1   NaN
2020-01-01 01:00:00 57.0    63.5    64.5   NaN
2020-01-01 01:15:00 54.8    60.9    62.3   NaN
2020-01-01 01:30:00 53.1    58.6    60.6   NaN
2020-01-01 01:45:00 51.7    56.6    59.2   NaN
2020-01-01 02:00:00 50.5    55.1    57.9   NaN
2020-01-01 02:15:00 49.4    54.2    56.7   NaN
2020-01-01 02:30:00 48.5    53.7    55.6   NaN
2020-01-01 02:45:00 47.9    53.4    54.7   NaN
2020-01-01 03:00:00 47.7    53.3    54.2   NaN
2020-01-01 03:15:00 47.9    53.1    54.1   NaN
2020-01-01 03:30:00 48.7    53.2    54.6   NaN
2020-01-01 03:45:00 50.2    54.1    55.8   NaN
2020-01-01 04:00:00 52.3    56.1    57.9   NaN
2020-04-28 12:30:00 225.1   200.0   209.8  NaN
2020-04-28 12:45:00 215.7   193.8   201.9  NaN
2020-04-28 13:00:00 205.6   186.9   193.4  NaN
2020-04-28 13:15:00 195.7   179.9   185.0  NaN
2020-04-28 13:30:00 186.7   173.4   177.4  NaN
2020-04-28 13:45:00 179.2   168.1   170.9  NaN
2020-04-28 14:00:00 173.8   164.4   166.3  NaN
2020-04-28 14:15:00 171.0   163.0   163.9  NaN
2020-04-28 14:30:00 170.7   163.5   163.6  NaN
2020-12-31 21:15:00 88.5    90.2    89.2   NaN
2020-12-31 21:30:00 85.2    88.5    87.2   NaN
2020-12-31 21:45:00 82.1    86.3    85.0   NaN
2020-12-31 22:00:00 79.4    84.1    83.2   NaN
2020-12-31 22:15:00 77.6    82.4    82.1   NaN
2020-12-31 22:30:00 76.4    81.2    81.7   NaN
2020-12-31 22:45:00 75.6    80.3    81.6   NaN
2020-12-31 23:00:00 74.7    79.4    81.3   NaN
2020-12-31 23:15:00 73.7    78.4    80.6   NaN
2020-12-31 23:30:00 72.3    77.2    79.5   NaN
2020-12-31 23:45:00 70.5    75.7    77.9   NaN
2021-01-01 00:00:00 68.2    73.8    75.7   NaN

The dates column runs from 2020-01-01 00:15:00 to 2021-01-01 00:00:00 in 15-minute steps.

I also have the following date range conditions (given as DD.MM):

Winter: 01.11 - 20.03

Summer: 15.05 - 14.09

Rest: 21.03 - 14.05 & 15.09 - 31.10

I want to create a new column named season that checks every date in the dates column and assigns winter if the date falls in the Winter range, summer if it falls in the Summer range, and rest if it falls in either Rest range.

Then the Final column must be filled based on the season column: if season is 'winter', take the value from the winter column; if it is 'summer', take the value from the summer column; otherwise take the value from the rest column.

How can this be done?

Upvotes: 3

Views: 250

Answers (1)

jezrael

Reputation: 862441

The idea is to normalize the datetimes to the same year, then build boolean masks with Series.between and set the new column with numpy.select:

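import numpy as np
import pandas as pd

# map every timestamp onto the same (leap) year, dropping the time of day
d = pd.to_datetime(df['dates'].dt.strftime('%m-%d-2020'), format='%m-%d-%Y')

# winter wraps around New Year, so it needs two ranges
m1 = d.between('2020-11-01','2020-12-31') | d.between('2020-01-01','2020-03-20')
m2 = d.between('2020-05-15','2020-09-14')

df['Final'] = np.select([m1, m2], ['Winter','Summer'], default='Rest')
print(df)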
                dates  winter  summer  rest   Final
0 2020-01-01 00:15:00    65.5    71.5  73.0  Winter
1 2020-06-15 00:30:00    62.6    69.0  70.1  Summer
2 2020-12-25 00:45:00    59.6    66.3  67.1  Winter
3 2020-10-10 01:00:00    57.0    63.5  64.5    Rest
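
The snippet above writes the season label into Final. The question also asks for a separate season column, with Final holding the number taken from the matching winter/summer/rest column. A minimal sketch of that second step follows. It is not the answerer's code: the sample rows are hypothetical, and instead of the string normalization it uses an integer month*100 + day key (an assumption made for brevity), which should produce the same masks.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with the same columns as the question.
df = pd.DataFrame({
    "dates": pd.to_datetime([
        "2020-01-01 00:15:00",
        "2020-04-28 12:30:00",
        "2020-06-15 00:30:00",
        "2020-10-10 01:00:00",
        "2020-12-31 23:45:00",
        "2021-01-01 00:00:00",
    ]),
    "winter": [65.5, 225.1, 62.6, 57.0, 70.5, 68.2],
    "summer": [71.5, 200.0, 69.0, 63.5, 75.7, 73.8],
    "rest":   [73.0, 209.8, 70.1, 64.5, 77.9, 75.7],
})

# Encode month and day as one integer (MMDD) so the ranges ignore the year.
md = df["dates"].dt.month * 100 + df["dates"].dt.day

is_winter = (md >= 1101) | (md <= 320)   # 01.11 - 20.03, wraps over New Year
is_summer = md.between(515, 914)         # 15.05 - 14.09 (inclusive on both ends)

# Step 1: label each row; anything that is neither winter nor summer is rest.
df["season"] = np.select([is_winter, is_summer], ["winter", "summer"], default="rest")

# Step 2: take the value from the column that matches the season label.
df["Final"] = np.select(
    [df["season"].eq("winter"), df["season"].eq("summer")],
    [df["winter"], df["summer"]],
    default=df["rest"],
)

print(df)
```

Both steps are vectorized, so no explicit loop over the dates column is needed even with a full year of 15-minute rows.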

Upvotes: 3
