Alex Borowiak
Alex Borowiak

Reputation: 73

Maximum Monthly Values whilst retaining the Data at which that values occured

I have daily rainfall data that looks like the following:

   Date           Rainfall (mm)
1922-01-01  0.0
1922-01-02  0.0
1922-01-03  0.0
1922-01-04  0.0
1922-01-05  31.5
1922-01-06  0.0
1922-01-07  0.0
1922-01-08  0.0
1922-01-09  0.0
1922-01-10  0.0
1922-01-11  0.0
1922-01-12  9.1
1922-01-13  6.4
.
.
.

I am trying to work out the maximum value for each month for each year, and also what date the maximum value occurred on. I have been using the code:

rain_data.groupby(pd.Grouper(freq = 'M'))['Rainfall (mm)'].max()

This is returning the correct maximum value but returns the end date of each month rather than the date that maximum event occurred on.

1974-11-30   0.0
1974-12-31   0.0
1975-01-31   0.0
1975-02-28   65.0
1975-03-31   129.5
1975-11-30   59.9
1975-12-31   7.1
1976-01-31   10.0
1976-11-30   0.0
1976-12-31   0.0
1977-01-31   4.3

Any suggestions on how I could get the correct date?

Upvotes: 0

Views: 143

Answers (1)

Amethyst
Amethyst

Reputation: 53

I'm new to this, but what I think you're doing in (pd.Grouper(freq = 'M')) is grouping all the values in each month, but it's assigning every value within a group to the same date. I think this is why your groupby isn't returning the dates you're looking for.

I think your question is answered here. Alexander suggests to use:

df.groupby(pd.TimeGrouper('M')).Close.agg({'max date': 'idxmax', 'max rainfall': np.max})

The agg works without the Close I think, so if it's problematic (as I found) you might want to take it out.

Upvotes: 1

Related Questions