Reputation: 1983
I have a dataframe df like this:
region | model | metrics |
---|---|---|
Tokyo | ARIMA | 0.1 |
Tokyo | FeedForward | 0.2 |
Tokyo | DeepAR | 0.3 |
Osaka | ARIMA | 0.5 |
Osaka | FeedForward | 0.2 |
Osaka | DeepAR | 0.1 |
I want to group this by region and return the minimum value of metrics in each group, as well as the model value at which metrics is minimal.
The expected result:
region | model | metrics |
---|---|---|
Tokyo | ARIMA | 0.1 |
Osaka | DeepAR | 0.1 |
I tried to do it like below, but I'm not sure how I can complete it:
df.groupby("region").agg({'metrics':'min', ####... })
Maybe use argmin? Any help will be appreciated. Thanks!
Upvotes: 3
Views: 3158
Reputation: 18306
You can find the index label of the minimum metric in each group and then use loc with those labels to select the rows from the original frame:
# not sorting to keep the original order of appearance of regions
min_inds = df.groupby("region", sort=False).metrics.idxmin()
result = df.loc[min_inds]
to get
>>> result
  region   model  metrics
0  Tokyo   ARIMA      0.1
5  Osaka  DeepAR      0.1
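For anyone who wants to run this end to end, here is a minimal, self-contained sketch. The construction of df is my reconstruction of the table in the question, and the reset_index call is an optional extra (an assumption about the desired output) that drops the original row labels 0 and 5:

```python
import pandas as pd

# Reconstruction of the frame from the question
df = pd.DataFrame({
    "region": ["Tokyo", "Tokyo", "Tokyo", "Osaka", "Osaka", "Osaka"],
    "model": ["ARIMA", "FeedForward", "DeepAR", "ARIMA", "FeedForward", "DeepAR"],
    "metrics": [0.1, 0.2, 0.3, 0.5, 0.2, 0.1],
})

# Row label of the smallest metric within each region;
# sort=False keeps regions in their order of first appearance
min_inds = df.groupby("region", sort=False)["metrics"].idxmin()

# Select those rows, then (optionally) renumber them from 0
result = df.loc[min_inds].reset_index(drop=True)
print(result)
```

Note that idxmin returns the first occurrence when a group has several rows tied for the minimum, so each region yields exactly one row.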
(may the forecasting be fun!)
Upvotes: 3
Reputation: 484
How about sorting by the value of metrics and then dropping duplicate regions, keeping only the row with the smallest value for each, like this:
df.sort_values("metrics").drop_duplicates(['region'], keep='first')
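As a rough, runnable sketch of the same idea: the frame below is rebuilt from the table in the question, and kind="stable" is an addition of mine (not in the one-liner above) so that rows with equal metrics keep their original relative order:

```python
import pandas as pd

# Reconstruction of the frame from the question
df = pd.DataFrame({
    "region": ["Tokyo", "Tokyo", "Tokyo", "Osaka", "Osaka", "Osaka"],
    "model": ["ARIMA", "FeedForward", "DeepAR", "ARIMA", "FeedForward", "DeepAR"],
    "metrics": [0.1, 0.2, 0.3, 0.5, 0.2, 0.1],
})

# Sort ascending so each region's smallest metric comes first;
# a stable sort makes the order of tied rows deterministic
ordered = df.sort_values("metrics", kind="stable")

# Keep only the first (smallest-metric) row seen for each region
result = ordered.drop_duplicates(["region"], keep="first")
print(result)
```

Unlike the groupby approach, the rows come back ordered by metrics rather than by the order in which regions appear, which may or may not matter for your use case.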
Upvotes: 2