DeepMeh
DeepMeh

Reputation: 23

How do I get the maximum and minimum values of a column depending on another two columns in pandas dataframe?

This is my first time asking a question. I have a dataframe that looks like below:

import pandas as pd
data = [['AK', 'Co',2957],
    ['AK', 'Ot', 15],
    ['AK','Petr', 86848],
    ['AL', 'Co',167],
    ['AL', 'Ot', 10592],
    ['AL',  'Petr',1667]]
my_df = pd.DataFrame(data, columns = ['State', 'Energy', 'Elec']) 
print(my_df)

I need to find the maximum and minimum values of the third column based on the first two columns. I did browse through a few stackoverflow questions but couldn't find the right way to solve this. My output should look like below:

data = [['AK','Ot', 15],
['AK','Petr',86848],
['AL','Co',167],
['AL','Ot', 10592]]

my_df = pd.DataFrame(data, columns = ['State', 'Energy', 'Elec']) 
print(my_df)

Note: Please let me know where I am lagging before leaving a negative marking on the question

This link helped me: Python pandas dataframe: find max for each unique values of an another column

Upvotes: 0

Views: 155

Answers (1)

Umar.H
Umar.H

Reputation: 23099

try idxmin and idxmax with .loc filter.

    new_df = my_df.loc[
        my_df.groupby(["State"])
        .agg(ElecMin=("Elec", "idxmin"), ElecMax=("Elec", "idxmax"))
        .stack()
    ]
)

print(new_df)

  State Energy   Elec
0    AK     Ot     15
1    AK   Petr  86848
2    AL     Co    167
3    AL     Ot  10592

Upvotes: 1

Related Questions