BPDESILVA
BPDESILVA

Reputation: 2198

Selecting Values from a For Loop Dataframe Python

How do I get the row of the below csv's highest price from the result retrieved in the for loop any references or solutions ?

import pandas as pd
import numpy as np
df = pd.read_csv('/Users/caleb/ic/US_FINAL.csv',names=['ProductName', 'Year', 'Production', 'Price'])
df.set_index("ProductName",inplace=True)
products=['FortuneCookie']
for product in products:
     print(df.loc[product])

Upvotes: 0

Views: 155

Answers (3)

ALollz
ALollz

Reputation: 59549

You were close. But df.loc[product] was not doing what you wanted it to. You want to find where df['ProductName'] == product, so the subsets are given by df.loc[df['ProductName']==product]. The rest just determines the row where the price is the max.

for product in products:
    print(df.loc[df.Price == df[df.ProductName == product]['Price'].max()])

Here's a full example if you want to play around with it. If you want to generate this for every Product in your dataframe, you may consider using df.ProductName.unique() instead of explicitly defining the list, since you won't have to specify every product individually.

import pandas as pd
df = pd.DataFrame({'ProductName': ['Sandwich', 'FortuneCookie', 'FortuneCookie', 'Sandwich'],
             'Year': [2010, 2010, 2011, 2009],
             'Production': [17, 18, 101, 17],
             'Price': [21.11, 17.82, 19.99, 20.13]})

products = df.ProductName.unique()
for product in products:
    print(df.loc[df.Price == df[df.ProductName == product]['Price'].max()])

   Price ProductName  Production  Year
0  21.11    Sandwich          17  2010
   Price    ProductName  Production  Year
2  19.99  FortuneCookie         101  2011

Upvotes: 2

YOLO
YOLO

Reputation: 21719

I think a simple groupby function can get you:

# using data from @ALLoLz answer
df = pd.DataFrame({'ProductName': ['Sandwich', 'FortuneCookie', 'FortuneCookie', 'Sandwich'],
             'Year': [2010, 2010, 2011, 2009],
             'Production': [17, 18, 101, 17],
             'Price': [21.11, 17.82, 19.99, 20.13]})

df.groupby('ProductName')['Price'].max()

# alternative
# df.groupby('ProductName').agg({'Price':'max'})

Upvotes: 1

Johny Mudly
Johny Mudly

Reputation: 324

I think the df.idxmax() function would your best bet

So for you it would be

for product in products:
     temp = df.loc[product]
     print(temp.loc[temp['Price'].idxmax()])

Upvotes: 1

Related Questions