Reputation: 39
I have a df that is similar to this:
import pandas as pd
import as np
df = pd.DataFrame({'district':['A','B','C','D','B','B','A'], 'price': [10.0,20.0,50.0,27.0,21.0,19.0,12.0]})
I am trying to develop a function to see which district offers the most accommodation opportunities, given a budget constraint.
If I passed the following function:
def district_recommender(df, min_budget = 17.0, max_budget = 30.0)
Once called, the output should be:
"For your specified budget {} - {}, the B District has 3 opportunities within your criteria.format(min_budget, max_budget)"
I have tried many different attempts but to not much success. This is what i've come up with.
def district_recommender(df, min_budget, max_budget):
columns = df[['district', 'price']].copy()
for i in columns.iloc[:, 0]:
if columns.iloc[:, 1] >= min_budget and columns.iloc[:, 1] <= max_budget:
return columns.iloc[:, 0].count().sum()
I think that i'm having a issue with counting the boolean values from the if-condition, as the error message i'm receiving is:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Any suggestions would be much appreciated! Thanks
Upvotes: 0
Views: 1038
Reputation: 54148
You may iterate on the different district
values, and for each check if all prices are in the range, using Series.between
def district_recommender(df, min_budget=17.0, max_budget=30.0):
for district in df['district'].unique():
values = df[df['district'] == district]['price']
if values.between(min_budget, max_budget).all():
return f"For your specified budget [{min_budget};{max_budget}], the {district} " \
f"District has {len(values)} opportunities"
Upvotes: 1
Reputation: 14847
One way to get at this would be to do the filtering for price and then just take the value_counts
of the resulting set, which will give you counts for every district.
In [350]: df[df['price'].between(17, 30)]['district'].value_counts()
Out[350]:
B 3
D 1
Name: district, dtype: int64
which you could then use to get your narrative output:
In [354]: vc = df[df['price'].between(17, 30)]['district'].value_counts()
In [355]: print(f"District {vc.idxmax()} has {vc.max()} options in your range.")
District B has 3 options in your range.
Upvotes: 2