Reputation: 137
These are the attributes of my data set.
My aim is to compute the average apartment price per zip code in Paris (20 districts in total; the column name is "Zipcode"). Because the original data set didn't have an avg_zip_price_app column, I had to create it.
def get_avg_zip_appartment_price(df, zip):
    price = 0
    if np.where(df["Zipcode"] == zip):  # this row's zipcode
        price = 12811
    elif np.where(df["Zipcode"] == zip):
        price = 11623
    elif np.where(df["Zipcode"] == zip):
        price = 12345
    elif np.where(df["Zipcode"] == zip):
        price = 13197
    elif np.where(df["Zipcode"] == zip):
        price = 12335
    elif np.where(df["Zipcode"] == zip):
        price = 14420
    elif np.where(df["Zipcode"] == zip):
        price = 13899
    elif np.where(df["Zipcode"] == zip):
        price = 11673
    elif np.where(df["Zipcode"] == zip):
        price = 10932
    elif np.where(df["Zipcode"] == zip):
        price = 10301
    elif np.where(df["Zipcode"] == zip):
        price = 9244
    elif np.where(df["Zipcode"] == zip):
        price = 9146
    elif np.where(df["Zipcode"] == zip):
        price = 10032
    elif np.where(df["Zipcode"] == zip):
        price = 9951
    elif np.where(df["Zipcode"] == zip):
        price = 9350
    elif np.where(df["Zipcode"] == zip):
        price = 11079
    elif np.where(df["Zipcode"] == zip):
        price = 10687
    elif np.where(df["Zipcode"] == zip):
        price = 9664
    elif np.where(df["Zipcode"] == zip):
        price = 8385
    elif np.where(df["Zipcode"] == zip):
        price = 8744
    return price
conditions = [
    (df['Zipcode'] == 75001),
    (df['Zipcode'] == 75002),
    (df['Zipcode'] == 75003),
    (df['Zipcode'] == 75004),
    (df['Zipcode'] == 75005),
    (df['Zipcode'] == 75006),
    (df['Zipcode'] == 75007),
    (df['Zipcode'] == 75008),
    (df['Zipcode'] == 75009),
    (df['Zipcode'] == 75010),
    (df['Zipcode'] == 75011),
    (df['Zipcode'] == 75012),
    (df['Zipcode'] == 75013),
    (df['Zipcode'] == 75014),
    (df['Zipcode'] == 75015),
    (df['Zipcode'] == 75016),
    (df['Zipcode'] == 75017),
    (df['Zipcode'] == 75018),
    (df['Zipcode'] == 75019),
    (df['Zipcode'] == 75020)
]
choices = [
    get_avg_zip_appartment_price(user_df, 75001), get_avg_zip_appartment_price(user_df, 75002), get_avg_zip_appartment_price(user_df, 75003),
    get_avg_zip_appartment_price(user_df, 75004), get_avg_zip_appartment_price(user_df, 75005), get_avg_zip_appartment_price(user_df, 75006),
    get_avg_zip_appartment_price(user_df, 75007), get_avg_zip_appartment_price(user_df, 75008), get_avg_zip_appartment_price(user_df, 75009),
    get_avg_zip_appartment_price(user_df, 75010), get_avg_zip_appartment_price(user_df, 75011), get_avg_zip_appartment_price(user_df, 75012),
    get_avg_zip_appartment_price(user_df, 75013), get_avg_zip_appartment_price(user_df, 75014), get_avg_zip_appartment_price(user_df, 75015),
    get_avg_zip_appartment_price(user_df, 75016), get_avg_zip_appartment_price(user_df, 75017), get_avg_zip_appartment_price(user_df, 75018),
    get_avg_zip_appartment_price(user_df, 75019), get_avg_zip_appartment_price(user_df, 75020)
]
user_df['avg_zip_price_app'] = np.select(conditions, choices)
print(user_df.head())
But I always get the same value for every observation. Is it because the row condition in my get_avg_zip_appartment_price(df, zip) function is written incorrectly, so that every call tests the first branch, finds it true, and returns the same price for all rows? This is the result I get:
Upvotes: 0
Views: 78
Reputation: 242
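The mistake in your code:
np.where(df["Zipcode"] == zip) # returns a tuple of index arrays, not a boolean
A non-empty tuple is always truthy, so the first if branch runs on every call and get_avg_zip_appartment_price(df, zip) returns 12811 whatever zip is. Even zip = -1, which matches no record in df, gives 12811, because the tuple still holds one (empty) array. Every elif also repeats the same condition, so none of them could ever select a different price.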
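To see what the if np.where(...) test actually evaluates to, here is a small check. It is only a sketch: the three-row df is made-up sample data, not the asker's data set.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row frame standing in for the real data set
df = pd.DataFrame({"Zipcode": [75001, 75002, 75003]})

match = np.where(df["Zipcode"] == 75002)   # one row matches
no_match = np.where(df["Zipcode"] == -1)   # no row matches

print(match)     # a tuple holding one index array
print(no_match)  # a tuple holding one empty index array

# The tuple itself is never empty, so both tests are truthy
match_truthy = bool(match)
no_match_truthy = bool(no_match)
print(match_truthy, no_match_truthy)
```

With a single argument, np.where gives back positions rather than a yes/no answer, so it is not suited to an if test like this.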
You can use a dictionary of key-value pairs to map each zip code to its price.
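A minimal sketch of the dictionary approach, using Series.map to look up each row's zip code. I am assuming the 20 prices in the question are listed in district order (75001 to 75020), which the question does not state, and the small user_df here is made-up sample data.

```python
import pandas as pd

# Prices copied from the question; ASSUMED to be in order 75001 .. 75020
prices = [12811, 11623, 12345, 13197, 12335, 14420, 13899, 11673, 10932, 10301,
          9244, 9146, 10032, 9951, 9350, 11079, 10687, 9664, 8385, 8744]
avg_zip_price = {75001 + i: price for i, price in enumerate(prices)}

# Hypothetical sample frame standing in for the real user_df
user_df = pd.DataFrame({"Zipcode": [75001, 75020, 75006, 75001]})

# Look up the price for each row's zip code
user_df["avg_zip_price_app"] = user_df["Zipcode"].map(avg_zip_price)
print(user_df)
```

Zip codes missing from the dictionary come out as NaN, which makes unexpected values easy to spot. This also replaces both the long if/elif chain and the np.select call.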
Upvotes: 2