Reputation: 1
I'm building a restaurant recommendation system in Python using user preferences (cuisine, price range) and cosine similarity. I'm encountering a ValueError: User preference vector and restaurant feature vectors must have the same number of dimensions. The error occurs in two scenarios:
Scenario 1:
I'm using one-hot encoding for cuisine types in the restaurant data. When building the user preference vector, the get_user_preferences function considers both cuisine and price range (if provided). The error occurs when the user selects "all" cuisines (one-hot encoded vector with all features as 1).
Scenario 2:
The error occurs even when the user selects a specific cuisine (one-hot encoded vector with a single feature as 1). I suspect there might be a mismatch in how cuisines are represented in the user preference vector and the restaurant data (e.g., single cuisine vs. list of cuisines).
def get_user_preferences(cuisine=None, price_range=None):
    user_pref = [0] * len(encoder.get_feature_names_out())
    if cuisine:
        cuisine_index = encoder.get_feature_names_out().tolist().index(cuisine)
        user_pref[cuisine_index] = 1
    if price_range:
        user_pref.append(price_range)  # Assuming price range is a numerical value
    return user_pref
def get_user_preferences(cuisine=None, price_range=None):
    """
    Returns:
        List: User preference vector with one-hot encoded cuisine and price range.
    """
    # Initialize an empty preference vector with an equal preference for all cuisines
    num_cuisines = len(encoder.get_feature_names_out())
    user_pref = [1 / num_cuisines] * num_cuisines
    # Set cuisine preference (one-hot encoding)
    if cuisine:
        cuisine_index = encoder.get_feature_names_out().tolist().index(cuisine)
        user_pref[cuisine_index] = 1
    # Set price range
    if price_range:
        user_pref.append(price_range)  # Assuming price range is a numerical value
    return user_pref
#3
def cosine_similarity(user_pref, restaurant_vec):
    # Extract cuisine vectors from user preference and restaurant vectors
    user_pref_cuisine = user_pref[:-1] if len(user_pref) > 1 else user_pref  # Exclude price range if available
    restaurant_vec_cuisine = restaurant_vec[:-1] if len(restaurant_vec) > 1 else restaurant_vec
    if len(user_pref_cuisine) != len(restaurant_vec_cuisine):
        raise ValueError("User preference cuisine vector and restaurant cuisine vector must have the same number of dimensions.")
    # Calculate cosine similarity
    return np.dot(user_pref_cuisine, restaurant_vec_cuisine) / (np.linalg.norm(user_pref_cuisine) * np.linalg.norm(restaurant_vec_cuisine))
def get_user_preferences_from_user():
    """
    This function prompts the user to choose their preferred cuisine and price range.

    Returns:
        List: User preference vector with one-hot encoded cuisine and price range.
    """
    # Get user input for cuisine
    available_cuisines = list(df['Cuisines'].unique())  # Assuming unique cuisines are in a column
    while True:
        cuisine = input("Enter your preferred cuisine (or 'all' for any): ").lower().strip()
        if cuisine in available_cuisines or cuisine == 'all':
            break
        else:
            print(f"Invalid cuisine. Available options are: {', '.join(available_cuisines)}")
    # Get user input for the price range
    price_ranges = {
        "1": "Cheap",
        "2": "Moderate",
        "3": "Fine Dining"
    }
    while True:
        price_range = input("Enter your preferred price range (1 - Cheap, 2 - Moderate, 3 - Fine Dining): ")
        if price_range in price_ranges:
            break
        else:
            print(f"Invalid price range. Please enter 1, 2, or 3.")
    # Encode cuisine (if not 'all')
    if cuisine != 'all':
        cuisine_index = encoder.get_feature_names_out().tolist().index(cuisine)
        user_pref = [0] * len(encoder.get_feature_names_out())
        user_pref[cuisine_index] = 1
    else:
        user_pref = [1] * len(encoder.get_feature_names_out())  # One-hot encode for all cuisines
    # Add price range if specified
    if price_range:
        user_pref.append(int(price_range))
    # Ensure that the user preference vector has the same number of features as the restaurant feature vectors
    if len(user_pref) != len(df.drop(columns=['Restaurant Name']).values[0]):
        raise ValueError("User preference vector and restaurant feature vectors must have the same number of dimensions.")
    return user_pref
def recommend_restaurants(user_pref, df=df, top_n=5):
    """
    This function recommends restaurants based on user preferences and similarity scores.

    Args:
        user_pref (list): User preference vector.
        df (pandas.DataFrame, optional): Preprocessed restaurant data. Defaults to df.
        top_n (int, optional): Number of top recommendations to return. Defaults to 5.

    Returns:
        pandas.DataFrame: Top N recommended restaurants with details.
    """
    # Construct restaurant feature vectors including encoded cuisines
    restaurant_feature_vectors = df.drop(columns=['Restaurant Name']).values
    # Check if the user preference vector and restaurant feature vectors have the same number of dimensions
    if len(user_pref) != len(restaurant_feature_vectors[0]):
        raise ValueError("User preference vector and restaurant feature vectors must have the same number of dimensions.")
    # Calculate similarity scores for each restaurant
    similarities = np.array([cosine_similarity(user_pref, vec) for vec in restaurant_feature_vectors])
    # Sort restaurants by similarity score (descending)
    df_sorted = df.assign(similarity=similarities).sort_values(by='similarity', ascending=False)
    # Return top N recommendations
    return df_sorted.head(top_n).drop('similarity', axis=1)
user_pref = get_user_preferences_from_user() # Replace with your function
recommendations = recommend_restaurants(user_pref, df=df) # Use your data in df
print("Top Restaurant Recommendations:")
print(recommendations)
ValueError                                Traceback (most recent call last)
in <cell line: 2>()
      1 # Assuming you have a function to get user input for cuisine and price range
----> 2 user_pref = get_user_preferences_from_user()  # Replace with your function
      3
      4 # Call the recommend_restaurants function with your data and user preferences
      5 recommendations = recommend_restaurants(user_pref, df=df)  # Use your data in df

in get_user_preferences_from_user()
     42     # Ensure that the user preference vector has the same number of features as the restaurant feature vectors
     43     if len(user_pref) != len(df.drop(columns=['Restaurant Name']).values[0]):
---> 44         raise ValueError("User preference vector and restaurant feature vectors must have the same number of dimensions.")
     45
     46     return user_pref
ValueError: User preference vector and restaurant feature vectors must have the same number of dimensions.
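One way to narrow down a length mismatch like this is to compare the feature names behind the user vector with the restaurant feature columns, rather than only their counts. The sketch below is a rough illustration with made-up names: `encoder_features` stands in for `encoder.get_feature_names_out()` and `restaurant_columns` for `df.drop(columns=['Restaurant Name']).columns`. The real column names and counts in this data are not known, and `describe_mismatch` is a hypothetical helper.

```python
def describe_mismatch(user_features, restaurant_features):
    """Return the features that appear on only one side, keeping their order."""
    user_set = set(user_features)
    restaurant_set = set(restaurant_features)
    only_user = [f for f in user_features if f not in restaurant_set]
    only_restaurant = [f for f in restaurant_features if f not in user_set]
    return only_user, only_restaurant


# Hypothetical names, chosen only to illustrate the check.
encoder_features = ["Cuisines_italian", "Cuisines_mexican", "Cuisines_thai"]
user_features = encoder_features + ["Price range"]
restaurant_columns = [
    "Cuisines", "Price range", "Rating",
    "Cuisines_italian", "Cuisines_mexican", "Cuisines_thai",
]

only_user, only_restaurant = describe_mismatch(user_features, restaurant_columns)
print(len(user_features), len(restaurant_columns))
print("Only in user vector:", only_user)
print("Only in restaurant data:", only_restaurant)
```

In this made-up case the restaurant side carries two extra columns (the raw cuisine text and a rating), which would be enough to trigger the length check regardless of which cuisine the user picks.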
Upvotes: 0
Views: 27
Reputation: 1
This post addresses potential causes and fixes for issues that might arise in the provided cosine similarity function:
Problem: The user_pref or restaurant_vec might contain strings or non-numerical data types.
Fix: Convert these values to floats using a list comprehension:
user_pref = [float(x) for x in user_pref]
Problem: np.linalg.norm returns 0 for an all-zero (or empty) vector, so the division yields nan with a RuntimeWarning instead of a usable score.
Fix: Return 0 early when either vector has no non-zero entries.
Improved Code Snippet:
import numpy as np

def cosine_similarity(user_pref, restaurant_vec):
    # Convert first so that numeric strings such as "1" become floats
    user_pref = np.array(user_pref, dtype=float)
    restaurant_vec = np.array(restaurant_vec, dtype=float)
    # Avoid division by zero when either vector is all zeros
    if not user_pref.any() or not restaurant_vec.any():
        return 0.0
    return np.dot(user_pref, restaurant_vec) / (np.linalg.norm(user_pref) * np.linalg.norm(restaurant_vec))
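
For the dimension error in the question itself, one possible approach is to build the user vector from the same list of feature columns that the restaurant matrix uses, so the two lengths agree by construction. This is a minimal sketch under assumptions: `build_user_vector` is a hypothetical helper, and the `Cuisines_` prefix and `Price range` column name are guesses at the naming convention, not taken from the original data.

```python
def build_user_vector(feature_columns, cuisine=None, price_range=None,
                      cuisine_prefix="Cuisines_", price_column="Price range"):
    """Build a preference vector with one entry per restaurant feature column.

    cuisine_prefix and price_column are assumed naming conventions.
    """
    vector = []
    for column in feature_columns:
        if column == price_column:
            # No stated price preference is encoded as 0
            vector.append(float(price_range) if price_range is not None else 0.0)
        elif column.startswith(cuisine_prefix):
            name = column[len(cuisine_prefix):]
            # 'all' (or no choice) marks every cuisine as acceptable
            wanted = cuisine in (None, "all") or cuisine == name
            vector.append(1.0 if wanted else 0.0)
        else:
            # Columns the user expressed no preference about
            vector.append(0.0)
    return vector


# Hypothetical feature columns, in the same order as the restaurant matrix.
feature_columns = ["Cuisines_italian", "Cuisines_mexican", "Cuisines_thai", "Price range"]

specific = build_user_vector(feature_columns, cuisine="mexican", price_range=2)
any_cuisine = build_user_vector(feature_columns, cuisine="all", price_range=1)
print(specific)     # [0.0, 1.0, 0.0, 2.0]
print(any_cuisine)  # [1.0, 1.0, 1.0, 1.0]
```

With the question's data, `feature_columns` would presumably be `list(df.drop(columns=['Restaurant Name']).columns)`. Any non-numeric column left in that frame would still need to be dropped or encoded before the similarity is computed.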
Upvotes: 0