Reputation: 109
I get this error:
File "C:\Users\Morakinyo\.vscode\extensions\ms-python.python-2018.3.1\pythonFiles\experimental\ptvsd\ptvsd\pydevd\_pydev_imps\_pydev_execfile.py", line 25, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "c:\Users\Morakinyo\Documents\recommend\Movie-Recommender-master\movie-reco.py", line 26, in <module>
train_data_matrix[line[0]-1, line[1]-1] = line[2]
TypeError: unsupported operand type(s) for -: 'str' and 'int'
when executing this code:
import numpy as np
import pandas as pd
header = ['user_id', 'item_id', 'rating', 'timestamp']
df = pd.read_csv('/Users/Morakinyo/Documents/recommend/Movie-Recommender-master/u.data', sep='\t', names=header)
n_users = df.user_id.unique().shape[0]
n_items = df.item_id.unique().shape[0]
print ('Number of users = ' + str(n_users) + ' | Number of movies = ' + str(n_items) )
from sklearn import model_selection as cv
train_data, test_data = cv.train_test_split(df, test_size=0.25)
train_data_matrix = np.zeros((n_users, n_items))
for line in train_data:
    train_data_matrix[line[0]-1, line[1]-1] = line[2]
test_data_matrix = np.zeros((n_users, n_items))
for line in test_data:
    test_data_matrix[line[0]-1, line[1]-1] = line[2]
from sklearn.metrics.pairwise import pairwise_distances
user_similarity = pairwise_distances(train_data_matrix, metric='cosine')
def predict(ratings, similarity, type='user'):
    if type == 'user':
        mean_user_rating = ratings.mean(axis=1)
        ratings_diff = (ratings - mean_user_rating[:, np.newaxis])
        pred = mean_user_rating[:, np.newaxis] + similarity.dot(ratings_diff) / np.array([np.abs(similarity).sum(axis=1)]).T
    return pred
user_prediction = predict(train_data_matrix, user_similarity, type='user')
from sklearn.metrics import mean_squared_error
from math import sqrt
def rmse(prediction, ground_truth):
    prediction = prediction[ground_truth.nonzero()].flatten()
    ground_truth = ground_truth[ground_truth.nonzero()].flatten()
    return sqrt(mean_squared_error(prediction, ground_truth))
print ('User-based CF RMSE: ' + str(rmse(user_prediction, test_data_matrix)))
I cannot figure out where the problem is.
Upvotes: 2
Views: 172
Reputation: 324
You are trying to subtract an integer from a string. The error message should include the actual line on which the error occurs. If the string really represents an integer, you can convert it with int(), for example int("2").
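To illustrate, here is a minimal sketch of the same TypeError and the int() fix. The record below is made up for the example (a tab-separated line read as text); it is not taken from your file.

```python
# Hypothetical raw record: splitting text always gives strings.
line = "196\t242\t3".split("\t")

# Subtracting an int from a str raises the TypeError from the question.
try:
    line[0] - 1
except TypeError as exc:
    message = str(exc)
    print(message)

# Converting first makes the arithmetic valid.
user_index = int(line[0]) - 1
item_index = int(line[1]) - 1
print(user_index, item_index)
```

int() raises a ValueError if the string does not represent an integer, so this only helps when the values really are numeric.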
Always try to include the full error message in a question. This makes it easier to point to the error directly.
Edit: since you have now provided the full output, this is the failing line:
train_data_matrix[line[0]-1, line[1]-1] = line[2]
If the elements of line are strings that represent integers, you can use:
train_data_matrix[int(line[0])-1, int(line[1])-1] = line[2]
Another option would be to convert all the entries in the training data before working with it.
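A rough sketch of that second option, assuming train_data is a pandas DataFrame (which is what train_test_split returns when it is given one). The small frame below is a made-up stand-in for u.data with the values stored as strings. One thing worth checking in your code: a plain for loop over a DataFrame yields the column labels, not the rows, so line may be a string such as 'user_id'. In that case int(line[0]) would fail as well, and iterating with itertuples() is one way to get the rows instead.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for u.data, with every value stored as a string.
df = pd.DataFrame({
    'user_id': ['1', '2', '2', '3'],
    'item_id': ['1', '1', '3', '2'],
    'rating': ['5', '3', '4', '1'],
})

# Iterating a DataFrame directly gives its column labels (strings).
labels = list(df)
print(labels)

# Convert the columns once, before building the matrix.
df = df.astype({'user_id': int, 'item_id': int, 'rating': float})

n_users = df['user_id'].nunique()
n_items = df['item_id'].nunique()

# itertuples() yields one named tuple per row.
matrix = np.zeros((n_users, n_items))
for row in df.itertuples(index=False):
    matrix[row.user_id - 1, row.item_id - 1] = row.rating

print(matrix)
```

The same loop would apply to train_data and test_data. The - 1 indexing still assumes the IDs run from 1 to n without gaps, as in your original code.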
Upvotes: 1