G.alshammari

Reputation: 23

single positional indexer is out-of-bounds

with open('similarities/EuclideanSimilarity.csv', 'w') as result_file:
    print('user1,user2,similarity', file=result_file)
    print('Calculating similarities between users...')
    for u1 in tqdm(users, total=len(users)):
        for u2 in users:
            ratings1 = np.nan_to_num(np.array(user_ratings_matrix.iloc[u1 - 1].values))
            ratings2 = np.nan_to_num(np.array(user_ratings_matrix.iloc[u2 - 1].values))
            sim = 1 / (1 + distance.euclidean(ratings1, ratings2))
            print(f"{u1},{u2},{sim}", file=result_file)

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1371
   1372             maybe_callable = com._apply_if_callable(key, self.obj)
-> 1373             return self._getitem_axis(maybe_callable, axis=axis)
   1374
   1375     def _is_scalar_access(self, key):

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1828
   1829             # validate the location
-> 1830             self._is_valid_integer(key, axis)
   1831
   1832             return self._get_loc(key, axis=axis)

~/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py in _is_valid_integer(self, key, axis)
   1711         l = len(ax)
   1712         if key >= l or key < -l:
-> 1713             raise IndexError("single positional indexer is out-of-bounds")
   1714         return True
   1715

IndexError: single positional indexer is out-of-bounds

Upvotes: 1

Views: 1648

Answers (1)

tel

Reputation: 13999

You haven't given enough information about the type or contents of users and user_ratings_matrix for a reliable answer. Assuming that users is a list of user IDs and that user_ratings_matrix is a standard pandas DataFrame whose rows are in the same order as users, you can rewrite your for loops as follows:

for u1,row1 in tqdm(zip(users, user_ratings_matrix.itertuples(index=False, name=None)), total=len(users)):
    for u2,row2 in zip(users, user_ratings_matrix.itertuples(index=False, name=None)):
        ratings1 = np.nan_to_num(np.array(row1))
        ratings2 = np.nan_to_num(np.array(row2))
        sim = 1 / (1 + distance.euclidean(ratings1, ratings2))
        print(f"{u1},{u2},{sim}", file=result_file)

Explanation

user_ratings_matrix.itertuples(index=False, name=None) will iterate over the rows in your dataframe and return each as a tuple.
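To make that concrete, here is a minimal sketch on a made-up 3x3 ratings matrix. The data and column names are hypothetical and not taken from the question. It only shows that each row comes back as a plain tuple, which np.array and np.nan_to_num can then turn into a numeric vector with missing ratings set to 0.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings matrix: one row per user, NaN marks a missing rating.
user_ratings_matrix = pd.DataFrame(
    [[5.0, np.nan, 1.0],
     [4.0, 2.0, np.nan],
     [np.nan, 3.0, 3.0]],
    columns=["item_a", "item_b", "item_c"],
)

# index=False drops the index from each tuple; name=None yields plain tuples
# instead of namedtuples.
rows = list(user_ratings_matrix.itertuples(index=False, name=None))

first = rows[0]
# Convert the tuple to an array and replace NaN with 0, as in the loop above.
ratings = np.nan_to_num(np.array(first))
print(type(first).__name__, len(rows))
print(ratings)
```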

zip(users, user_ratings_matrix.itertuples(index=False, name=None)) will iterate over the pairs of (userID, tuple(dataframe_row)).
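Here is a small end-to-end sketch of that pairing. It assumes, purely for illustration, that the user IDs are not the contiguous range 1..N; the question doesn't confirm this, but it is one plausible way for iloc[u - 1] to go out of bounds. The data is made up, and np.linalg.norm stands in for distance.euclidean (the two agree for 1-D vectors) so the snippet only needs NumPy and pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three users with non-contiguous IDs, in row order.
users = [1, 2, 7]
user_ratings_matrix = pd.DataFrame(
    [[5.0, np.nan],
     [4.0, 2.0],
     [np.nan, 3.0]]
)

def iter_rows(df):
    # Fresh iterator of plain tuples, one per DataFrame row.
    return df.itertuples(index=False, name=None)

similarities = {}
for u1, row1 in zip(users, iter_rows(user_ratings_matrix)):
    ratings1 = np.nan_to_num(np.array(row1))
    for u2, row2 in zip(users, iter_rows(user_ratings_matrix)):
        ratings2 = np.nan_to_num(np.array(row2))
        # Euclidean distance via NumPy; same value as scipy's distance.euclidean here.
        dist = np.linalg.norm(ratings1 - ratings2)
        similarities[(u1, u2)] = 1 / (1 + dist)

# By contrast, deriving a row position from the ID fails for user 7:
# position 6 does not exist in a 3-row DataFrame.
try:
    user_ratings_matrix.iloc[7 - 1]
    raised = False
except IndexError:
    raised = True

print(len(similarities), raised)
```

Because each ID is paired with its row by position in zip, the loop never has to compute a row position from the ID itself. This only holds if users and the DataFrame rows really are in the same order.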

Also, before the next time you post a question on SO, you should probably read these guidelines about how to produce an example that other people can run/work with. It'll help you to get better answers on this site.

Upvotes: 0
