Reputation: 818
I have these two data frames in Python and I'm trying to calculate the Manhattan distance and, later on, the Euclidean distance, but I'm stuck on the Manhattan distance and can't figure out what is going wrong.
Here is what I have tried so far:
import pandas as pd
ratings = pd.read_csv("toy_ratings.csv", sep=",")
person1 = ratings[ratings['Person'] == 1]['Rating']
person2 = ratings[ratings['Person'] == 2]['Rating']
ratings.head()
   Person  Movie  Rating
0       1     11     2.5
1       1     12     3.5
2       1     15     2.5
3       3     14     3.5
4       2     12     3.5
Here is the data inside person1 and person2:
print("*****person1*****")
print(person1)
*****person1*****
0 2.5
1 3.5
2 2.5
5 3.0
22 3.5
23 3.0
36 5.0
print("*****person2*****")
print(person2)
*****person2*****
4 3.5
6 3.0
8 1.5
9 5.0
11 3.0
24 3.5
This was the function that I have tried to build without any luck:
def ManhattanDist(person1, person2):
    distance = 0
    for rating in person1:
        if rating in person2:
            distance += abs(person1[rating] - person2[rating])
    return distance
The problem is that the function returns 0, which is not correct. When I debug, I can see that it never enters the inner block. How can I check that both persons have a value and then loop over those?
Upvotes: 0
Views: 6164
Reputation: 1978
I think the function should return the distance in any case: either the distance is still zero, as initialized, or it is something else. So the function should look like
def ManhattanDist(person1, person2):
    distance = 0
    for rating in person1:
        if rating in person2:
            distance += abs(person1[rating] - person2[rating])
    return distance
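One likely reason the inner block is never reached, sketched here on made-up data: iterating over a pandas Series yields its values (the ratings), while `in` applied to a Series tests its index labels (the row numbers). The values and index labels below are illustrative only, loosely modeled on the printed output in the question.

```python
import pandas as pd

# Illustrative data: values are ratings, index labels are original row numbers
person1 = pd.Series([2.5, 3.5, 2.5], index=[0, 1, 2])
person2 = pd.Series([3.5, 3.0, 1.5], index=[4, 6, 8])

# Iterating a Series yields the values (ratings), not the index labels
values_seen = list(person1)

# `in` on a Series checks the index labels, so a rating such as 3.5 is not found
label_hits = [bool(rating in person2) for rating in person1]

# Testing membership among the values needs .values (or Series.isin)
value_hits = [bool(rating in person2.values) for rating in person1]
```

So a loop like the one above compares ratings against row labels, and `person1[rating]` would also look up by label rather than by rating.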
I think the distance should be computed from two vectors of the same length (at least I cannot imagine anything else). If that is the case, you can do this without your function:
import numpy as np
p1 = np.array(person1)
p2 = np.array(person2)
#--- scalar product as similarity indicator
dist1 = np.dot(p1, p2)
#--- Euclidean distance
dist2 = np.linalg.norm(p1 - p2)
#--- Manhattan distance
dist3 = np.sum(np.abs(p1 - p2))
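The NumPy expressions above need `p1` and `p2` to have the same length and the same ordering. A minimal sketch of one way to get there, assuming the distance should be taken only over movies that both persons rated and that each person rates a movie at most once (both are assumptions; the question does not state them): index each person's ratings by `Movie` and keep the common movies. The small frame is made up for illustration, and the helper name `aligned_ratings` is hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up ratings in the same Person/Movie/Rating layout as the question
ratings = pd.DataFrame({
    "Person": [1, 1, 1, 2, 2, 2],
    "Movie":  [11, 12, 15, 12, 15, 16],
    "Rating": [2.5, 3.5, 2.5, 3.5, 1.5, 5.0],
})

def aligned_ratings(ratings, a, b):
    """Return both persons' ratings, restricted to movies they both rated."""
    ra = ratings[ratings["Person"] == a].set_index("Movie")["Rating"]
    rb = ratings[ratings["Person"] == b].set_index("Movie")["Rating"]
    common = ra.index.intersection(rb.index)  # movies rated by both
    return ra.loc[common].to_numpy(), rb.loc[common].to_numpy()

p1, p2 = aligned_ratings(ratings, 1, 2)

# Same formulas as above, now on vectors of equal length
manhattan = np.sum(np.abs(p1 - p2))
euclidean = np.linalg.norm(p1 - p2)
```

In this made-up frame the two persons share movies 12 and 15, so both vectors have length 2.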
Upvotes: 2
Reputation: 5279
Your function is returning a single value ... It should (I guess) return a list of values.
Upvotes: 0