Reputation: 33
I am writing code to calculate the shortest distance between two sets of points. I have one CSV with a set of locations as coordinates, and a second CSV with another set of locations as coordinates. For example, list A could contain (50, -10), (60, 70), (40, -19) and list B could contain (40, 87), (60, 90), (23, 20). Everything I have found online calculates the distance between a list and a single point, which won't work for me.
So far I am able to calculate the distance between all the points (so between A1 and B1, A1 and B2, A1 and B3, A2 and B1, etc). That's fine, but what I want is the minimum distance from point 1 in list A to ANY point in list B. Essentially, what position in list B is closest to each point in list A?
I'm trying to find a way to run it so it checks A1 against B1, B2, B3 etc, and then comes back with the shortest distance being x miles between A1 and B3, for example.
What I have so far is below:
import pandas as pd
import geopy.distance

df = pd.read_csv('AirportCoords.csv')
df2 = pd.read_csv('HotelCoords.csv')

for i, row in df2.iterrows():
    coordinate = row.lat, row.long

for i, row in df.iterrows():
    coordinate2 = row.latitude, row.longitude
    distance = geopy.distance.geodesic(coordinate, coordinate2).km
    print(distance)
Upvotes: 1
Views: 1755
Reputation: 8405
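You're describing a comparison of every element of A against every element of B, which calls for a nested loop. Your example code instead has two loops in sequence, so the first loop runs to completion (leaving only its last coordinate) before the second one starts. Nest the loop over B inside the loop over A and take the minimum for each A: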
import pandas as pd
import geopy.distance

df = pd.read_csv('AirportCoords.csv')
df2 = pd.read_csv('HotelCoords.csv')

for i, row in df.iterrows():  # A
    a = row.latitude, row.longitude
    distances = []
    for j, row2 in df2.iterrows():  # B
        b = row2.lat, row2.long
        distances.append(geopy.distance.geodesic(a, b).km)
    # Smallest distance from this A to any B, and the position of that B
    min_distance = min(distances)
    min_index = distances.index(min_distance)
    print("A", i, "is closest to B", min_index, min_distance, "km")
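To try the same nested-minimum idea without the CSV files, here is a minimal sketch that runs on the example coordinates from the question. It is an illustration under stated assumptions, not a replacement for the code above: the helper names haversine_km and nearest are made up for this example, and the haversine formula (a spherical-Earth approximation) stands in for geopy's geodesic so that the snippet only needs the standard library. The distances will therefore differ slightly from geopy's ellipsoidal results.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees.

    Assumption: stands in for geopy.distance.geodesic to avoid a dependency.
    """
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest(points_a, points_b, dist=haversine_km):
    """For each point in points_a, return (index in points_b, distance)
    of the closest point in points_b. Hypothetical helper for illustration."""
    result = []
    for a in points_a:
        # Inner loop over B: pair each B index with its distance to a,
        # then keep the pair with the smallest distance.
        best = min(((j, dist(a, b)) for j, b in enumerate(points_b)),
                   key=lambda pair: pair[1])
        result.append(best)
    return result


# Example coordinates taken from the question
list_a = [(50, -10), (60, 70), (40, -19)]
list_b = [(40, 87), (60, 90), (23, 20)]

for i, (j, d) in enumerate(nearest(list_a, list_b)):
    print(f"A{i} is closest to B{j}: {d:.0f} km")
```

Passing dist=lambda a, b: geopy.distance.geodesic(a, b).km to nearest should give the same structure with geopy's distances. Note that this still does len(A) × len(B) distance calculations, which is fine for small files but grows quickly for large ones.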
Upvotes: 4