Reputation: 67
I am new to Python, so I don't really know how to manipulate arrays. I have a large JSON file that contains geographic locations. An example:
{"items":[{"language":"en","created":"2013-12-17T09:31:31.000+01:00","geometry":{"type":"Point","coordinates":[9.2056232,45.4825264]}
I also have another file that contains coordinates, e.g.:
4c29e1c197d00f47a60442ea,Area51Lab Srl,4bf58dd8d48988d124941735,45.44826958,9.144208431
I want to calculate the shortest distance between coordinates in file 1 and coordinates in file 2 to generate a final file with the shortest distances.
Upvotes: 1
Views: 1475
Reputation: 7275
import pandas as pd
from vincenty import vincenty

df1 = pd.read_json("data.json")
df2 = pd.read_csv("data.csv")

# Compute the distance for every pair of points from the two files.
results = []
for i1, d1 in df1.iterrows():
    for i2, d2 in df2.iterrows():
        results.append({
            "index1": i1,
            "index2": i2,
            # vincenty expects (latitude, longitude); GeoJSON stores [longitude, latitude].
            # You will need to adapt the field access to your data.
            "distance": vincenty((d1.coordinates[1], d1.coordinates[0]),
                                 (d2.latitude, d2.longitude)),
        })

results = pd.DataFrame(results)
# Keep the closest file-2 point for each file-1 point.
results = results.loc[results.groupby("index1")["distance"].idxmin()]
results.to_csv("results.csv")
# or
results.to_json("results.json")
Vincenty's formula models the Earth as an ellipsoid, whereas the haversine (great-circle) formula treats it as a sphere, so Vincenty is generally more accurate.
If you don't have Pandas you should consider installing Anaconda. It's a Python distro for scientific computing and is all around pretty great – especially on Windows.
Upvotes: 2
Reputation: 353
First you have to extract the latitude and longitude from your files. For the JSON file, for example, see the json module: https://docs.python.org/2/library/json.html
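As a rough sketch of that extraction step, something like the following could work with only the standard library. The sample strings are trimmed stand-ins modelled on the question (a real script would read the files instead), the function names are made up for illustration, and the CSV column positions (name in column 2, latitude and longitude in columns 4 and 5) are an assumption based on the one sample row. Note that GeoJSON points store longitude first.

```python
import csv
import io
import json

# Trimmed sample inputs modelled on the question; real code would open the files.
JSON_TEXT = (
    '{"items": [{"language": "en", "geometry": '
    '{"type": "Point", "coordinates": [9.2056232, 45.4825264]}}]}'
)
CSV_TEXT = (
    "4c29e1c197d00f47a60442ea,Area51Lab Srl,"
    "4bf58dd8d48988d124941735,45.44826958,9.144208431\n"
)


def load_json_points(text):
    """Return (latitude, longitude) tuples from the "items" list."""
    points = []
    for item in json.loads(text)["items"]:
        lon, lat = item["geometry"]["coordinates"]  # GeoJSON order: longitude first
        points.append((lat, lon))
    return points


def load_csv_points(text):
    """Return (name, latitude, longitude) tuples; column positions are assumed."""
    rows = csv.reader(io.StringIO(text))
    return [(row[1], float(row[3]), float(row[4])) for row in rows if row]


json_points = load_json_points(JSON_TEXT)
csv_points = load_csv_points(CSV_TEXT)
```

Both lists end up in (latitude, longitude) order, which is what most distance functions expect.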
To calculate the distance between two points on a sphere given the angles (latitude and longitude...), you can use the haversine formula. https://en.wikipedia.org/wiki/Haversine_formula
There is a JavaScript implementation at http://www.movable-type.co.uk/scripts/latlong.html that you can adapt to Python.
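For reference, a minimal Python sketch of the haversine formula might look like this. It is not a port of the linked page; the function names and the mean Earth radius of 6371 km are my own assumptions, and the second candidate point is just an illustrative extra. The `nearest` helper does a brute-force scan, which is fine for small inputs but slow for large files.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest(point, candidates):
    """Return (distance_km, candidate) for the candidate closest to point."""
    return min((haversine_km(*point, *c), c) for c in candidates)


# The two sample points from the question, plus one illustrative far-away point.
point = (45.4825264, 9.2056232)
candidates = [(45.44826958, 9.144208431), (48.8566, 2.3522)]
distance_km, closest = nearest(point, candidates)
```

Calling `nearest` once per point in file 1 gives the shortest distance for each of them, which you can then write out to the final file.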
Upvotes: 0