NWWPA

Reputation: 59

Geopy, pandas, FOR loop fail

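I am teaching myself geopy. It seems simple and straightforward, yet my code isn't working. It is supposed to build an address string for each row of a CSV file, geocode it, and write the latitude and longitude back out: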

    #setup
    from geopy.geocoders import Nominatim
    import pandas as pd
        
    #create the df
    df = pd.DataFrame(pd.read_csv('properties to geocode.csv'))
    df['Location'] = df['Street Address'].astype(str)+","+df['City'].astype(str)+","+df['State'].astype(str)
        
    #create the geolocator object
    geolocator = Nominatim(timeout=1, user_agent = "My_Agent")
        
    #create the locations list
    locations = df['Location']
        
    #empty lists for later columns
    lats = []
    longs = []
        
    #process the location list
    for item in locations: 
        location = geolocator.geocode('item')
        lat =  location.latitude
        long = location.longitude
        lats.append(lat)
        longs.append(long)
        
    #add the lists to the df
    df.insert(5,'Latitude',lats)
    df.insert(6,'Longitude',longs)
        
    #export
    df.to_csv('geocoded-properties2.csv',index=False)

Something is not working because it returns the same latitude and longitude values for every row, instead of unique coordinates for each.

I have found working code using .apply elsewhere but am interested in learning what I did wrong. Any thoughts?

Upvotes: 0

Views: 528

Answers (1)

Rob Raymond

Reputation: 31166

  • Your question does not include sample data, so I have used sample data from a public API to demonstrate.
  • Your code passes the string literal 'item' to geolocator.geocode(), so every iteration geocodes the same text. It needs to pass the address for the current row, i.e. the loop variable item.
  • Below are three equivalent approaches: pandas apply, a list comprehension, and a plain for loop.
  • The results show that all three approaches produce the same coordinates.

    from geopy.geocoders import Nominatim
    import requests
    import pandas as pd

    searchendpoint = "https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations"
    # get all healthcare facilities in Herefordshire
    dfhc = pd.concat([pd.json_normalize(requests
                                 .get(searchendpoint, params={"PostCode":f"HR{i}","Status":"Active"})
                                 .json()["Organisations"])
               for i in range(1,10)]).reset_index(drop=True)

    def gps(url, geolocator=None):
        # get the address and construct a space-delimited string
        a = " ".join(str(x) for x in requests.get(url).json()["Organisation"]["GeoLoc"]["Location"].values())
        lonlat = geolocator.geocode(a)
        if lonlat is not None:
            # index 1 of a geopy Location is the (latitude, longitude) tuple
            return lonlat[1]
        else:
            # no match found for this address
            return (0,0)

    # work with just GPs
    dfgp = dfhc.loc[dfhc.PrimaryRoleId.isin(["RO180","RO96"])].head(5).copy()

    geolocator = Nominatim(timeout=1, user_agent = "My_Agent")

    # pandas apply
    dfgp["lonlat_apply"] = dfgp["OrgLink"].apply(gps, geolocator=geolocator)

    # list comprehension
    lonlat = [gps(url, geolocator=geolocator) for url in dfgp["OrgLink"].values]
    dfgp["lonlat_listcomp"] = lonlat

    # old school loop
    lonlat = []
    for item in dfgp["OrgLink"].values:
        lonlat.append(gps(item, geolocator=geolocator))
    dfgp["lonlat_oldschool"] = lonlat

|    | Name | OrgId | Status | OrgRecordClass | PostCode | LastChangeDate | PrimaryRoleId | PrimaryRoleDescription | OrgLink | lonlat_apply | lonlat_listcomp | lonlat_oldschool |
|----|------|-------|--------|----------------|----------|----------------|---------------|------------------------|---------|--------------|-----------------|------------------|
| 7 | AYLESTONE HILL SURGERY | M81026002 | Active | RC2 | HR1 1HR | 2020-03-19 | RO96 | BRANCH SURGERY | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/M81026002 | (52.0612429, -2.7026047) | (52.0612429, -2.7026047) | (52.0612429, -2.7026047) |
| 9 | BARRS COURT SCHOOL | 5CN91 | Active | RC2 | HR1 1EQ | 2021-01-28 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN91 | (52.0619209, -2.7086105) | (52.0619209, -2.7086105) | (52.0619209, -2.7086105) |
| 13 | BODENHAM SURGERY | 5CN24 | Active | RC2 | HR1 3JU | 2013-05-08 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN24 | (52.152405, -2.6671942) | (52.152405, -2.6671942) | (52.152405, -2.6671942) |
| 22 | BELMONT ABBEY | 5CN16 | Active | RC2 | HR2 9RP | 2013-05-08 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN16 | (52.0423056, -2.7648698) | (52.0423056, -2.7648698) | (52.0423056, -2.7648698) |
| 24 | BELMONT HEALTH CENTRE | 5CN22 | Active | RC2 | HR2 7XT | 2013-05-08 | RO180 | PRIMARY CARE TRUST SITE | https://directory.spineservices.nhs.uk/ORD/2-0-0/organisations/5CN22 | (52.0407746, -2.739788) | (52.0407746, -2.739788) | (52.0407746, -2.739788) |
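
Applied to the loop in the question, the fix comes down to passing the loop variable item instead of the quoted string 'item'. The following is a minimal sketch of that corrected loop, not a definitive implementation. The helper name geocode_all, the FakeLocation class, and the sample addresses are hypothetical; the fake geocoder only stands in for geolocator.geocode so the example runs without network access. The None check is there because geocode() returns None when it finds no match, which would otherwise raise an AttributeError on .latitude.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Coords = Tuple[Optional[float], Optional[float]]


def geocode_all(addresses: Iterable[str], geocode: Callable) -> List[Coords]:
    """Geocode each address in turn; unmatched addresses yield (None, None)."""
    results = []
    for item in addresses:
        # Pass the variable `item`, not the string literal 'item'.
        location = geocode(item)
        if location is None:
            # No match for this query, so record a placeholder.
            results.append((None, None))
        else:
            results.append((location.latitude, location.longitude))
    return results


# Hypothetical stand-in for a geopy Location, used only for this offline demo.
class FakeLocation:
    def __init__(self, latitude: float, longitude: float) -> None:
        self.latitude = latitude
        self.longitude = longitude


# Hypothetical sample data; these are not real geocoding results.
FAKE_INDEX = {
    "1 Main St,Springfield,IL": FakeLocation(39.80, -89.64),
    "2 Oak Ave,Portland,OR": FakeLocation(45.52, -122.68),
}


def fake_geocode(query: str) -> Optional[FakeLocation]:
    """Stand-in for geolocator.geocode: look the query up in FAKE_INDEX."""
    return FAKE_INDEX.get(query)


addresses = ["1 Main St,Springfield,IL", "2 Oak Ave,Portland,OR", "nowhere"]
coords = geocode_all(addresses, fake_geocode)

# Split the pairs into the two lists the question's code builds.
lats = [lat for lat, _ in coords]
longs = [lon for _, lon in coords]
print(coords)
```

With geopy, the same function would be called as geocode_all(df['Location'], geolocator.geocode), and the resulting lats and longs lists can be inserted into the DataFrame exactly as in the question.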

Upvotes: 1
