saving_space

Reputation: 168

Error in For Loop Logic for a Column in Pandas Python 3

I have a DataFrame with a column of locations. For every row whose coordinates are missing, a function looks up the coordinates of that location. After the loop has run, however, every row holds the latitude and longitude of the last location, which is wrong. Where am I making the mistake?

What I have

location           gpslat gpslong
Brandenburger Tor  na     na
India Gate         na     na
Gateway of India   na     na

What I want

location           gpslat gpslong
Brandenburger Tor  52.16  13.37
India Gate         28.61  77.22
Gateway of India   18.92  72.81

What I get

location           gpslat gpslong
Brandenburger Tor  18.92  72.81
India Gate         18.92  72.81
Gateway of India   18.92  72.81

My Code

i = 0
for location in df.location_clean:
    try:
        if np.isnan(float(df.gpslat.iloc[i])) == True:
                df.gpslat.iloc[i], df.gpslong.iloc[i] = find_coordinates(location)
                print(i, "Coordinates Found for --->", df.location_clean.iloc[i])
        else:
            print(i,'Coordinates Already Present')
    except:
        print('Spelling Mistake Encountered at', df.location_clean.iloc[i], 'Moving to NEXT row')
        continue
    i = i + 1

I suspect a logical error either in the index i or in the statement df.gpslat.iloc[i], df.gpslong.iloc[i] = find_coordinates(location). I tried changing both and rerunning the loop, but the result is the same. Rerunning is also time-consuming, as there are thousands of locations.

Upvotes: 0

Views: 51

Answers (1)

Cribber

Reputation: 2913

It is hard to help without seeing the data, but the following points might help you.

  • In the future, please provide a minimal example of your data so we can work with it and help you better.
  • Never use a bare 'except' without naming the exact exception. Here your except catches ALL errors, so anything else that goes wrong besides your "spelling error" is silently swallowed without you noticing it!
  • When iterating over a dataframe, use iterrows(). It is a lot more readable, and you don't need an extra counter variable.
  • Chained indexing such as df.gpslat.iloc[i] = ... opens you up to pandas' SettingWithCopyWarning (see here: https://www.dataquest.io/blog/settingwithcopywarning/), and the assignment may end up on a temporary copy instead of df. Assign through df.at or df.loc instead.
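To illustrate the point about the bare 'except', here is a minimal sketch. The `find_coordinates` below is only a hypothetical stand-in for your function (I am assuming it raises `KeyError` for an unknown name; yours may raise something else), and the `lat + "0"` line is a deliberately planted bug.

```python
# Coordinates taken from the table in the question
KNOWN = {"India Gate": (28.61, 77.22)}


def find_coordinates(name):
    """Hypothetical stand-in: raises KeyError for an unknown name."""
    return KNOWN[name]


def lookup_bare(name):
    try:
        lat, lon = find_coordinates(name)
        return lat + "0", lon  # planted bug: float + str raises TypeError
    except:  # bare except: the TypeError is swallowed as well
        return "spelling mistake"


def lookup_specific(name):
    try:
        lat, lon = find_coordinates(name)
        return lat + "0", lon  # same planted bug
    except KeyError:  # only the error we actually expect
        return "spelling mistake"
```

With the bare version, `lookup_bare("India Gate")` reports a spelling mistake although the name is spelled correctly. With the specific version, the same call surfaces the `TypeError`, so the real bug becomes visible.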

Here is a reworked version of your loop:

# ____ Preparation ____
import pandas as pd
import numpy as np

# Small stand-in DataFrame; row 'name3' has missing coordinates
lst = [
    ['name1', 1, 3],
    ['name2', 1, 3],
    ['name3', None, None],
    ['name4', 1, 3],
]
df = pd.DataFrame(lst, columns=['location', 'gpslat', 'gpslong'])
print(df.head())

# ____ YOUR CODE ____
# find_coordinates is your own function from the question; it is not defined here
for index, row in df.iterrows():
    try:
        # Only look up rows whose latitude is missing
        if np.isnan(float(row['gpslat'])):
            lat, long = find_coordinates(row['location'])
            print(lat, long)
            # df.at writes to the original DataFrame by row label
            df.at[index, 'gpslat'] = lat
            df.at[index, 'gpslong'] = long
    except TypeError:  # replace this with the exact error you want to catch
        print('Spelling Mistake Encountered at', row['location'], 'in row', index, 'Moving to NEXT row')
        continue
print(df.head())
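One more thing about the loop in the question: the `continue` inside the `except` block skips `i = i + 1`, so after the first failed lookup `i` no longer points at the row that `location` came from. I can't tell whether that alone explains the duplicated values, but it does mean results land in the wrong rows. Below is a self-contained sketch that keeps position and value tied together with `enumerate`. The `find_coordinates` here is a hypothetical stand-in (assumed to raise `KeyError`), the coordinates come from the table in the question, and "Indai Gate" is a made-up misspelling.

```python
import numpy as np
import pandas as pd

# Coordinates from the question's table; the lookup itself is a stand-in
COORDS = {
    "Brandenburger Tor": (52.16, 13.37),
    "Gateway of India": (18.92, 72.81),
}


def find_coordinates(name):
    """Hypothetical stand-in: raises KeyError for an unknown name."""
    return COORDS[name]


def fill_missing(df):
    """Return a copy of df with missing coordinates filled where possible."""
    out = df.copy()
    for pos, name in enumerate(out["location"]):
        if pd.notna(out["gpslat"].iloc[pos]):
            continue  # coordinates already present
        try:
            lat, lon = find_coordinates(name)
        except KeyError:
            print("No coordinates found for", name, "in row", pos)
            continue  # pos still advances, because enumerate drives it
        label = out.index[pos]
        out.at[label, "gpslat"] = lat
        out.at[label, "gpslong"] = lon
    return out


df = pd.DataFrame({
    "location": ["Brandenburger Tor", "Indai Gate", "Gateway of India"],
    "gpslat": [np.nan, np.nan, np.nan],
    "gpslong": [np.nan, np.nan, np.nan],
})
filled = fill_missing(df)
print(filled)
```

The misspelled row stays empty, while the rows before and after it receive their own coordinates, because a failed lookup can no longer desynchronise the counter.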

Upvotes: 1
