pabloFerro

Reputation: 13

Row iteration over a dataframe to calculate values and add them to new column

Goal: loop through every row in the dataframe and:

  1. take the X and Y column values and pass them to the function
  2. have the function generate Longitude and Latitude values
  3. add those values to the same row, in new columns called "Lat" and "Lon"

At the moment, steps 1 and 2 work, but I can't get the computed values appended to the new columns.

What I have tried is:

Function used in the loop:

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=28, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    print( lonlat[1],lonlat[0])

Loop:

for _, row in df.iterrows():
    xy_to_lonlat(row['X'],row['Y'])

This is the output, which is exactly what I expect:

28.667978631874004 -17.96430510323817
28.67957708337043 -17.96589718293177
28.680075373251725 -17.96652237896143
28.696094952446764 -17.971279315586795

But I need to store these two values in df, specifically in df['Lat'] and df['Lon'].

I tried to append() them to lists that I would later insert into df, but it doesn't work:

aLongitud=[]
aLatitud=[]

for _, row in df.iterrows():
    xy_to_lonlat(row['X'],row['Y'])
    aLongitud.append(lonlat[1])
    aLatitud.append(lonlat[0])

This is what df looks like: [image: printed output of df.head()]

The function works for all 52 rows; I just need to get the results into two new columns in df:

28.667978631874004 -17.96430510323817
28.67957708337043 -17.96589718293177
28.680075373251725 -17.96652237896143
28.696094952446764 -17.971279315586795
28.69709953128404 -17.97089438970623
28.704102246479206 -17.97502030269029
28.714190480593878 -17.98059681820521
28.84284299081375 -17.943724718418043
28.85522495646711 -17.907748758676934
28.85497605095961 -17.915999785074945
28.834039353212727 -17.853402778875363
28.84368320877517 -17.790724992980966
28.8311955800612 -17.773218425619255
28.757725903465193 -17.735394629644425
28.75694932761218 -17.734865031953948
28.651232614536056 -17.75864104734293
28.647850336922037 -17.75586691138396
28.64510111053916 -17.756973867003158
28.54740295444906 -17.779646961686794
28.481011316595747 -17.871383348460515
28.598084805574075 -17.92779850800547
28.84869842152646 -17.898800401690675
28.730123181880874 -17.72687292142767
28.65501749037169 -17.759807688028065
28.586115587686052 -17.755714748146353
28.855549587948108 -17.90757529900783
28.62104314133748 -17.750679106650242
28.805231369924527 -17.76049570914483
28.842322764567797 -17.794590436117428
28.654662237239517 -17.761368473029265
28.652716177555675 -17.954686156568993
28.84441637529699 -17.789637146820752
28.812367721581616 -17.763087214328706
28.80648375432461 -17.75977264125206
28.713070037952928 -17.74394044409638
28.850159557661478 -17.898032389327415
28.84268417328949 -17.884610248902643
28.506075965709968 -17.87932721318885
28.60916367244466 -17.92715257476472
28.508055636889907 -17.879126662123344
28.593688218530882 -17.755496249789623
28.614870490264675 -17.753636080872226
28.453393338804933 -17.83975500058191
28.81927942283548 -17.97071265399719
28.632049774803967 -17.948276230580895
28.810197401802437 -17.7626526992656
28.81013751332894 -17.762176710792335
28.651195000175182 -17.757862000230173
28.491243000164914 -17.874658000300624
28.523693000166094 -17.87819700030406
28.56082500016691 -17.89452700031706
28.53126600016634 -17.878297000304332

This is how df looks after running the function in the loop (the "None" issue): [image]

Upvotes: 0

Views: 1956

Answers (1)

user6386471

Reputation: 1253

This solution uses your existing xy_to_lonlat() function, changed to return the values instead of printing them, together with the pandas DataFrame apply method:

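import pandas as pd
import pyproj

def xy_to_lonlat(x, y):
    # Source: UTM zone 28 (WGS84); target: geographic lon/lat (WGS84)
    proj_latlon = pyproj.Proj(proj='latlong', datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=28, datum='WGS84')
    # Note: pyproj.transform is deprecated since pyproj 2.6.1 in favour of pyproj.Transformer
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    # Return (lat, lon) instead of printing, so the caller can store the values
    return lonlat[1], lonlat[0]

# I just made up this data
xs = [21000, 21020, 23000]
ys = [3000000, 3000050, 3000100]
df = pd.DataFrame({'X': xs, 'Y': ys})

# Each row yields a (lat, lon) tuple; split it into two columns, then drop the helper column
df['lat_lon'] = df.apply(lambda r: xy_to_lonlat(r['X'], r['Y']), axis=1)
df['Lat'] = df['lat_lon'].apply(lambda x: x[0])
df['Lon'] = df['lat_lon'].apply(lambda x: x[1])
df = df.drop('lat_lon', axis=1)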

df

#        X        Y        Lat        Lon
# 0  21000  3000000  27.039540 -19.826207
# 1  21020  3000050  27.039996 -19.826026
# 2  23000  3000100  27.041129 -19.806152
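
As for why the list version in the question did not work: the original function only prints its result, so calling it returns None, and `lonlat` exists only inside the function, not in the loop. Here is a minimal sketch of that difference, using only the standard library. The two `convert_*` functions and the `rows` list are hypothetical stand-ins for `xy_to_lonlat` and `df.iterrows()`, and the arithmetic is a placeholder, not a real projection.

```python
def convert_printing(x, y):
    # Stand-in for the question's function: prints the values, returns nothing
    lonlat = (x / 1000.0, y / 1000.0)  # placeholder arithmetic, NOT a real projection
    print(lonlat[1], lonlat[0])


def convert_returning(x, y):
    # Same placeholder arithmetic, but the values are handed back to the caller
    lonlat = (x / 1000.0, y / 1000.0)
    return lonlat[1], lonlat[0]  # (lat, lon), same order as in the answer above


# Hypothetical stand-in for the dataframe rows
rows = [{"X": 21000, "Y": 3000000}, {"X": 21020, "Y": 3000050}]

# The printing version shows the numbers on screen, but its result is None
printed_result = convert_printing(rows[0]["X"], rows[0]["Y"])

# The returning version lets the loop collect the values
aLatitud = []
aLongitud = []
for row in rows:
    lat, lon = convert_returning(row["X"], row["Y"])
    aLatitud.append(lat)
    aLongitud.append(lon)
```

With the real function returning its values, the same loop over `df.iterrows()` fills both lists, and `df['Lat'] = aLatitud` and `df['Lon'] = aLongitud` add them as columns, since each list has one entry per row. Also note that in the question's loop `lonlat[1]` (the latitude) was appended to `aLongitud`, so the two lists would have been swapped.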

Upvotes: 1
