Calculate averages from a pandas column whose elements contain lists of coordinates

Question

I have a DataFrame with a column 'geometry.coordinates'; each row contains a nested list of [longitude, latitude] coordinate pairs, for example:

[[[120.789558, 17.41699],
  [120.761307, 17.416771],
  [120.744881, 17.437571],
  [120.745842, 17.44907],
  [120.727699, 17.457621],
  [120.73217, 17.463221],
  [120.762817, 17.54215],
  [120.791496, 17.54603],
  [120.817032, 17.51009],
  [120.884644, 17.469419],
  [120.789223, 17.44525],
  [120.789558, 17.41699]]]

I want to create another column that contains the average of all latitudes and longitudes in the list. How do I do that?

Sample rows in the DataFrame:

+----+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|    | properties.NAME_2   | geometry.coordinates                                                                                                                                                                                                                                                                                   |
|----+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|  0 | Sallapadan          | [[[120.789558, 17.41699], [120.761307, 17.416771], [120.744881, 17.437571], [120.745842, 17.44907], [120.727699, 17.457621], [120.73217, 17.463221], [120.762817, 17.54215], [120.791496, 17.54603], [120.817032, 17.51009], [120.884644, 17.469419], [120.789223, 17.44525], [120.789558, 17.41699]]] |
|  1 | San Isidro          | [[[120.630783, 17.43194], [120.578957, 17.44137], [120.584541, 17.476851], [120.584137, 17.48283], [120.605492, 17.502029], [120.615356, 17.494249], [120.672997, 17.49074], [120.673241, 17.46966], [120.618919, 17.46871], [120.621872, 17.446251], [120.630783, 17.43194]]]                         |
|  2 | San Juan            | [[[120.782753, 17.71497], [120.779747, 17.66584], [120.724838, 17.665701], [120.707687, 17.687611], [120.712273, 17.711361], [120.711327, 17.726721], [120.721786, 17.732639], [120.750092, 17.724319], [120.785896, 17.76413], [120.811371, 17.740749], [120.782753, 17.71497]]]                      |
+----+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

New column would be something like this:

Center
120.778018916667, 17.4642644166667
120.619734363636, 17.4669609090909
120.751865727273, 17.7135464545455

jezrael · Accepted Answer

Use np.mean with axis=1 to average over the points in each row's list, then select the single nested result with [0]:

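import numpy as np

# Each cell has shape (1, n_points, 2): axis=1 averages over the points, [0] picks the single ring.
# This overwrites 'geometry.coordinates'; assign to a different column name to keep the original lists.
df['geometry.coordinates'] = df['geometry.coordinates'].apply(lambda x: np.mean(x, axis=1)[0])
print(df)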
  properties.NAME_2                      geometry.coordinates
0        Sallapadan  [120.77801891666665, 17.464264416666666]
1        San Isidro  [120.61973436363637, 17.466960909090908]
2          San Juan  [120.75186572727272, 17.713546454545455]
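
Since the question asks for a separate column, the same idea can be written so that the original lists are kept. The following is only a sketch built on the first two sample rows from the question; the helper name ring_mean and the column names 'Center', 'center_x' and 'center_y' are made up for illustration, and it assumes every cell holds exactly one ring of [x, y] pairs.

```python
import numpy as np
import pandas as pd

# First two sample rows from the question, copied as-is.
df = pd.DataFrame({
    "properties.NAME_2": ["Sallapadan", "San Isidro"],
    "geometry.coordinates": [
        [[[120.789558, 17.41699], [120.761307, 17.416771], [120.744881, 17.437571],
          [120.745842, 17.44907], [120.727699, 17.457621], [120.73217, 17.463221],
          [120.762817, 17.54215], [120.791496, 17.54603], [120.817032, 17.51009],
          [120.884644, 17.469419], [120.789223, 17.44525], [120.789558, 17.41699]]],
        [[[120.630783, 17.43194], [120.578957, 17.44137], [120.584541, 17.476851],
          [120.584137, 17.48283], [120.605492, 17.502029], [120.615356, 17.494249],
          [120.672997, 17.49074], [120.673241, 17.46966], [120.618919, 17.46871],
          [120.621872, 17.446251], [120.630783, 17.43194]]],
    ],
})


def ring_mean(coords):
    """Hypothetical helper: mean of all points in a single-ring coordinate list."""
    arr = np.asarray(coords, dtype=float)  # shape (1, n_points, 2)
    # axis=1 averages over the points; [0] selects the one ring.
    return arr.mean(axis=1)[0]


# Keep the original column and store the averages in a new one.
centers = df["geometry.coordinates"].apply(ring_mean)
df["Center"] = centers.apply(list)

# Optionally split the pair into two numeric columns.
center_arr = np.vstack(centers.tolist())  # shape (n_rows, 2)
df["center_x"] = center_arr[:, 0]
df["center_y"] = center_arr[:, 1]

print(df[["properties.NAME_2", "center_x", "center_y"]])
```

This is a plain average of the listed vertices, so the repeated closing point of each ring is counted twice, which matches the values shown in the question.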
