Reputation: 93
I have a DataFrame with a column 'geometry.coordinates'. Each row contains a nested list of [longitude, latitude] coordinate pairs, for example:
[[[120.789558, 17.41699],
[120.761307, 17.416771],
[120.744881, 17.437571],
[120.745842, 17.44907],
[120.727699, 17.457621],
[120.73217, 17.463221],
[120.762817, 17.54215],
[120.791496, 17.54603],
[120.817032, 17.51009],
[120.884644, 17.469419],
[120.789223, 17.44525],
[120.789558, 17.41699]]]
I want to create another column that contains the average of all latitudes and longitudes in the list. How do I do that?
Sample rows in the DataFrame:
+----+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| | properties.NAME_2 | geometry.coordinates |
|----+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 0 | Sallapadan | [[[120.789558, 17.41699], [120.761307, 17.416771], [120.744881, 17.437571], [120.745842, 17.44907], [120.727699, 17.457621], [120.73217, 17.463221], [120.762817, 17.54215], [120.791496, 17.54603], [120.817032, 17.51009], [120.884644, 17.469419], [120.789223, 17.44525], [120.789558, 17.41699]]] |
| 1 | San Isidro | [[[120.630783, 17.43194], [120.578957, 17.44137], [120.584541, 17.476851], [120.584137, 17.48283], [120.605492, 17.502029], [120.615356, 17.494249], [120.672997, 17.49074], [120.673241, 17.46966], [120.618919, 17.46871], [120.621872, 17.446251], [120.630783, 17.43194]]] |
| 2 | San Juan | [[[120.782753, 17.71497], [120.779747, 17.66584], [120.724838, 17.665701], [120.707687, 17.687611], [120.712273, 17.711361], [120.711327, 17.726721], [120.721786, 17.732639], [120.750092, 17.724319], [120.785896, 17.76413], [120.811371, 17.740749], [120.782753, 17.71497]]] |
+----+---------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
The new column would look something like this:
Center
120.778018916667, 17.4642644166667
120.619734363636, 17.4669609090909
120.751865727273, 17.7135464545455
Upvotes: 1
Views: 347
Reputation: 863741
Use np.mean with axis=1 and select the nested list with [0]:
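import numpy as np
# Each cell has shape (1, n, 2): average over the n points (axis=1), then take the single ring with [0].
# Note that this overwrites the original column with the averaged pair.
df['geometry.coordinates'] = df['geometry.coordinates'].apply(lambda x: np.mean(x, axis=1)[0])
print(df)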
properties.NAME_2 geometry.coordinates
0 Sallapadan [120.77801891666665, 17.464264416666666]
1 San Isidro [120.61973436363637, 17.466960909090908]
2 San Juan [120.75186572727272, 17.713546454545455]
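If you would rather keep the original coordinates and put the result in a new column, as the question asks, something along these lines should work. This is only a sketch: the helper name ring_mean and the column names Center, center_lon and center_lat are my own choices, and it assumes every cell holds exactly one ring of [longitude, latitude] pairs, as in the sample rows.

```python
import numpy as np
import pandas as pd

# Two sample rows copied from the question; each cell is a list holding one ring.
df = pd.DataFrame({
    "properties.NAME_2": ["Sallapadan", "San Isidro"],
    "geometry.coordinates": [
        [[[120.789558, 17.41699], [120.761307, 17.416771], [120.744881, 17.437571],
          [120.745842, 17.44907], [120.727699, 17.457621], [120.73217, 17.463221],
          [120.762817, 17.54215], [120.791496, 17.54603], [120.817032, 17.51009],
          [120.884644, 17.469419], [120.789223, 17.44525], [120.789558, 17.41699]]],
        [[[120.630783, 17.43194], [120.578957, 17.44137], [120.584541, 17.476851],
          [120.584137, 17.48283], [120.605492, 17.502029], [120.615356, 17.494249],
          [120.672997, 17.49074], [120.673241, 17.46966], [120.618919, 17.46871],
          [120.621872, 17.446251], [120.630783, 17.43194]]],
    ],
})


def ring_mean(coords):
    """Hypothetical helper: mean [lon, lat] over the points of the first ring."""
    ring = np.asarray(coords[0], dtype=float)  # shape (n_points, 2)
    return ring.mean(axis=0).tolist()          # plain Python list [lon, lat]


# New column with the averaged pair; 'geometry.coordinates' is left untouched.
df["Center"] = df["geometry.coordinates"].map(ring_mean)

# Optional: split the pair into two numeric columns.
df["center_lon"] = [c[0] for c in df["Center"]]
df["center_lat"] = [c[1] for c in df["Center"]]

print(df[["properties.NAME_2", "center_lon", "center_lat"]])
```

Keep in mind that this is a plain average of the listed points, not a true polygon centroid. Since the ring repeats its first point at the end, that point is counted twice, which matches the expected values in the question.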
Upvotes: 1