Calculate distance between consecutive GPS points and reduce GPS density based on this distance

Question

I have a pandas DataFrame that represents the GPS trajectory of a vehicle:

import pandas as pd

d1 = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9], 'longitude': [4.929783, 4.932333, 4.933950, 4.933900, 4.928467, 4.924583, 4.922133, 4.921400, 4.920967], 'latitude': [52.372250, 52.370884, 52.371101, 52.372234, 52.375282, 52.375950, 52.376301, 52.376232, 52.374481]}
df1 = pd.DataFrame(data=d1)

id   longitude   latitude     
1    4.929783    52.372250    
2    4.932333    52.370884    
3    4.933950    52.371101    
4    4.933900    52.372234    
5    4.928467    52.375282    
6    4.924583    52.375950    
7    4.922133    52.376301    
8    4.921400    52.376232    
9    4.920967    52.374481

I already calculated the (haversine) distance in meters between consecutive GPS points as follows:

import numpy as np

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    # earth_radius is in kilometers; coordinates are in decimal degrees
    lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2
    km = earth_radius * 2 * np.arcsin(np.sqrt(a))
    m = km * 1000  # convert kilometers to meters
    return m

# distance from each point to the previous one; the first row has no predecessor, so it is NaN
df1['distance'] = haversine(df1['latitude'], df1['longitude'],
                            df1['latitude'].shift(), df1['longitude'].shift())

id  longitude   latitude    distance
1   4.929783    52.372250   NaN
2   4.932333    52.370884   230.305288
3   4.933950    52.371101   112.398101
4   4.933900    52.372234   126.029572
5   4.928467    52.375282   500.896578
6   4.924583    52.375950   273.918990
7   4.922133    52.376301   170.828592
8   4.921400    52.376232   50.345227
9   4.920967    52.374481   196.908503

Now I would like to create a function that:

removes a point if its distance to the preceding GPS point is less than 150 meters;
always keeps the first and the last GPS point, regardless of the distance to the previously kept point.

Meaning this should be the output:

id  longitude   latitude    distance
1   4.929783    52.372250   NaN
2   4.932333    52.370884   230.305288
5   4.928467    52.375282   500.896578
6   4.924583    52.375950   273.918990
7   4.922133    52.376301   170.828592
9   4.920967    52.374481   196.908503

What is the best way to achieve this in Python?

Answers (1)

Distance

The Loop
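
A minimal sketch of one possible loop, assuming the rule from the question is applied to the precomputed `distance` column (each point's distance to its immediate predecessor, not to the last kept point). The function name `thin_track` and the `min_dist` parameter are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd


def haversine(lat1, lon1, lat2, lon2, earth_radius_km=6371):
    """Great-circle distance in meters; inputs are in decimal degrees."""
    lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius_km * 2 * np.arcsin(np.sqrt(a)) * 1000


def thin_track(df, min_dist=150):
    """Hypothetical helper: drop rows closer than min_dist meters to the
    preceding row, but always keep the first and the last row."""
    keep = []
    last = len(df) - 1
    for pos, dist in enumerate(df['distance']):
        # The first row has a NaN distance, so it is kept by position.
        if pos == 0 or pos == last or dist >= min_dist:
            keep.append(pos)
    return df.iloc[keep]


d1 = {'id': [1, 2, 3, 4, 5, 6, 7, 8, 9],
      'longitude': [4.929783, 4.932333, 4.933950, 4.933900, 4.928467,
                    4.924583, 4.922133, 4.921400, 4.920967],
      'latitude': [52.372250, 52.370884, 52.371101, 52.372234, 52.375282,
                   52.375950, 52.376301, 52.376232, 52.374481]}
df1 = pd.DataFrame(data=d1)

# Distance from each point to the previous one, as in the question.
df1['distance'] = haversine(df1['latitude'], df1['longitude'],
                            df1['latitude'].shift(), df1['longitude'].shift())

result = thin_track(df1)
print(result)
```

With the sample data this keeps ids 1, 2, 5, 6, 7 and 9, matching the output requested in the question. The `distance` values of the kept rows are left untouched, so they still refer to the original predecessor (which may have been dropped) rather than to the previous kept point.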

The Results
