Niko Gamulin

Reputation: 66565

Applying function to the subsequent rows

I want to calculate the distance and the average speed of movements for the provided list of locations.

The dataframe looks as follows:

                             GpsLatitude  GpsLongitude  TotalWorkingHours
DateTime                                                               
2018-11-16 14:30:23+00:00    46.022116     20.093600                NaN
2018-11-16 14:30:31+00:00    46.022109     20.093605             359.53
2018-11-16 14:30:41+00:00    46.022103     20.093602             359.53
2018-11-16 14:37:21+00:00    46.022124     20.093568             359.53
2018-11-16 14:37:31+00:00    46.022123     20.093566             359.53

The function for calculating the distance is the following:

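from math import atan2, cos, radians, sin, sqrt

def get_distance_from_geopoints(latlng1, latlng2):
    # Coordinates arrive in degrees; the trig functions below expect radians.
    lat1, lng1 = map(radians, latlng1)
    lat2, lng2 = map(radians, latlng2)
    R = 6373.0  # approximate Earth radius in km
    dlng = lng2 - lng1
    dlat = lat2 - lat1

    # Haversine formula
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlng / 2)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))

    distance = R * c * 1000  # convert km to m
    return distance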

To create a dataframe with rows that contain starting position, ending position, distance, and average speed, I wrote the following:

df_rolling = df.rolling(window=2)
df_activity['distance_traveled'] = df_rolling.apply(lambda x: get_distance_from_geopoints((x[0]['GpsLatitude'], x[0]['GpsLongitude']), (x[1]['GpsLatitude'], x[1]['GpsLongitude'])), axis=1)
df_activity['time_difference_h'] = df_rolling['TotalWorkingHours'].apply(lambda x: x[1] - x[0])
df_activity['GpsLatitudeStart'] = df_rolling['GpsLatitude'].apply(lambda x: x[0])
df_activity['GpsLongitudeStart'] = df_rolling['GpsLongitude'].apply(lambda x: x[0])
df_activity['GpsLatitudeEnd'] = df_rolling['GpsLatitude'].apply(lambda x: x[1])
df_activity['GpsLongitudeEnd'] = df_rolling['GpsLongitude'].apply(lambda x: x[1])
df_activity['average_speed_kmh'] = df_serial.apply(lambda x: x['distance_traveled']/x['time_difference_h'])

Running the above, I get the error:

Traceback (most recent call last)
----> 1 df.rolling(window=2).apply(lambda x: get_distance_from_geopoints((x[0].GpsLatitude, x[0].GpsLongitude), (x[1].GpsLatitude, x[1].GpsLongitude)), axis=1)

TypeError: apply() got an unexpected keyword argument 'axis'

Also, I believe that applying a rolling window for every transformation is not optimal in terms of execution time. I am looking for a way to apply the rolling window once to get an intermediate result, and then use that result to extract the starting point and ending point and to calculate the time difference, the distance, and the average speed.

Upvotes: 0

Views: 45

Answers (1)

Scott Boston

Reputation: 153460

Let's use shift to move the next row up to the current row, so that consecutive rows are aligned for the calculation.

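import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': np.arange(10)})
# shift(-1) pulls the next row's value up; the last row has no successor, so fill it with 0
df['nextSum'] = df['col1'] + df['col1'].shift(-1).fillna(0)*10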

Output:

   col1  nextSum
0     0     10.0
1     1     21.0
2     2     32.0
3     3     43.0
4     4     54.0
5     5     65.0
6     6     76.0
7     7     87.0
8     8     98.0
9     9      9.0
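
The same shift idea can be applied to the GPS data from the question. The following is only a sketch under a few assumptions: haversine_m is a hypothetical NumPy version of the question's distance function (it converts degrees to radians first), and the time difference is taken from the DateTime index rather than from TotalWorkingHours, because TotalWorkingHours does not change between the sample rows and would lead to a division by zero.

```python
import numpy as np
import pandas as pd


def haversine_m(lat1, lng1, lat2, lng2, radius_km=6373.0):
    """Hypothetical vectorized haversine distance in meters; inputs in degrees."""
    lat1, lng1, lat2, lng2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lat1, lng1, lat2, lng2)
    )
    dlat = lat2 - lat1
    dlng = lng2 - lng1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlng / 2) ** 2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return radius_km * c * 1000


# Sample rows copied from the question.
df = pd.DataFrame(
    {
        "GpsLatitude": [46.022116, 46.022109, 46.022103, 46.022124, 46.022123],
        "GpsLongitude": [20.093600, 20.093605, 20.093602, 20.093568, 20.093566],
        "TotalWorkingHours": [np.nan, 359.53, 359.53, 359.53, 359.53],
    },
    index=pd.to_datetime(
        [
            "2018-11-16 14:30:23+00:00",
            "2018-11-16 14:30:31+00:00",
            "2018-11-16 14:30:41+00:00",
            "2018-11-16 14:37:21+00:00",
            "2018-11-16 14:37:31+00:00",
        ],
        utc=True,
    ),
)
df.index.name = "DateTime"

# Each row is a movement from the current position to the next one.
df_activity = pd.DataFrame(index=df.index)
df_activity["GpsLatitudeStart"] = df["GpsLatitude"]
df_activity["GpsLongitudeStart"] = df["GpsLongitude"]
df_activity["GpsLatitudeEnd"] = df["GpsLatitude"].shift(-1)
df_activity["GpsLongitudeEnd"] = df["GpsLongitude"].shift(-1)

# Distance in meters, computed for all rows at once (no rolling window).
df_activity["distance_traveled"] = haversine_m(
    df_activity["GpsLatitudeStart"],
    df_activity["GpsLongitudeStart"],
    df_activity["GpsLatitudeEnd"],
    df_activity["GpsLongitudeEnd"],
)

# Assumption: elapsed time comes from the index timestamps, converted to hours.
timestamps = df.index.to_series()
df_activity["time_difference_h"] = (
    timestamps.shift(-1) - timestamps
).dt.total_seconds() / 3600

df_activity["average_speed_kmh"] = (
    df_activity["distance_traveled"] / 1000 / df_activity["time_difference_h"]
)

# The last row has no following position, so drop it.
df_activity = df_activity.iloc[:-1]
print(df_activity)
```

Every column is built from whole-column operations, so nothing is evaluated row by row. If you would rather use TotalWorkingHours for the time difference, replace the timestamp subtraction with df["TotalWorkingHours"].shift(-1) - df["TotalWorkingHours"] and handle the rows where the difference is zero.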

Upvotes: 3
