Danish
Danish

Reputation: 2871

Calculate the distance travelled by each inspector using lat, long of tracking data using groupby pandas

I have a dataframe as shown below. Which is inspector tracking data (latitude and longitude) almost in each minute.

Inspector_ID   Timestamp               Latitude      Longitude
1              2018-07-24 7:31:00      100.491491    13.725239
1              2018-07-24 7:31:01      101.491491    15.725239
1              2018-07-24 7:32:04      104.491491    14.725239
1              2018-07-24 7:33:06      102.491491    10.725239
2              2018-07-24 8:35:08      105.491491    8.725239
2              2018-07-24 8:36:10      101.491491    15.725239
2              2018-07-24 8:37:09      101.491491    12.725239
2              2018-07-24 8:39:00      106.491491    16.725239

From the above data I would like find out the distance travelled by an inspector from each consecutive tracking data (latitude and longitude of the inspector).

Expected output (example)

Inspector_ID      Timestamp               Latitude      Longitude    Distance
    1              2018-07-24 7:31:00      100.491491    13.725239    nan
    1              2018-07-24 7:31:01      101.491491    15.725239    2.3
    1              2018-07-24 7:32:04      104.491491    14.725239    1.2
    1              2018-07-24 7:33:06      102.491491    10.725239    3.6
    2              2018-07-24 8:35:08      105.491491    8.725239     nan
    2              2018-07-24 8:36:10      101.491491    15.725239    5.6
    2              2018-07-24 8:37:09      101.491491    12.725239    2.1
    2              2018-07-24 8:39:00      106.491491    16.725239    3

Here I would like to calculate the distance groupby Inspector_ID.

Note: The number populated in Distance column are not the correct distance.

I am not aware, how to calculate distance using latitude and longitude. I am very new in pandas as well.

Upvotes: 0

Views: 186

Answers (1)

ansev
ansev

Reputation: 30930

Use GroupBy.diff:

df['distance']=df.groupby('Inspector_ID')[['Latitude','Longitude']].diff().pow(2).sum(axis=1,min_count=1).pow(1/2)
print(df)

   Inspector_ID           Timestamp    Latitude  Longitude  distance
0             1 2018-07-24 07:31:00  100.491491  13.725239       NaN
1             1 2018-07-24 07:31:01  101.491491  15.725239  2.236068
2             1 2018-07-24 07:32:04  104.491491  14.725239  3.162278
3             1 2018-07-24 07:33:06  102.491491  10.725239  4.472136
4             2 2018-07-24 08:35:08  105.491491   8.725239       NaN
5             2 2018-07-24 08:36:10  101.491491  15.725239  8.062258
6             2 2018-07-24 08:37:09  101.491491  12.725239  3.000000
7             2 2018-07-24 08:39:00  106.491491  16.725239  6.403124

if you want 0 instead of NaN remove min_count = 1

  • What is done here is to calculate the module of each vector constructed from consecutive points of each inspector

Upvotes: 1

Related Questions