Reputation: 41
I can't get the below code to work. I'm trying to calculate the distances between all possible combinations of lats and longs in my data set.
Sample input data I'll use:
p <- data.frame(lat=runif(6,-90,90), lon=runif(6,-180,180) );
The distance() function doesn't work, so I tried distm(), but that also gave me an error. The error message is listed below the code.
d <- setNames(do.call(rbind.data.frame,
combn(1:nrow(p), 2, simplify = FALSE)),
c('p1','p2'));
d$dist <- sapply(1:nrow(d), function(r){
distance(p$lat[d$p1[r]], p$lat[d$p2[r]], p$lon[d$p1[r]], p$lon[d$p2[r]])
})
d$dist <- sapply(1:nrow(d), function(r){
distm(p$lat[d$p1[r]], p$lat[d$p2[r]], p$lon[d$p1[r]], p$lon[d$p2[r]])
})
#> Error in distm(p$lat[d$p1[r]], p$lat[d$p2[r]], p$lon[d$p1[r]], p$lon[d$p2[r]]) :
#> unused argument (p$lon[d$p2[r]])
Upvotes: 4
Views: 909
Reputation: 43334
geosphere::distHaversine (and most of the other distance functions) is vectorized, so you can call it on all the pairs at once. Putting it all together in a data.frame:
p <- data.frame(lat = runif(6, -90, 90),
lon = runif(6, -180, 180))
# get row indices of pairs
row_pairs <- combn(nrow(p), 2)
# make data.frame of pairs
df_dist <- cbind(x = p[row_pairs[1,],],
y = p[row_pairs[2,],])
# add distance column by calling distHaversine (vectorized) on each pair
df_dist$dist <- geosphere::distHaversine(df_dist[2:1], df_dist[4:3])
df_dist
#>          x.lat      x.lon      y.lat      y.lon     dist
#> 1   -10.281070 -156.30519  -7.027720 -104.76897  5677699
#> 1.1 -10.281070 -156.30519 -51.142344 -100.99517  6750255
#> 1.2 -10.281070 -156.30519  -3.979805 -141.43436  1785251
#> 1.3 -10.281070 -156.30519 -21.239130  -65.97719  9639637
#> 1.4 -10.281070 -156.30519  66.292704 -154.52851  8525401
#> 2    -7.027720 -104.76897 -51.142344 -100.99517  4923176
#> 2.1  -7.027720 -104.76897  -3.979805 -141.43436  4075742
#> 2.2  -7.027720 -104.76897 -21.239130  -65.97719  4459657
#> 2.3  -7.027720 -104.76897  66.292704 -154.52851  9085777
#> 3   -51.142344 -100.99517  -3.979805 -141.43436  6452943
#> 3.1 -51.142344 -100.99517 -21.239130  -65.97719  4502520
#> 3.2 -51.142344 -100.99517  66.292704 -154.52851 13833468
#> 4    -3.979805 -141.43436 -21.239130  -65.97719  8350236
#> 4.1  -3.979805 -141.43436  66.292704 -154.52851  7893225
#> 5   -21.239130  -65.97719  66.292704 -154.52851 12111227
Alternatively, you can use geosphere::distm, which returns a distance matrix containing the same data in a different format:
geosphere::distm(p[, 2:1])
#>         [,1]    [,2]     [,3]    [,4]     [,5]     [,6]
#> [1,]       0 5677699  6750255 1785251  9639637  8525401
#> [2,] 5677699       0  4923176 4075742  4459657  9085777
#> [3,] 6750255 4923176        0 6452943  4502520 13833468
#> [4,] 1785251 4075742  6452943       0  8350236  7893225
#> [5,] 9639637 4459657  4502520 8350236        0 12111227
#> [6,] 8525401 9085777 13833468 7893225 12111227        0
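If you start from the matrix but want the pair-per-row layout, one way to get there is to index the matrix with the same combn() pairs. This is only a sketch in base R: the matrix values are copied by hand from the distm output above, and the names m, pairs and long are made up for illustration.

```r
# Distance matrix in meters, copied from the distm() output above
m <- matrix(c(
        0, 5677699,  6750255, 1785251,  9639637,  8525401,
  5677699,       0,  4923176, 4075742,  4459657,  9085777,
  6750255, 4923176,        0, 6452943,  4502520, 13833468,
  1785251, 4075742,  6452943,       0,  8350236,  7893225,
  9639637, 4459657,  4502520, 8350236,        0, 12111227,
  8525401, 9085777, 13833468, 7893225, 12111227,        0
), nrow = 6, byrow = TRUE)

# One row per unordered pair of points: (1,2), (1,3), ..., (5,6)
pairs <- t(combn(nrow(m), 2))

# Indexing a matrix with a two-column matrix picks out m[i, j] for each row
long <- data.frame(p1 = pairs[, 1], p2 = pairs[, 2], dist = m[pairs])
long
```

The rows come out in the same order as the combn()-based data.frame, so the dist column should match the one computed with distHaversine.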
As described by ?distHaversine, distances are in meters. Convert as you like. Also note that geosphere's functions take lon/lat, not lat/lon, so the columns have to be passed in reversed order (e.g. p[, 2:1]) for the results to be correct.
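To make the lon/lat argument order and the unit conversion concrete, here is a rough base-R sketch of the haversine formula. It is not geosphere's code: haversine_m is a hypothetical helper, and the radius of 6378137 m is an assumption chosen to match what I believe is distHaversine's default r.

```r
# Hypothetical helper: great-circle distance in meters via the haversine formula.
# Arguments are in degrees, longitude first, mirroring geosphere's lon/lat order.
# r = 6378137 is an assumed sphere radius in meters.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

# Equator to the north pole along the prime meridian: a quarter of a great circle
d_m  <- haversine_m(lon1 = 0, lat1 = 0, lon2 = 0, lat2 = 90)
d_km <- d_m / 1000       # meters to kilometers
d_mi <- d_m / 1609.344   # meters to statute miles
```

Because the function is vectorized like distHaversine, the same call works on whole columns, e.g. haversine_m(df_dist$x.lon, df_dist$x.lat, df_dist$y.lon, df_dist$y.lat).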
Upvotes: 8