Reputation: 11
I have telemetry data with a large number of detections for each individual (ID) at several stations. I want to calculate the total distance travelled by each ID in my study.
Each station has its own coordinates (Longitude and Latitude), which I convert to UTM.
I have created an example dataset (data_detections) that resembles mine, with the same data types.
library(dplyr) # group_by(), group_modify(), %>%
library(lubridate) # ymd_hms()
library(sp) # SpatialPoints(), spTransform()
# Generate random example data
generate_random_datetime <- function(start_date, end_date, n) {
seq(start_date, end_date, by = "min")[sample(1:(as.integer(difftime(end_date, start_date, units = "mins")) + 1), n)]
}
set.seed(123)
n <- 100
data_detections <- data.frame(
Date.and.Time..UTC. = generate_random_datetime(ymd_hms("2024-01-01 00:00:00"), ymd_hms("2024-01-20 23:59:59"), n),
Receiver = sample(1:10, n, replace = TRUE),
Latitude = runif(n, 52.0, 53.0),
Longitude = runif(n, 3.0, 4.0) ,
ID = as.character(sample(1:10, n, replace = TRUE))
)
I calculated the total distances as shown below, but the values are not correct when I compare them with my "real" values.
#convert to UTM
coord <- SpatialPoints(data_detections[, c("Longitude", "Latitude")],
proj4string = CRS("+proj=longlat +datum=WGS84"))
coord.t <- spTransform(coord, CRS("+proj=utm +datum=WGS84 +zone=43"))
data_detections[, c("Longitude_UTM", "Latitude_UTM")] <- coordinates(coord.t)
head(data_detections)
# Calculate the total distance per ID
total_distance <- function(data) {
coords <- as.matrix(data[, c("Longitude_UTM", "Latitude_UTM")])
distances <- sqrt(rowSums((coords[-1, ] - coords[-nrow(coords), ])^2))
distance <- sum(distances, na.rm = TRUE)
return(data.frame(distance = distance))
}
total_distances <- data_detections %>%
group_by(ID) %>%
group_modify(~ total_distance(.x))
print(total_distances)
Upvotes: 0
Views: 101
Reputation: 17554
If you just need to add up distances between grouped and ordered(*) geographic coordinates, one option is geosphere::distGeo().
If you pass it a single two-column data.frame or matrix (longitude first, then latitude), it returns a vector of sequential distances between consecutive points on an ellipsoid (WGS84 by default), in meters.
(*) In your example data you generate random timestamps without sorting them first. The rows therefore simulate a random point order in your telemetry data, which gives random distances and random sums. Sort by time within each ID before computing distances.
library(dplyr)
library(lubridate)
library(geosphere)
dist_geosphere <-
data_detections |>
group_by(ID) |>
arrange(Date.and.Time..UTC., .by_group = TRUE) |>
mutate(dist_geo = distGeo(pick(c("Longitude", "Latitude")))) |>
summarise(dist_sum_m = sum(dist_geo, na.rm = TRUE))
dist_geosphere
#> # A tibble: 10 × 2
#> ID dist_sum_m
#> <chr> <dbl>
#> 1 1 361722.
#> 2 10 556821.
#> 3 2 347572.
#> 4 3 293461.
#> 5 4 470681.
#> 6 5 303046.
#> 7 6 333351.
#> 8 7 556807.
#> 9 8 716994.
#> 10 9 408434.
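If you want to sanity-check the distGeo() totals without extra packages, here is a minimal base R sketch. It swaps in the haversine formula on a sphere (assumed mean radius 6371008.8 m), so it is not the same computation as the ellipsoidal distGeo() and the numbers will differ slightly. `haversine_steps()` and `total_track_length()` are hypothetical helper names, not part of any package. The toy track also shows why sorting by time matters.

```r
# Great-circle (haversine) distances in metres between consecutive points.
# Assumption: spherical Earth with mean radius r, unlike ellipsoidal distGeo().
haversine_steps <- function(lon, lat, r = 6371008.8) {
  n <- length(lon)
  if (n < 2) return(numeric(0))          # a single detection has no steps
  to_rad <- pi / 180
  lon1 <- lon[-n] * to_rad; lat1 <- lat[-n] * to_rad
  lon2 <- lon[-1] * to_rad; lat2 <- lat[-1] * to_rad
  a <- sin((lat2 - lat1) / 2)^2 +
    cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: sort by ID and timestamp, then sum the steps per ID.
total_track_length <- function(df, id = "ID", time = "Date.and.Time..UTC.",
                               lon = "Longitude", lat = "Latitude") {
  df <- df[order(df[[id]], df[[time]]), ]
  vapply(split(df, df[[id]]),
         function(g) sum(haversine_steps(g[[lon]], g[[lat]])),
         numeric(1))
}

# Toy track along one meridian; rows are deliberately NOT in time order.
toy <- data.frame(
  ID = c("a", "a", "a"),
  Date.and.Time..UTC. = as.POSIXct(
    c("2024-01-01 00:00:00", "2024-01-01 02:00:00", "2024-01-01 01:00:00"),
    tz = "UTC"
  ),
  Longitude = c(3, 3, 3),
  Latitude = c(52, 52.5, 53)
)

sorted_total   <- total_track_length(toy)                         # 52 -> 53 -> 52.5
unsorted_total <- sum(haversine_steps(toy$Longitude, toy$Latitude)) # 52 -> 52.5 -> 53
```

In time order the toy animal covers 1.5 degrees of latitude, in row order only 1 degree, so the two totals differ. You could call `total_track_length(data_detections)` and compare the result with `dist_sum_m`.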
Example data:
# Generate random example data
generate_random_datetime <- function(start_date, end_date, n) {
seq(start_date, end_date, by = "min")[sample(1:(as.integer(difftime(end_date, start_date, units = "mins")) + 1), n)]
}
set.seed(123)
n <- 100
data_detections <- data.frame(
Date.and.Time..UTC. = generate_random_datetime(ymd_hms("2024-01-01 00:00:00"), ymd_hms("2024-01-20 23:59:59"), n),
Receiver = sample(1:10, n, replace = TRUE),
Latitude = runif(n, 52.0, 53.0),
Longitude = runif(n, 3.0, 4.0) ,
ID = as.character(sample(1:10, n, replace = TRUE))
)
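A side note on the projected approach in the question: `+zone=43` covers longitudes 72 to 78°E, while the example longitudes (3 to 4°E) fall in zone 31, so projecting them into zone 43 may distort the planar distances considerably. I can't tell whether this applies to your real data. A minimal sketch for checking, where `utm_zone()` is a hypothetical helper (not part of sp or geosphere) that uses the standard 6-degree zone rule:

```r
# Standard 6-degree UTM zone number for longitudes in decimal degrees.
# Assumption: ignores the special-case zones around Norway and Svalbard.
utm_zone <- function(lon) {
  (floor((lon + 180) / 6) %% 60) + 1
}

example_zones <- utm_zone(c(3.0, 3.5, 4.0))  # zones for the example longitudes
zone_at_75e   <- utm_zone(75)                # a longitude that lies in zone 43

# Build a proj string for the zone of the first example longitude.
proj_string <- sprintf("+proj=utm +zone=%d +datum=WGS84",
                       as.integer(example_zones[1]))
```

If your stations span more than one zone, a single UTM projection is a compromise anyway, which is another reason to compute distances directly from longitude and latitude.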
Upvotes: 0