Reputation: 150
My goal is to apply the st_distance function to a very large data frame, yet because the data frame concerns multiple individuals, I split it using the purrr package and split function.
I have seen the use of 'lists' and 'forloops' in the past but I have no experience with these.
Below is a fraction of my dataset, I have split the dataframe by ID, into a list with 43 elements.
The st_distance function I plan to use looks something like, it it would be applied to the full data frame, not split into a list:
PART 2:
I want to do the same as explained by Dave2e, but now for geosphere::bearing I have attached long and lat in wgs84 to the initial data frame, which now looks like this:
ID Date Time Datetime Long Lat x y
10_17 4/18/2017 15:02:00 4/18/2017 15:02 379800.5 5181001 -91.72272 46.35156
10_17 4/20/2017 6:00:00 4/20/2017 6:00 383409 5179885 -91.7044 46.34891
10_17 4/21/2017 21:02:00 4/21/2017 21:02 383191.2 5177960 -91.72297 46.35134
10_24 4/22/2017 10:03:00 4/22/2017 10:03 383448.6 5179918 -91.72298 46.35134
10_17 4/23/2017 12:01:00 4/23/2017 12:01 378582.5 5182110 -91.7242 46.34506
10_24 4/24/2017 1:00:00 4/24/2017 1:00 383647.4 5180009 -91.72515 46.34738
10_24 4/25/2017 16:01:00 4/25/2017 16:01 383407.9 5179872 -91.7184 46.32236
10_17 4/26/2017 18:02:00 4/26/2017 18:02 380691.9 5179353 -91.65361 46.34712
10_36 4/27/2017 20:00:00 4/27/2017 20:00 382521.9 5175266 -91.66127 46.3485
10_36 4/29/2017 11:01:00 4/29/2017 11:01 383443.8 5179909 -91.70303 46.35451
10_36 4/30/2017 0:00:00 4/30/2017 0:00 383060.8 5178361 -91.6685 46.32941
10_40 4/30/2017 13:02:00 4/30/2017 13:02 383426.3 5179873 -91.70263 46.35481
10_40 5/2/2017 17:02:00 5/2/2017 17:02 383393.7 5179883 -91.67099 46.34138
10_40 5/3/2017 6:01:00 5/3/2017 6:01 382875.8 5179376 -91.66324 46.34763
10_88 5/3/2017 19:02:00 5/3/2017 19:02 383264.3 5179948 -91.73075 46.3684
10_88 5/4/2017 8:01:00 5/4/2017 8:01 378554.4 5181966 -91.70413 46.35429
10_88 5/4/2017 21:03:00 5/4/2017 21:03 379830.5 5177232 -91.66452 46.37274
I then try a function similar to the one below, but with the coordinates changed to x and y but it leads to an error
dis_list <- split(data, data$ID)
answer <- lapply(dis_list, function(df) {
start <- df[-1 , c("x", "y")] %>%
st_as_sf(coords = c('x', 'y'))
end <- df[-nrow(df), c("x", "y")] %>%
st_as_sf(coords = c('x', 'y'))
angles <-geosphere::bearing(start, end)
df$angles <- c(NA, angles)
df
})
answer
which gives the error
Error in .pointsToMatrix(p1) :
'list' object cannot be coerced to type 'double'
Upvotes: 1
Views: 405
Reputation: 24069
Here is an basic solution. I split the original data into multiple data frames using split and then wrapped the distance function in lapply()
.
data <- read.table(header=TRUE, text="ID Date Time Datetime time2 Long Lat
10_17 4/18/2017 15:02:00 4/18/2017 15:02 379800.5 5181001
10_17 4/20/2017 6:00:00 4/20/2017 6:00 383409 5179885
10_17 4/21/2017 21:02:00 4/21/2017 21:02 383191.2 5177960
10_24 4/22/2017 10:03:00 4/22/2017 10:03 383448.6 5179918
10_17 4/23/2017 12:01:00 4/23/2017 12:01 378582.5 5182110
10_24 4/24/2017 1:00:00 4/24/2017 1:00 383647.4 5180009
10_24 4/25/2017 16:01:00 4/25/2017 16:01 383407.9 5179872
10_17 4/26/2017 18:02:00 4/26/2017 18:02 380691.9 5179353
10_36 4/27/2017 20:00:00 4/27/2017 20:00 382521.9 5175266
10_36 4/29/2017 11:01:00 4/29/2017 11:01 383443.8 5179909
10_36 4/30/2017 0:00:00 4/30/2017 0:00 383060.8 5178361
10_40 4/30/2017 13:02:00 4/30/2017 13:02 383426.3 5179873
10_40 5/2/2017 17:02:00 5/2/2017 17:02 383393.7 5179883
10_40 5/3/2017 6:01:00 5/3/2017 6:01 382875.8 5179376
10_88 5/3/2017 19:02:00 5/3/2017 19:02 383264.3 5179948
10_88 5/4/2017 8:01:00 5/4/2017 8:01 378554.4 5181966
10_88 5/4/2017 21:03:00 5/4/2017 21:03 379830.5 5177232")
#EPSG:32615 32615
library(sf)
library(magrittr)
dfs <- split(data, data$ID)
answer <- lapply(dfs, function(df) {
#convert to a sf oject and specify coordinate systems
start <- df[-1 , c("Long", "Lat")] %>%
st_as_sf(coords = c('Long', 'Lat')) %>%
st_set_crs(32615)
end <- df[-nrow(df), c("Long", "Lat")] %>%
st_as_sf(coords = c('Long', 'Lat')) %>%
st_set_crs(32615)
#long_lat <-st_transform(start, 4326)
distances <-sf::st_distance(start, end, by_element = TRUE)
df$distances <- c(NA, distances)
df
})
answer
$`10_17`
ID Date Time Datetime time2 Long Lat distances
1 10_17 4/18/2017 15:02:00 4/18/2017 15:02 379800.5 5181001 NA
2 10_17 4/20/2017 6:00:00 4/20/2017 6:00 383409.0 5179885 3777.132
3 10_17 4/21/2017 21:02:00 4/21/2017 21:02 383191.2 5177960 1937.282
5 10_17 4/23/2017 12:01:00 4/23/2017 12:01 378582.5 5182110 6201.824
8 10_17 4/26/2017 18:02:00 4/26/2017 18:02 380691.9 5179353 3471.400
$`10_24`
ID Date Time Datetime time2 Long Lat distances
4 10_24 4/22/2017 10:03:00 4/22/2017 10:03 383448.6 5179918 NA
6 10_24 4/24/2017 1:00:00 4/24/2017 1:00 383647.4 5180009 218.6377
7 10_24 4/25/2017 16:01:00 4/25/2017 16:01 383407.9 5179872 275.9153
There should be an easier way to calculate distances between rows instead of creating 2 series of points.
Referenced: Converting table columns to spatial objects
Upvotes: 1