Beardedant

Reputation: 150

How to loop st_distance through list

My goal is to apply the st_distance function to a very large data frame. Because the data frame covers multiple individuals, I split it by ID using the purrr package and the split function.

I have seen lists and for loops used for this in the past, but I have no experience with them.

Below is a fraction of my dataset. I have split the data frame by ID into a list with 43 elements.

The st_distance call I plan to use looks something like this if it were applied to the full data frame rather than to the split list:

PART 2:

I want to do the same as explained by Dave2e, but now for geosphere::bearing. I have attached longitude and latitude in WGS84 (columns x and y) to the initial data frame, which now looks like this:

ID         Date        Time         Datetime      Long        Lat   x            y
10_17   4/18/2017   15:02:00    4/18/2017 15:02 379800.5    5181001 -91.72272   46.35156
10_17   4/20/2017   6:00:00     4/20/2017 6:00  383409      5179885 -91.7044    46.34891
10_17   4/21/2017   21:02:00    4/21/2017 21:02 383191.2    5177960 -91.72297   46.35134
10_24   4/22/2017   10:03:00    4/22/2017 10:03 383448.6    5179918 -91.72298   46.35134
10_17   4/23/2017   12:01:00    4/23/2017 12:01 378582.5    5182110 -91.7242    46.34506
10_24   4/24/2017   1:00:00     4/24/2017 1:00  383647.4    5180009 -91.72515   46.34738
10_24   4/25/2017   16:01:00    4/25/2017 16:01 383407.9    5179872 -91.7184    46.32236
10_17   4/26/2017   18:02:00    4/26/2017 18:02 380691.9    5179353 -91.65361   46.34712
10_36   4/27/2017   20:00:00    4/27/2017 20:00 382521.9    5175266 -91.66127   46.3485
10_36   4/29/2017   11:01:00    4/29/2017 11:01 383443.8    5179909 -91.70303   46.35451
10_36   4/30/2017   0:00:00     4/30/2017 0:00  383060.8    5178361 -91.6685    46.32941
10_40   4/30/2017   13:02:00    4/30/2017 13:02 383426.3    5179873 -91.70263   46.35481
10_40   5/2/2017    17:02:00    5/2/2017 17:02  383393.7    5179883 -91.67099   46.34138
10_40   5/3/2017    6:01:00     5/3/2017 6:01   382875.8    5179376 -91.66324   46.34763
10_88   5/3/2017    19:02:00    5/3/2017 19:02  383264.3    5179948 -91.73075   46.3684
10_88   5/4/2017    8:01:00     5/4/2017 8:01   378554.4    5181966 -91.70413   46.35429
10_88   5/4/2017    21:03:00    5/4/2017 21:03  379830.5    5177232 -91.66452   46.37274

I then tried a function similar to the one below, with the coordinates changed to x and y, but it leads to an error:

dis_list <- split(data, data$ID) 
 answer <- lapply(dis_list, function(df) {
  start <- df[-1 , c("x", "y")] %>% 
    st_as_sf(coords = c('x', 'y')) 
  
  end <- df[-nrow(df), c("x", "y")] %>% 
    st_as_sf(coords = c('x', 'y')) 
  
  
  angles <-geosphere::bearing(start, end) 
  
  df$angles <- c(NA, angles)
  df
})

answer

which gives the error

Error in .pointsToMatrix(p1) : 
'list' object cannot be coerced to type 'double'
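
The error message suggests that geosphere::bearing received an sf object (a data frame with a list-type geometry column) where it expects plain numeric coordinates, such as a two-column matrix with longitude first and latitude second. A minimal sketch of one possible fix is below, assuming that x holds longitude and y holds latitude as in the table above, that every ID has at least two rows, and that the bearing should run from each fix to the next one (the from/to ordering is my assumption, not something stated in the question). The small data frame is only an excerpt for illustration.

```r
library(geosphere)

# Illustrative excerpt of the table above (x = longitude, y = latitude)
data <- data.frame(
  ID = c("10_17", "10_17", "10_17", "10_24", "10_24"),
  x  = c(-91.72272, -91.7044, -91.72297, -91.72298, -91.72515),
  y  = c(46.35156, 46.34891, 46.35134, 46.35134, 46.34738)
)

dis_list <- split(data, data$ID)

answer <- lapply(dis_list, function(df) {
  # Plain numeric matrices instead of sf objects; longitude first, latitude second
  from <- as.matrix(df[-nrow(df), c("x", "y")])  # every point except the last
  to   <- as.matrix(df[-1, c("x", "y")])         # every point except the first

  # One bearing (in degrees) per consecutive pair of points
  angles <- geosphere::bearing(from, to)

  # The first row of each ID has no previous point, so it gets NA
  df$angles <- c(NA, angles)
  df
})

answer
```

Skipping st_as_sf here is deliberate: the conversion is only needed for sf functions such as st_distance, while geosphere works directly on coordinate matrices.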

Upvotes: 1

Views: 405

Answers (1)

Dave2e

Reputation: 24069

Here is a basic solution. I split the original data into multiple data frames using split() and then wrapped the distance calculation in lapply().

data <- read.table(header=TRUE, text="ID      Date        Time        Datetime  time2      Long        Lat
10_17   4/18/2017   15:02:00    4/18/2017 15:02 379800.5    5181001
10_17   4/20/2017   6:00:00     4/20/2017 6:00  383409      5179885
10_17   4/21/2017   21:02:00    4/21/2017 21:02 383191.2    5177960
10_24   4/22/2017   10:03:00    4/22/2017 10:03 383448.6    5179918
10_17   4/23/2017   12:01:00    4/23/2017 12:01 378582.5    5182110
10_24   4/24/2017   1:00:00     4/24/2017 1:00  383647.4    5180009
10_24   4/25/2017   16:01:00    4/25/2017 16:01 383407.9    5179872
10_17   4/26/2017   18:02:00    4/26/2017 18:02 380691.9    5179353
10_36   4/27/2017   20:00:00    4/27/2017 20:00 382521.9    5175266
10_36   4/29/2017   11:01:00    4/29/2017 11:01 383443.8    5179909
10_36   4/30/2017   0:00:00     4/30/2017 0:00  383060.8    5178361
10_40   4/30/2017   13:02:00    4/30/2017 13:02 383426.3    5179873
10_40   5/2/2017    17:02:00    5/2/2017 17:02  383393.7    5179883
10_40   5/3/2017    6:01:00     5/3/2017 6:01   382875.8    5179376
10_88   5/3/2017    19:02:00    5/3/2017 19:02  383264.3    5179948
10_88   5/4/2017    8:01:00     5/4/2017 8:01   378554.4    5181966
10_88   5/4/2017    21:03:00    5/4/2017 21:03  379830.5    5177232")

# The coordinates are assumed to be in EPSG:32615 (WGS 84 / UTM zone 15N)
library(sf)
library(magrittr)

dfs <- split(data, data$ID) 

answer <- lapply(dfs, function(df) {
   # convert to an sf object and specify the coordinate system
   start <- df[-1 , c("Long", "Lat")] %>% 
      st_as_sf(coords = c('Long', 'Lat')) %>%
      st_set_crs(32615)
   
   end <- df[-nrow(df), c("Long", "Lat")] %>% 
      st_as_sf(coords = c('Long', 'Lat')) %>%
      st_set_crs(32615)
   
   #long_lat <-st_transform(start, 4326)
   distances <-sf::st_distance(start, end, by_element = TRUE) 
   
   df$distances <- c(NA, distances)
   df
})

answer
$`10_17`
     ID      Date     Time  Datetime time2     Long     Lat distances
1 10_17 4/18/2017 15:02:00 4/18/2017 15:02 379800.5 5181001     NA
2 10_17 4/20/2017  6:00:00 4/20/2017  6:00 383409.0 5179885  3777.132
3 10_17 4/21/2017 21:02:00 4/21/2017 21:02 383191.2 5177960  1937.282
5 10_17 4/23/2017 12:01:00 4/23/2017 12:01 378582.5 5182110  6201.824
8 10_17 4/26/2017 18:02:00 4/26/2017 18:02 380691.9 5179353  3471.400

$`10_24`
     ID      Date     Time  Datetime time2     Long     Lat distances
4 10_24 4/22/2017 10:03:00 4/22/2017 10:03 383448.6 5179918    NA
6 10_24 4/24/2017  1:00:00 4/24/2017  1:00 383647.4 5180009  218.6377
7 10_24 4/25/2017 16:01:00 4/25/2017 16:01 383407.9 5179872  275.9153

There should be an easier way to calculate distances between rows instead of creating 2 series of points.
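
One simpler option, sketched below, relies on the coordinates already being projected (UTM, in metres), so the distance between consecutive rows is just the Euclidean distance and can be computed with diff() in base R. This swaps st_distance for plain arithmetic and is only reasonable under that assumption; the small data frame is an excerpt of the data above for illustration.

```r
# Illustrative excerpt of the data above (projected UTM coordinates, metres)
data <- data.frame(
  ID   = c("10_17", "10_17", "10_17", "10_24", "10_24"),
  Long = c(379800.5, 383409, 383191.2, 383448.6, 383647.4),
  Lat  = c(5181001, 5179885, 5177960, 5179918, 5180009)
)

dfs <- split(data, data$ID)

answer <- lapply(dfs, function(df) {
  # diff() gives the change between consecutive rows;
  # the first row of each ID has no previous point, so it gets NA
  df$distances <- c(NA, sqrt(diff(df$Long)^2 + diff(df$Lat)^2))
  df
})

answer
```

For the 10_17 rows this reproduces the first two distances shown in the output above (3777.132 and 1937.282), without building two separate sets of points.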

Referenced: Converting table columns to spatial objects

Upvotes: 1
