Calculate the length of segments within a transect

I have transect data with latitude, longitude and substrate type. The script below creates a hypothetical data set along a straight transect starting at longitude -24.25 and ending at -23.20. Within this transect there are 3 substrate types (a, b and c), but substrate type "a" occurs in 4 separate segments and substrate type "b" in 2. I would like to calculate the total length (meters) of the segments of each substrate type "a", "b" and "c" in the transect. As an example, the first "a" segment ends at the position of the first observation of substrate type "b", and the "c" segment ends where the fourth "a" segment starts. I have looked into the sp and Rdistance packages but I'm really stuck. Thanks in advance.

Hypothetical example: each box denotes a segment whose length I would like to calculate.

Alon<-c(-23.20, -23.30, -23.40, -24.10, -24.15, -23.95, -23.70, -23.60, -24.20, -24.25)
Blon<-c(-23.80, -23.85, -24.00, -24.03, -24.06)
Clon<-c(-23.47, -23.50,-23.55) 
Alat<-c(64,64,64,64,64, 64, 64, 64,64, 64)
Blat<-c(64,64, 64, 64,64)
Clat<-c(64,64, 64)
A<-as.data.frame(cbind(Alon, Alat))
B<-as.data.frame(cbind(Blon, Blat))
C<-as.data.frame(cbind(Clon, Clat))
plot(A$Alon, A$Alat, pch=97)
points(B$Blon, B$Blat, col="red", pch=98)
points(C$Clon, C$Clat, col="blue", pch=99)


A$ID<-seq.int(nrow(A))
A[,3]<-"A"
B$ID<-seq.int(nrow(B))
B[,3]<-"B"
C$ID<-seq.int(nrow(C))
C[,3]<-"C"


colnames(A)<-c("lon", "lat", "ID")
colnames(B)<-c("lon", "lat", "ID")
colnames(C)<-c("lon", "lat", "ID")

A<-as.data.frame(A)
B<-as.data.frame(B)
C<-as.data.frame(C)

pos<- rbind(A,B,C)
pos<-pos[,c("ID","lon","lat")]

Upvotes: 1

Views: 151

Answers (1)

Dan

Reputation: 12074

I suspect the length in metres depends on your projection, so here I calculate the length in degrees and will leave the conversion up to you. First, I order by longitude (as your latitudes are all the same).

# Order data frame
pos <- pos[order(pos$lon),]
  

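Next, I use rle to pull out runs of each ID. The cumulative sum of the run lengths gives the last row of each run, so adding 1 gives the first row of the next run. I prepend 1 so that the first run starts on the first element, and use pmin to make sure the final index isn't greater than the number of rows in the data frame.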

# Pull out start and end points of segments
df_seg <- pos[pmin(nrow(pos), c(1, cumsum(rle(pos$ID)$lengths) + 1)),]
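If the indexing in that line is hard to follow, here is a toy version of the same idea. The vector ids is made up purely for illustration; it is not part of your data.

```r
# Toy ID vector (hypothetical, for illustration only)
ids <- c("A", "A", "B", "B", "B", "A")

# Run-length encoding: three runs (A x2, B x3, A x1)
runs <- rle(ids)
runs$lengths  # 2 3 1
runs$values   # "A" "B" "A"

# cumsum() gives the last index of each run; + 1 gives the start of the next
starts <- c(1, cumsum(runs$lengths) + 1)  # 1 3 6 7

# The last value (7) is past the end of the vector, so clamp it with pmin()
idx <- pmin(length(ids), starts)  # 1 3 6 6
ids[idx]  # "A" "B" "A" "A"
```

The final index is the end of the last run rather than the start of a new one, which is why the ID column is trimmed by one element in the next step.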

Finally, I use diff to calculate the difference between the start and end longitudes of each run.

# Calculate difference in longitude
data.frame(ID = df_seg$ID[1:(nrow(df_seg)-1)], diff_lon = abs(diff(df_seg$lon)))

# Check data frame
#   ID diff_lon
# 1  A     0.19
# 2  B     0.11
# 3  A     0.10
# 4  B     0.15
# 5  A     0.15
# 6  C     0.15
# 7  A     0.20
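
If you do want metres, one option is a great-circle distance between the start and end of each segment. The sketch below is only that: it assumes the coordinates are plain longitude/latitude in degrees and a spherical Earth with a mean radius of 6,371,000 m, and haversine_m is a helper written here for illustration, not a function from sp or Rdistance. The start and end longitudes are copied from df_seg above.

```r
# Great-circle (haversine) distance in metres between lon/lat points.
# Assumption: spherical Earth with mean radius r = 6,371,000 m.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Segment start/end longitudes taken from df_seg (latitude is 64 throughout)
seg <- data.frame(
  ID        = c("A", "B", "A", "B", "A", "C", "A"),
  lon_start = c(-24.25, -24.06, -23.95, -23.85, -23.70, -23.55, -23.40),
  lon_end   = c(-24.06, -23.95, -23.85, -23.70, -23.55, -23.40, -23.20),
  lat       = 64
)

# Length of each segment in metres
seg$length_m <- haversine_m(seg$lon_start, seg$lat, seg$lon_end, seg$lat)

# Total length per substrate type
totals <- aggregate(length_m ~ ID, data = seg, FUN = sum)
totals
```

At latitude 64 the first A segment (0.19 degrees of longitude) comes out at roughly 9.3 km. A projected coordinate system or an ellipsoidal distance would give slightly different numbers, so treat these as approximate.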

Regarding ordering stations

I wish I had a good solution to this, but I don't. So, I'll apologise before I do some terrible things...

library(dplyr)
library(RANN)

# Temporary data frame
df_stations <- pos 

# Function for finding order of stations
station_order <- function(){
  # If only one row, return it (i.e., it's the final station)
  if(nrow(df_stations) == 1)return(df_stations)
  # Find the nearest neighbour for the first station
  r <- nn2(data = df_stations %>% select(lon, lat), k = 2)$nn.idx[1,2]
  # Bump the nearest neighbour to first in the data frame
  # This also deletes the first entry
  df_stations[1, ] <<- df_stations[r, ]
  # Drop the nearest neighbour elsewhere in the data frame
  df_stations <<- df_stations %>%  distinct
  # Return the nearest neighbour
  return(df_stations[1, ])
}

# Initialise data frame
res <- df_stations[1,]

# Loop over data frame
for(i in 2:nrow(df_stations))res[i, ] <- station_order()

This code orders your stations using nearest neighbour (i.e., nn2 from RANN). You'll notice that the transect is inverted, but you can always change it with res[nrow(res):1, ].

#    ID    lon lat
# 1   A -23.20  64
# 2   A -23.30  64
# 3   A -23.40  64
# 4   C -23.47  64
# 5   C -23.50  64
# 6   C -23.55  64
# 7   A -23.60  64
# 8   A -23.70  64
# 9   B -23.80  64
# 10  B -23.85  64
# 11  A -23.95  64
# 12  B -24.00  64
# 13  B -24.03  64
# 14  B -24.06  64
# 15  A -24.10  64
# 16  A -24.15  64
# 17  A -24.20  64
# 18  A -24.25  64
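
For what it's worth, the same greedy nearest-neighbour idea can be sketched in base R without modifying a global data frame. This is an alternative sketch, not the code that produced the output above: order_stations is a hypothetical helper, it uses plain Euclidean distance in degrees (an assumption that is fine for a short, straight transect), and the toy data frame is made up for illustration.

```r
# Greedy nearest-neighbour ordering: start at one row, then repeatedly
# jump to the closest row that has not been visited yet.
# Assumption: Euclidean distance on lon/lat degrees is good enough here.
order_stations <- function(df, start = 1) {
  n <- nrow(df)
  visited <- logical(n)
  ord <- integer(n)
  current <- start
  for (i in seq_len(n)) {
    ord[i] <- current
    visited[current] <- TRUE
    if (i == n) break
    d <- sqrt((df$lon - df$lon[current])^2 + (df$lat - df$lat[current])^2)
    d[visited] <- Inf          # never revisit a station
    current <- which.min(d)    # nearest unvisited station
  }
  df[ord, ]
}

# Toy data (hypothetical): four stations in scrambled order
toy <- data.frame(
  ID  = c("A", "B", "A", "C"),
  lon = c(-23.2, -23.8, -23.3, -23.5),
  lat = 64
)

res_toy <- order_stations(toy)
res_toy$lon  # -23.2 -23.3 -23.5 -23.8
```

As with the version above, the result depends on which station you start from, so the first row should be one end of the transect.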

Upvotes: 1
