Reputation: 663
I am trying to merge two data frames (main and sub). I want the 'Variable' column from 'sub' merged into 'main' based on distance or, better yet, taken from whichever 'sub' row/site is closest to each 'main' row/site.
library(sf)
a <- structure(list(`Site#` = c("Site1", "Site2", "Site3", "Site4", "Site5", "Site6"), Longitude = c(-94.609, -98.1391, -99.033, -98.49, -96.4309, -95.99), `Latitude` = c(38.922, 37.486111, 37.811, 38.364, 39.4402, 39.901)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
main <- st_as_sf(a, coords = c("Longitude", "Latitude"), crs = 4326)
b <- structure(list(Longitude = c(-98.49567, -96.22451, -98.49567, -98.941391, -95.91411, -99.031113), `Latitude` = c(38.31264,39.97692, 38.31264, 37.486111, 39.92143, 37.814171), Variable = c(400, 50, 100, 201, 99, 700)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
sub <- st_as_sf(b, coords = c("Longitude", "Latitude"), crs = 4326)
c <- st_intersection(main,sub)
c <- st_is_within_distance(main,sub,dist=0.001)
I believe that st_intersection() is close to what I want, but I need the match to be one-to-one based on distance. Does anyone know what could provide the result I am looking for?
Upvotes: 5
Views: 14018
Reputation: 5499
st_join() allows for joining in a single step:
st_join(main, sub, join = st_nearest_feature, left = TRUE)
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
#> Simple feature collection with 6 features and 2 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: -99.033 ymin: 37.48611 xmax: -94.609 ymax: 39.901
#> epsg (SRID): 4326
#> proj4string: +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 6 x 3
#> `Site#` geometry Variable
#> <chr> <POINT [°]> <dbl>
#> 1 Site1 (-94.609 38.922) 99
#> 2 Site2 (-98.1391 37.48611) 201
#> 3 Site3 (-99.033 37.811) 700
#> 4 Site4 (-98.49 38.364) 400
#> 5 Site5 (-96.4309 39.4402) 50
#> 6 Site6 (-95.99 39.901) 99
Created on 2020-01-19 by the reprex package (v0.3.0)
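Since the question asks for a match "based on distance", it can help to see how far away each nearest neighbour actually is. The following is only a minimal sketch, not part of the reprex above: it uses a small subset of the question's points, and the 50 km cutoff and the column name dist_to_match are made up for illustration. Note also that on recent sf versions the nearest-feature search on longitude/latitude data may use spherical geometry, so the planar warning shown above may not appear.

```r
library(sf)

# Toy data: a subset of the points from the question.
main <- st_as_sf(
  data.frame(site = c("Site1", "Site4"),
             lon  = c(-94.609, -98.49),
             lat  = c(38.922, 38.364)),
  coords = c("lon", "lat"), crs = 4326
)
sub <- st_as_sf(
  data.frame(lon      = c(-98.49567, -96.22451, -95.91411),
             lat      = c(38.31264, 39.97692, 39.92143),
             Variable = c(400, 50, 99)),
  coords = c("lon", "lat"), crs = 4326
)

# Index of the nearest 'sub' feature for each 'main' feature.
idx <- st_nearest_feature(main, sub)

# Copy the matched value and compute the element-wise distance
# between each 'main' point and its matched 'sub' point.
main$Variable      <- sub$Variable[idx]
main$dist_to_match <- st_distance(main, sub[idx, ], by_element = TRUE)

# Hypothetical cutoff (assumption): discard matches farther than 50 km.
max_dist <- units::set_units(50000, "m")
main$Variable[main$dist_to_match > max_dist] <- NA

main
```

With a cutoff like this, a site whose nearest neighbour is still far away ends up with NA instead of a misleading value.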
Upvotes: 9
Reputation: 23574
This is what I tried. It seems that you need st_nearest_feature(), which returns, for each feature in main, the row index of the nearest feature in sub. Once you have the indices, you add them to main. You also add row numbers (the matching indices) to b. Then you join the two on that index.
library(dplyr)
library(sf)
# Which feature in y is closest to each feature in x?
# You get row index
st_nearest_feature(x = main, y = sub)
# Add the index number to main.
main <- mutate(main, ind = st_nearest_feature(x = main, y = sub))
# Add row numbers (index) to b
b <- mutate(b, ind = row_number())
left_join(main, b, by = "ind")
# `Site#` geometry ind Longitude Latitude Variable
# <chr> <POINT [°]> <int> <dbl> <dbl> <dbl>
#1 Site1 (-94.609 38.922) 5 -95.9 39.9 99
#2 Site2 (-98.1391 37.48611) 4 -98.9 37.5 201
#3 Site3 (-99.033 37.811) 6 -99.0 37.8 700
#4 Site4 (-98.49 38.364) 1 -98.5 38.3 400
#5 Site5 (-96.4309 39.4402) 2 -96.2 40.0 50
#6 Site6 (-95.99 39.901) 5 -95.9 39.9 99
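To see what the index returned by st_nearest_feature() represents, one can compare it with the position of the smallest entry in each row of the full distance matrix. This is only a sketch on a small subset of the question's points; the object names d, idx_manual and idx_sf are introduced here for illustration.

```r
library(sf)

# Toy data: a subset of the points from the question.
main <- st_as_sf(
  data.frame(site = c("Site1", "Site4"),
             lon  = c(-94.609, -98.49),
             lat  = c(38.922, 38.364)),
  coords = c("lon", "lat"), crs = 4326
)
sub <- st_as_sf(
  data.frame(lon      = c(-98.49567, -96.22451, -95.91411),
             lat      = c(38.31264, 39.97692, 39.92143),
             Variable = c(400, 50, 99)),
  coords = c("lon", "lat"), crs = 4326
)

# Full distance matrix: one row per 'main' feature, one column per 'sub' feature.
d <- st_distance(main, sub)

# Drop the units and take the column with the smallest distance in each row.
d_num      <- matrix(as.numeric(d), nrow = nrow(main))
idx_manual <- apply(d_num, 1, which.min)

# The same lookup done by sf directly.
idx_sf <- st_nearest_feature(main, sub)

# Compare the two sets of indices.
all(idx_manual == idx_sf)
```

The distance matrix grows with nrow(main) * nrow(sub), so for large data st_nearest_feature() is the more practical choice; the matrix is mainly useful for checking results.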
Upvotes: 2