Reputation: 389
I have a large dataset of GPS locations (480,476 points) and a large shapefile of roads (561,487 lines), and I want to efficiently calculate the distance to the nearest road for each point.
I tried using st_distance(points_sf, roads_sf), which calculates a full distance matrix: the Euclidean distance between every point in points_sf (1:480476) and every line in roads_sf (1:561487). That matrix was too big to compute in one go, so I split the points into smaller chunks by ID, which was more manageable. Even so, it took a long time (> 1 day) and used so much memory that I couldn't run anything else on the computer without risking a crash (even with 64 GB of RAM).
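For a sense of scale, here is a rough back-of-the-envelope estimate of the full matrix size. It assumes each distance is stored as an 8-byte double and ignores any overhead from sf or units; the variable names are just for illustration.

```r
# Rough size of the full points-by-roads distance matrix
n_points <- 480476
n_roads  <- 561487

n_cells  <- n_points * n_roads   # one distance per point/road pair
size_tb  <- n_cells * 8 / 1e12   # assuming 8 bytes per double; decimal terabytes

n_cells
size_tb
```

That comes out to roughly 2 TB, far more than 64 GB of RAM, which is why the matrix only fits when processed in chunks.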
Is there a more computationally efficient way to do this sort of calculation in R?
Upvotes: 0
Views: 146
Reputation: 389
What I have so far is to use st_nearest_feature() to get the index of the nearest line for each point, and then calculate the distance between each point and only its nearest line, rather than every line.
This was my solution:
# index of the nearest road for each point
near_rd = st_nearest_feature(points_sf, roads_sf)
# distance from each point to its nearest road only
dist_rd = sapply(seq_along(near_rd), function(x)
  st_distance(points_sf[x,], roads_sf[near_rd[x],]))
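The per-point loop above can probably be avoided: st_distance() has a by_element argument that computes distances pairwise between matching rows instead of building a matrix. Below is a minimal sketch of that idea on toy data. The coordinates and the CRS (EPSG:32633) are made up for illustration and only stand in for the real points_sf and roads_sf; I have not timed this on the full dataset.

```r
library(sf)

# Toy stand-ins for points_sf and roads_sf (hypothetical data, projected CRS in metres)
points_sf <- st_as_sf(data.frame(x = c(0, 5, 10), y = c(1, 8, -3)),
                      coords = c("x", "y"), crs = 32633)
roads_sf <- st_sf(geometry = st_sfc(
  st_linestring(rbind(c(-1, 0), c(11, 0))),    # road 1 along y = 0
  st_linestring(rbind(c(-1, 10), c(11, 10))),  # road 2 along y = 10
  crs = 32633
))

# Index of the nearest road for each point
near_rd <- st_nearest_feature(points_sf, roads_sf)

# Pairwise distance: point i to road near_rd[i], in one vectorised call
dist_rd <- st_distance(points_sf, roads_sf[near_rd, ], by_element = TRUE)

points_sf$dist_rd <- as.numeric(dist_rd)  # drop units; metres in this CRS
```

The result is one distance per point, so memory use grows with the number of points rather than with points times roads.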
Upvotes: 0