wake_wake

Reputation: 1204

R - Merging city name to approximate lat-long coordinates

I want to merge city names to approximate coordinates.

I have two datasets.

  1. lat-long for cities, called cities.
  2. lat-long for observed events, called events.

Most of the events occur just outside the lat-long of a city.

I want to merge in the city from cities if its lat and lon each differ by at most 1 degree from those listed in events.

The nearest function in data.table seems to be too crude.

What would you do? Use maptools?

Example:

library(data.table)

cities <- data.table(city = c("A", "B", "C"),
                 lat = c(23.4, 43.5, 21.3),
                 lon = c(100, 98.4, -78.2))

events <- data.table(event = c("X1", "Y1", "B1"),
                 lat = c(24.4, 42.5, 23.3),
                 lon = c(101, 100.4, -78.2))

result <- data.table(event = c("X1", "Y1", "B1"),
                 lat = c(23.4, 43.5, 21.3),
                 lon = c(100, 98.4, -78.2),
                 city = c("A", NA, NA))

> result
   event  lat   lon city
1:    X1 23.4 100.0    A
2:    Y1 43.5  98.4 <NA>
3:    B1 21.3 -78.2 <NA>

Upvotes: 0

Views: 152

Answers (1)

Wimpel

Reputation: 27802

method 1: non-equi join

This non-equi update join does the trick. It only works because you set a hard 1-degree limit, though. The problem is that the ground distance covered by one degree of longitude varies with latitude, so a 1-degree box is not the same size everywhere on the globe.

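# add a 1-degree bounding box around each city (modifies cities by reference)
cities[, `:=`(lat_min = lat - 1, lat_max = lat + 1,
              lon_min = lon - 1, lon_max = lon + 1)]
# non-equi update join: copy the city name to each event inside a city's box
events[cities, city := i.city,
       on = .(lat >= lat_min, lat <= lat_max, lon >= lon_min, lon <= lon_max)][]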

#    event  lat   lon city
# 1:    X1 24.4 101.0    A
# 2:    Y1 42.5 100.4 <NA>
# 3:    B1 23.3 -78.2 <NA>
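
To put a rough number on the caveat about degrees, here is a small base R sketch. It assumes a spherical Earth with a radius of 6371 km, and deg_lon_km is a hypothetical helper name, not part of data.table or any package. On that assumption, one degree of longitude is about 111 km at the equator and half of that at 60 degrees latitude.

```r
# Approximate east-west length (km) of one degree of longitude at a latitude,
# ASSUMING a spherical Earth with radius 6371 km (an approximation)
deg_lon_km <- function(lat, radius_km = 6371) {
  (pi / 180) * radius_km * cos(lat * pi / 180)
}

# compare the equator, the latitudes of cities A and B, and 60 degrees north
lats <- c(0, 23.4, 43.5, 60)
data.frame(lat = lats, km_per_deg_lon = round(deg_lon_km(lats), 1))
```

So the same 1-degree longitude window is noticeably narrower on the ground around city B than around city A.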

method 2: based on absolute distance

If you want to set a maximum distance between events and cities, you'll need a spatial solution like this:

#maximum distance between event and city (in metres)
max_dist = 180000

library( sf )
#create simple (point) features of events and cities
cities.sf <- st_as_sf( cities, coords = c("lon", "lat"), crs = 4326 )
events.sf <- st_as_sf( events, coords = c("lon", "lat"), crs = 4326 )

#spatial join
st_join( events.sf, cities.sf, join = st_is_within_distance, dist = max_dist )

# Simple feature collection with 3 features and 2 fields
# geometry type:  POINT
# dimension:      XY
# bbox:           xmin: -78.2 ymin: 23.3 xmax: 101 ymax: 42.5
# CRS:            EPSG:4326
#   event city           geometry
# 1    X1    A   POINT (101 24.4)
# 2    Y1 <NA> POINT (100.4 42.5)
# 3    B1 <NA> POINT (-78.2 23.3)
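
If installing sf is not an option, the same idea can be sketched in base R with the haversine formula. This is only an approximation under stated assumptions: a spherical Earth with a radius of 6371 km, so distances can differ slightly from what sf computes. The names haversine_m, dist_m, nearest and nearest_dist are hypothetical, chosen for this example. Unlike st_join, which returns one row per matching city, this keeps only the closest city for each event.

```r
# Example data from the question, as plain data frames
cities <- data.frame(city = c("A", "B", "C"),
                     lat  = c(23.4, 43.5, 21.3),
                     lon  = c(100, 98.4, -78.2),
                     stringsAsFactors = FALSE)
events <- data.frame(event = c("X1", "Y1", "B1"),
                     lat   = c(24.4, 42.5, 23.3),
                     lon   = c(101, 100.4, -78.2),
                     stringsAsFactors = FALSE)

# Great-circle distance in metres (haversine formula, spherical Earth assumed)
haversine_m <- function(lat1, lon1, lat2, lon2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# maximum distance between event and city (in metres), as in the answer
max_dist <- 180000

# distance matrix: one row per event, one column per city
dist_m <- outer(seq_len(nrow(events)), seq_len(nrow(cities)),
                function(i, j) haversine_m(events$lat[i], events$lon[i],
                                           cities$lat[j], cities$lon[j]))

# index of the closest city for each event, and the distance to it
nearest      <- apply(dist_m, 1, which.min)
nearest_dist <- dist_m[cbind(seq_len(nrow(events)), nearest)]

# keep the city name only when the closest city is within max_dist
events$city <- ifelse(nearest_dist <= max_dist, cities$city[nearest], NA)
events
```

The full distance matrix is fine for small inputs like this one; for large datasets the sf approach above is likely the better choice.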

Upvotes: 4
