Petr

Reputation: 1817

Calculating distances from capital city using coordinates [R]

I have two large data frames containing various variables, and I need to add a variable distance_from_capital_city, defined as follows:

One data frame has all country names, their capital cities and the capitals' coordinates (cap_coordinances in the example below). The other data frame has observations located in the same countries, sometimes in the capital city and sometimes not.

I need to add the variable distance_from_capital_city to the real_data data frame (in the example below), and the result should look like this:

The first four rows of distance_from_capital_city in real_data should be equal to zero (or some small number, because the coordinates do not have to match exactly, rounding error, etc.), and the fifth row should contain the distance from Barcelona to Madrid (grouped by country). The distance should be measured in kilometers from the capital city, or as a Euclidean distance, or in any other suitable measure.

Using, for example, this function:

library(geosphere)
distm(c(lon1, lat1), c(lon2, lat2), fun = distHaversine)

I give an example of the expected result below (the numbers are for illustration):

library(tibble)
cap_coordinances = 
  tribble(
  ~country_txt, ~city, ~longitude, ~latitude, 
  "Greece",             "Athens",       23.8,       37.9,
  "Italy",              "Rome",         12.5,       41.9,
  "Netherlands",        "Amsterdam",     4.90,      52.4,
  "Spain",              "Madrid",       -0.743,     41.0,
)

real_data = 
  tribble(
  ~country_txt, ~city, ~longitude, ~latitude,
  "Greece",             "Athens",       23.762728,  37.99749,
  "Italy",              "Rome",         12.490069,  41.89096,
  "Netherlands",        "Amsterdam",     4.90,      52.4,
  "Spain",              "Madrid",       -0.743,     41.0,
  "Spain",              "Barcelona",     2.15,      41.3
)

result = 
  tribble(
    ~country_txt, ~city,  ~longitude, ~latitude, ~distance_from_capital_city,
    "Greece", "Athens", 23.762728, 37.99749, "0 or small number",
    "Italy", "Rome", 12.490069, 41.89096, "0 or small number",
    "Netherlands", "Amsterdam", 4.90, 52.4, "0 or small number",
    "Spain", "Madrid", -0.743, 41.0, "0 or small number",
    "Spain", "Barcelona", 2.15, 41.3, 3500

  )


I cannot solve this on my own, so I would appreciate any advice.

The data I am using are public and can be downloaded here:

Upvotes: 1

Views: 332

Answers (2)

akrun

Reputation: 887213

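We can do a join on 'country_txt' and then calculate the distance between the corresponding 'longitude' and 'latitude' columns. Note that distHaversine returns meters by default, so the distances below are in meters.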

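library(dplyr)
library(purrr)      # provides pmap_dbl()
library(geosphere)

real_data %>% 
    # attach each country's capital coordinates (.y) to every row (.x)
    left_join(cap_coordinances, by = 'country_txt') %>% 
    transmute(country_txt, city = city.x,
     # row-wise Haversine distance between the observation and its capital
     distance = pmap_dbl(.[c('longitude.x', 'latitude.x', 
   'longitude.y', 'latitude.y')], 
      ~ distm(c(..1, ..2), c(..3, ..4), fun = distHaversine) %>% as.vector)) 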
# A tibble: 5 x 3
#  country_txt city      distance
#  <chr>       <chr>        <dbl>
#1 Greece      Athens      11335.
#2 Italy       Rome         1300.
#3 Netherlands Amsterdam       0 
#4 Spain       Madrid          0 
#5 Spain       Barcelona  244775.
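
If kilometers are wanted, as in the question, the same look-up-then-distance idea can be sketched in base R. This is only an illustration under stated assumptions: haversine_km is a hypothetical hand-written helper standing in for geosphere::distHaversine (it assumes the same spherical Earth radius, 6378.137 km), and the data frames are rebuilt with data.frame() so the snippet runs without any packages.

```r
# Hypothetical helper (not part of geosphere): vectorised haversine
# great-circle distance in kilometers on a sphere of radius r.
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6378.137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against sqrt(a) creeping above 1 through rounding
  2 * r * asin(pmin(1, sqrt(a)))
}

# Same example data as in the question, as plain data frames
cap_coordinances <- data.frame(
  country_txt = c("Greece", "Italy", "Netherlands", "Spain"),
  city        = c("Athens", "Rome", "Amsterdam", "Madrid"),
  longitude   = c(23.8, 12.5, 4.90, -0.743),
  latitude    = c(37.9, 41.9, 52.4, 41.0)
)

real_data <- data.frame(
  country_txt = c("Greece", "Italy", "Netherlands", "Spain", "Spain"),
  city        = c("Athens", "Rome", "Amsterdam", "Madrid", "Barcelona"),
  longitude   = c(23.762728, 12.490069, 4.90, -0.743, 2.15),
  latitude    = c(37.99749, 41.89096, 52.4, 41.0, 41.3)
)

# Row of cap_coordinances that holds each observation's capital;
# match() keeps the row order of real_data unchanged.
idx <- match(real_data$country_txt, cap_coordinances$country_txt)

real_data$distance_from_capital_city <- haversine_km(
  real_data$longitude, real_data$latitude,
  cap_coordinances$longitude[idx], cap_coordinances$latitude[idx]
)

print(real_data)
```

Because haversine_km is vectorised, no row-wise loop is needed. With the example coordinates the Barcelona row comes out at roughly 245 km, which agrees with the 244775 m printed above.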

Upvotes: 2

Jeffrey Evans

Reputation: 2397

Here is how you would do it using sp. The sf solution would be similar, using st_distance, and you could use pipes; I just find the coercion to a spatial object more straightforward with sp. Note that because your data are in decimal degrees, longlat = TRUE is set, so the result is a great-circle distance in kilometers.

library(tibble)
library(sp)

cap_coordinances = 
  tribble(
  ~country_txt, ~city, ~longitude, ~latitude, 
  "Greece",             "Athens",       23.8,       37.9,
  "Italy",              "Rome",         12.5,       41.9,
  "Netherlands",        "Amsterdam",     4.90,      52.4,
  "Spain",              "Madrid",       -0.743,     41.0,
)

real_data = 
tribble(
  ~country_txt,        ~city,  ~longitude,  ~latitude,
   "Greece",           "Athens", 23.762728, 37.99749,
   "Italy",            "Rome", 12.490069, 41.89096,
   "Netherlands",      "Amsterdam",     4.90,      52.4,
   "Spain",            "Madrid",       -0.743,     41.0,
   "Spain",            "Barcelona",   2.15, 41.3
)

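# promote both data frames to spatial point objects
coordinates(cap_coordinances) <- ~longitude+latitude
coordinates(real_data) <- ~longitude+latitude

# distance matrix in km: rows are observations, columns are capitals
d <- spDists(real_data, cap_coordinances, longlat = TRUE) 
rownames(d) <- real_data$city
colnames(d) <- cap_coordinances$city
print(d)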
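
The matrix d holds every observation against every capital, while the question asks for a single column. A minimal sketch of that last step, assuming the sp package is installed: it passes plain coordinate matrices to spDists and then picks, for each row, the column of that row's own capital. The names obs, cap and own_cap are hypothetical and used only for this illustration.

```r
library(sp)

# Same example data as above, kept as plain data frames
cap <- data.frame(
  country_txt = c("Greece", "Italy", "Netherlands", "Spain"),
  city        = c("Athens", "Rome", "Amsterdam", "Madrid"),
  longitude   = c(23.8, 12.5, 4.90, -0.743),
  latitude    = c(37.9, 41.9, 52.4, 41.0)
)

obs <- data.frame(
  country_txt = c("Greece", "Italy", "Netherlands", "Spain", "Spain"),
  city        = c("Athens", "Rome", "Amsterdam", "Madrid", "Barcelona"),
  longitude   = c(23.762728, 12.490069, 4.90, -0.743, 2.15),
  latitude    = c(37.99749, 41.89096, 52.4, 41.0, 41.3)
)

# Full great-circle distance matrix in km:
# rows = observations, columns = capitals
d <- spDists(
  as.matrix(obs[, c("longitude", "latitude")]),
  as.matrix(cap[, c("longitude", "latitude")]),
  longlat = TRUE
)

# Column index of each observation's own capital
own_cap <- match(obs$country_txt, cap$country_txt)

# A two-column (row, column) index matrix selects one cell per row
obs$distance_from_capital_city <- d[cbind(seq_len(nrow(obs)), own_cap)]

print(obs)
```

Computing the full matrix is wasteful for large data, since only one cell per row is kept; for big inputs a row-wise or joined approach is likely the better fit.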

Upvotes: 0
