Carlo Soares

Reputation: 63

Relate lat and long of two dataframes where coordinates are not exactly the same (shortest distance)

I have two dataframes, both with latitude and longitude columns, and I want to compare the coordinates between them. For example, Market1 in dataframe x has latitude -22.89290 and longitude -48.45048, which is close (but not equal) to the third row of dataframe y, with latitude -22.89287 and longitude -48.45075. The same holds for the other markets. Note that the leading digits match: here -22.892 for lat and -48.450 for long. When the coordinates match in this way, I want to copy the name from the Marketname column of dataframe x into the Marketname column of dataframe y, as shown in the expected output.

x<-structure(list(Latitude = c(-22.8928950233225, -22.929618895716, 
-22.8773751675936), Longitude = c(-48.4504779883472, -48.4515412645053, 
-48.4364903609011), Marketname = c("Market1", "Market2", 
"Market3")), row.names = c(NA, 3L), class = "data.frame")

> x
   Latitude Longitude Marketname
1 -22.89290 -48.45048    Market1
2 -22.92962 -48.45154    Market2
3 -22.87738 -48.43649    Market3


y<- structure(list(lat = c(-22.8715263, -22.8774825, -22.8928723, 
-22.9295906), lng = c(-48.440335, -48.4364964, -48.4507477, -48.4516264
), Marketname = c("Market0","0", "0", "0")), row.names = c(NA, 4L), class = "data.frame")

 > y   
        lat       lng Marketname
1 -22.87153 -48.44033    Market0
2 -22.87748 -48.43650          0
3 -22.89287 -48.45075          0
4 -22.92959 -48.45163          0

Expected output

 > y   
            lat       lng Marketname
    1 -22.87153 -48.44033    Market0
    2 -22.87748 -48.43650    Market3
    3 -22.89287 -48.45075    Market1      
    4 -22.92959 -48.45163    Market2      

Upvotes: 1

Views: 98

Answers (1)

jay.sf

Reputation: 72838

Since you seem to need the shortest distance to each market based on geographic coordinates on an ellipsoid, you should use something like geosphere::distGeo. To compare each coordinate pair of y with each of x, we can Vectorize the function and use an outer approach. Note that we don't want the first row of y (it already has a name), so we drop the first element of the sequence 1, ..., n with seq_len(nrow(y))[-1]. The columns are selected as 2:1 because distGeo expects longitude first, then latitude.

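# distance in metres between row i of y and row j of x
f <- Vectorize(\(i, j) geosphere::distGeo(y[i, 2:1], x[j, 2:1]))
# rows: y without its first row; columns: x; keep the index of the nearest x per row
u <- outer(seq_len(nrow(y))[-1], seq_len(nrow(x)), f) |> apply(1, which.min)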
u
# [1] 3 1 2
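
As a possible alternative to the Vectorize/outer step, geosphere also provides distm, which builds the whole distance matrix in one call. The following is only a sketch: it rebuilds the question's data so that it runs on its own, and the names d and u2 are made up for illustration.

```r
library(geosphere)

# Data from the question, rebuilt so the sketch is self-contained
x <- data.frame(
  Latitude   = c(-22.8928950233225, -22.929618895716, -22.8773751675936),
  Longitude  = c(-48.4504779883472, -48.4515412645053, -48.4364903609011),
  Marketname = c("Market1", "Market2", "Market3")
)
y <- data.frame(
  lat = c(-22.8715263, -22.8774825, -22.8928723, -22.9295906),
  lng = c(-48.440335, -48.4364964, -48.4507477, -48.4516264),
  Marketname = c("Market0", "0", "0", "0")
)

# distm wants two-column (lon, lat) inputs; rows of d are y[-1, ], columns are x
d <- distm(as.matrix(y[-1, c("lng", "lat")]),
           as.matrix(x[, c("Longitude", "Latitude")]),
           fun = distGeo)

# Index of the nearest market for each remaining row of y
u2 <- unname(apply(d, 1, which.min))
u2
# [1] 3 1 2
```

This avoids one distGeo call per pair of rows, which may matter for larger dataframes.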

Insert into y like so:

y[-1, 3] <- x[u, 3]
y
#         lat       lng Marketname
# 1 -22.87153 -48.44033    Market0
# 2 -22.87748 -48.43650    Market3
# 3 -22.89287 -48.45075    Market1
# 4 -22.92959 -48.45163    Market2
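
If you would rather not exclude the first row by hand, one option is to accept a match only when the nearest market lies within some maximum distance. The sketch below makes several assumptions: it swaps distGeo for a base-R haversine formula (spherical Earth, so slightly less accurate than the ellipsoidal distance), and the helper haversine_m and the 100 m cutoff max_dist_m are hypothetical choices, not part of geosphere or of the question.

```r
# Data from the question, rebuilt so the sketch is self-contained
x <- data.frame(
  Latitude   = c(-22.8928950233225, -22.929618895716, -22.8773751675936),
  Longitude  = c(-48.4504779883472, -48.4515412645053, -48.4364903609011),
  Marketname = c("Market1", "Market2", "Market3")
)
y <- data.frame(
  lat = c(-22.8715263, -22.8774825, -22.8928723, -22.9295906),
  lng = c(-48.440335, -48.4364964, -48.4507477, -48.4516264),
  Marketname = c("Market0", "0", "0", "0")
)

# Hypothetical helper: great-circle distance in metres on a sphere of radius r
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  rad  <- pi / 180
  dlat <- (lat2 - lat1) * rad
  dlon <- (lon2 - lon1) * rad
  a <- sin(dlat / 2)^2 + cos(lat1 * rad) * cos(lat2 * rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Distance matrix: rows are all rows of y, columns are rows of x
d <- outer(seq_len(nrow(y)), seq_len(nrow(x)),
           function(i, j) haversine_m(y$lat[i], y$lng[i],
                                      x$Latitude[j], x$Longitude[j]))

nearest  <- apply(d, 1, which.min)                # closest market per row of y
min_dist <- d[cbind(seq_len(nrow(y)), nearest)]   # distance to that market

max_dist_m <- 100                                 # assumed cutoff, tune for your data
hit <- min_dist <= max_dist_m

# Overwrite the name only where a market is close enough
y$Marketname[hit] <- x$Marketname[nearest[hit]]
y
```

With this data the first row of y is several hundred metres from its nearest market, so it keeps Market0, while the other three rows are within a few tens of metres of theirs.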

Upvotes: 1
