Puneet Matai

Reputation: 11

How to transform a nested for-loop operation to a more efficient code in R

I am a beginner when it comes to R. For one of my tasks I need to count the number of attractions within 2 km of a specific location, where both the attractions and the locations are given by longitude and latitude. The main data set has around 29K records and there are 28 attractions. How can I rewrite the following code so that it performs better? The current version is crude and not good practice.

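library(geosphere)  # provides distVincentySphere()

# Preallocate the result: one count per listing
attr_count <- integer(nrow(mainData))

for (i in seq_len(nrow(mainData))) {
  loc_coord <- c(mainData$longitude[i], mainData$latitude[i])
  for (j in seq_len(nrow(ny_attractions))) {
    attr_coord <- c(ny_attractions$lon[j], ny_attractions$lat[j])
    # Great-circle distance in metres between attraction and listing
    dist <- distVincentySphere(attr_coord, loc_coord)
    if (dist <= 2000) {
      attr_count[i] <- attr_count[i] + 1L
    }
  }
}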

[EDIT]: My apologies for not stating this clearly earlier. Here is an example of what I am trying to achieve. I have two data sets:

Dataset 1 (NYC_attractions) (27 records)


Dataset 2 (master data for house listings) (29K records)


Now I need to add one more column (num_of_attractions) to Dataset 2, holding the number of attractions within 2 km of each listing (i.e. one value per record in Dataset 2).

I hope this explains the problem.

Thanks

Upvotes: 0

Views: 46

Answers (1)

Billy34

Reputation: 2214

Hello, your question is partly answered here: https://stackoverflow.com/a/49860968/3042154. Since you use geodetic coordinates (lat/lon) rather than projected coordinates (meters), it can be done in two steps. First, roughly select potential neighbours using the Euclidean distance, as in the linked answer. Then refine the selection using your own distance function.
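For illustration, here is a minimal sketch of that two-step idea in base R. It is only a sketch under assumptions: the column names (longitude, latitude, lon, lat) come from the question, the sample rows are made up, and the helper names haversine_m and count_nearby are hypothetical. The haversine formula stands in for distVincentySphere so the example runs without extra packages, and the rough prefilter is a padded lat/lon box rather than the method of the linked answer.

```r
# Made-up sample data; column names follow the question.
mainData <- data.frame(
  longitude = c(-73.9855, -73.9442, -74.0445),
  latitude  = c( 40.7580,  40.6782,  40.6892)
)
ny_attractions <- data.frame(
  lon = c(-73.9857, -73.9683, -74.0445),
  lat = c( 40.7484,  40.7851,  40.6892)
)

# Hypothetical helper: great-circle (haversine) distance in metres
# on a sphere of radius r. Vectorised over all arguments.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: number of attractions within radius_m of each listing.
count_nearby <- function(listings, attractions, radius_m = 2000) {
  n <- nrow(listings)
  m <- nrow(attractions)
  # All listing/attraction pairs; the listing index i varies fastest.
  i <- rep(seq_len(n), times = m)
  j <- rep(seq_len(m), each = n)

  # Step 1: rough prefilter on a padded lat/lon box
  # (one degree of latitude is roughly 111 km; the 1.2 factor is padding).
  # Assumption: no pairs straddle the +/-180 degree meridian.
  pad_deg <- 1.2 * radius_m / 111000
  dlat <- abs(listings$latitude[i] - attractions$lat[j])
  dlon <- abs(listings$longitude[i] - attractions$lon[j])
  keep <- dlat <= pad_deg &
    dlon <= pad_deg / cos(listings$latitude[i] * pi / 180)
  i <- i[keep]
  j <- j[keep]

  # Step 2: exact great-circle distance for the surviving pairs only.
  d <- haversine_m(listings$longitude[i], listings$latitude[i],
                   attractions$lon[j], attractions$lat[j])

  # Count hits per listing; tabulate() keeps zeros for listings with no hits.
  tabulate(i[d <= radius_m], nbins = n)
}

mainData$num_of_attractions <- count_nearby(mainData, ny_attractions)
print(mainData)
```

With only 28 attractions the prefilter probably saves little, and most of the speed-up comes from replacing the nested loop with vectorised arithmetic. If you prefer to keep geosphere, distVincentySphere should be usable in place of haversine_m in step 2, since it also takes longitude/latitude pairs and returns metres.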

Upvotes: 1
