Reputation: 15
I am using R, and have built a matrix to calculate the distance between n locations. I need to calculate the nearest neighbor for each location and populate a resultant matrix of Location ID, Nearest Neighbor ID, and the distance.
Here is the data frame. (The application code uses a function to calculate the distances from longitude/latitude; below is the resulting data frame.) I also set each location's distance to itself to 9999 so that it won't be chosen as its own nearest neighbor.
distdf <- data.frame("New York" = c(0, 713, 2451, 748), "Chicago" = c(713, 0, 1745, 587), "Los Angeles" = c(2451,1745, 0, 1936), "Atlanta" = c(748, 587, 1936, 0), row.names = c("New York", "Chicago", "Los Angeles", "Atlanta"))
distdf[distdf == 0] <- 9999
From here, I want to find the minimum distance and the row which has that value. So the result would look like this:
result <- data.frame("NearNeigh" = c("Chicago", "Atlanta", "Chicago", "Chicago"), "Dist" = c(713, 587, 1745, 587), row.names = c("New York", "Chicago", "Los Angeles", "Atlanta"))
I have been able to find the nearest neighbor through something like this, but I fear I am heading down the wrong road:
l1 <- apply(distdf, 2, which.min)
l1df <- as.data.frame(l1)
Reputation: 389047
l1 gives the index of the minimum value in each column. To get the minimum value itself, use min.
You can create the final data frame like this:
l1 <- apply(distdf, 2, which.min)
l2 <- apply(distdf, 2, min)
result <- data.frame(City = names(distdf),
                     NearNeigh = rownames(distdf)[l1],
                     Dist = l2, row.names = NULL)
result
#         City NearNeigh Dist
#1    New.York   Chicago  713
#2     Chicago   Atlanta  587
#3 Los.Angeles   Chicago 1745
#4     Atlanta   Chicago  587
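As a variation, here is a minimal sketch that avoids the 9999 sentinel by masking the diagonal with Inf on a matrix copy. The helper name nearest_neighbor and the use of check.names = FALSE (to keep the spaces in the city names) are my own assumptions, not part of the question's code.

```r
# Hypothetical helper: nearest neighbor per column of a square distance table.
nearest_neighbor <- function(d) {
  m <- as.matrix(d)
  diag(m) <- Inf                              # a location is never its own neighbor
  idx <- unname(apply(m, 2, which.min))       # row index of the smallest distance per column
  data.frame(City = colnames(m),
             NearNeigh = rownames(m)[idx],
             Dist = m[cbind(idx, seq_len(ncol(m)))],  # look up the matching distances
             row.names = NULL)
}

# Same distances as in the question, with the zeros left on the diagonal
distdf <- data.frame("New York" = c(0, 713, 2451, 748),
                     "Chicago" = c(713, 0, 1745, 587),
                     "Los Angeles" = c(2451, 1745, 0, 1936),
                     "Atlanta" = c(748, 587, 1936, 0),
                     row.names = c("New York", "Chicago", "Los Angeles", "Atlanta"),
                     check.names = FALSE)

res <- nearest_neighbor(distdf)
res
```

Using Inf instead of a large number means the result stays correct even if some real distance happens to exceed the sentinel, and genuine zero distances between distinct locations are not overwritten.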
Reputation: 887223
Here is an option with max.col and pmin:
data.frame(NearNeigh = names(distdf)[max.col(-distdf, 'first')],
           Dist = do.call(pmin, distdf))
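To see why this works: max.col returns, for each row, the column index of the largest value, so negating the distances turns that into the column of the smallest one. do.call(pmin, ...) takes the element-wise minimum across the columns, which is the row-wise minimum. A small sketch with a made-up 3x3 table (the values and the names d, nn_idx, nn_dist are only for illustration):

```r
# Toy symmetric distances (made-up values), self-distances masked with a sentinel
d <- data.frame(A = c(9999, 3, 7),
                B = c(3, 9999, 2),
                C = c(7, 2, 9999),
                row.names = c("A", "B", "C"))

# Column holding the row-wise minimum: negate, then take the first maximum per row
nn_idx <- max.col(-d, ties.method = "first")

# Row-wise minimum: pmin applied across the columns of the data frame
nn_dist <- do.call(pmin, d)

data.frame(NearNeigh = names(d)[nn_idx], Dist = nn_dist, row.names = rownames(d))
```

Note that this works row by row, while apply(distdf, 2, ...) works column by column. For a symmetric distance table the two give the same answer.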