vincek

Reputation: 15

How to find the minimum value and its index in an R data frame and restructure it into a tidy data frame

I am using R, and have built a matrix to calculate the distance between n locations. I need to calculate the nearest neighbor for each location and populate a resultant matrix of Location ID, Nearest Neighbor ID, and the distance.

Here is the data frame. (Note: the application code uses a function to calculate the distances from longitude/latitude; below is the resulting data frame.) I also set each location's distance to itself to 9999 so that it won't be chosen as its own nearest neighbor.

distdf <- data.frame("New York" = c(0, 713, 2451, 748),
                     "Chicago" = c(713, 0, 1745, 587),
                     "Los Angeles" = c(2451, 1745, 0, 1936),
                     "Atlanta" = c(748, 587, 1936, 0),
                     row.names = c("New York", "Chicago", "Los Angeles", "Atlanta"))

distdf[distdf == 0] <- 9999

From here, I want to find the minimum distance and the row which has that value. So the result would look like this:

result <- data.frame("NearNeigh" = c("Chicago", "Atlanta", "Chicago", "Chicago"), "Dist" = c(713, 587, 1745, 587), row.names = c("New York", "Chicago", "Los Angeles", "Atlanta"))

I have been able to find the nearest neighbor through something like this, but I fear I am heading down the wrong road:

l1 <- apply(distdf, 2, which.min)

l1df <- as.data.frame(l1)

Upvotes: 0

Views: 886

Answers (2)

Ronak Shah

Reputation: 389047

l1 gives the index of the minimum value in each column. To get the minimum value itself, use min.

You can create the final data frame as follows:

l1 <- apply(distdf, 2, which.min)
l2 <- apply(distdf, 2, min)

result <- data.frame(City = names(distdf), 
                     NearNeigh = rownames(distdf)[l1], 
                     Dist = l2, row.names = NULL)

result
#         City NearNeigh Dist
#1    New.York   Chicago  713
#2     Chicago   Atlanta  587
#3 Los.Angeles   Chicago 1745
#4     Atlanta   Chicago  587
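
As a variation on the approach above, here is a minimal sketch that keeps the city names exactly as written (data.frame() turns "New York" into New.York) and marks self-distances as NA instead of using the 9999 sentinel. The names cities, distmat, nearest_idx and tidy_nn are made up for this illustration and are not part of the original question.

```r
# Same distances as in the question, stored as a matrix so names keep their spaces.
cities <- c("New York", "Chicago", "Los Angeles", "Atlanta")
distmat <- matrix(c(   0,  713, 2451,  748,
                     713,    0, 1745,  587,
                    2451, 1745,    0, 1936,
                     748,  587, 1936,    0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(cities, cities))

# Mark self-distances as missing rather than using a large sentinel value.
diag(distmat) <- NA

# which.min() discards NA entries; min() needs na.rm = TRUE to do the same.
nearest_idx <- apply(distmat, 1, which.min)

# One row per city: the city, its nearest neighbor, and the distance.
tidy_nn <- data.frame(City      = rownames(distmat),
                      NearNeigh = colnames(distmat)[nearest_idx],
                      Dist      = apply(distmat, 1, min, na.rm = TRUE),
                      row.names = NULL)
tidy_nn
```

Using NA avoids the case where a real distance happens to exceed the sentinel, at the cost of remembering na.rm = TRUE when calling min.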

Upvotes: 0

akrun

Reputation: 887223

Here is an option with max.col and pmin:

data.frame(NearNeigh = names(distdf)[max.col(-distdf, 'first')],
           Dist = do.call(pmin, distdf))
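
To see why this works, the sketch below compares each piece with its apply() equivalent on the question's data. max.col() returns the column position of each row's maximum, so negating the distances makes it behave like a row-wise which.min(); pmin() applied across the columns returns each row's minimum. The helper names (idx_maxcol, idx_apply, dist_pmin, dist_apply, res) and the added City column are assumptions for illustration only.

```r
# Rebuild the question's data frame (column names become New.York, Los.Angeles).
distdf <- data.frame("New York" = c(0, 713, 2451, 748),
                     "Chicago" = c(713, 0, 1745, 587),
                     "Los Angeles" = c(2451, 1745, 0, 1936),
                     "Atlanta" = c(748, 587, 1936, 0),
                     row.names = c("New York", "Chicago", "Los Angeles", "Atlanta"))
distdf[distdf == 0] <- 9999

# Row-wise position of the smallest distance, computed two ways.
idx_maxcol <- max.col(-as.matrix(distdf), ties.method = "first")
idx_apply  <- unname(apply(distdf, 1, which.min))

# Row-wise smallest distance, computed two ways.
dist_pmin  <- do.call(pmin, distdf)
dist_apply <- unname(apply(distdf, 1, min))

# Both routes should agree on this data.
stopifnot(all(idx_maxcol == idx_apply), all(dist_pmin == dist_apply))

# Adding the city itself as a column keeps the result self-describing.
res <- data.frame(City      = rownames(distdf),
                  NearNeigh = names(distdf)[idx_maxcol],
                  Dist      = dist_pmin)
res
```

Note that this works along rows while the apply(distdf, 2, ...) answer works along columns; the two agree here because the distance matrix is symmetric.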

Upvotes: 0
