ForeverYang

Reputation: 91

Closest pair in R programming

If I define the following:

X<- sample(200:1000,10)
Y<- sample (200:1000, 10)
plot(X,Y)

This creates 10 random points. How can I find the closest pair, i.e. the two points with the shortest distance between them?

Upvotes: 1

Views: 1315

Answers (2)

bjoseph

Reputation: 2166

You can use the dist() function to find the distance between each pair of points:

set.seed(1)
X<- sample(200:1000,10)
Y<- sample (200:1000, 10)
dat<-data.frame(X,Y)
print(dat)

     X   Y
1  412 364
2  497 341
3  657 748
4  924 506
5  360 813
6  915 596
7  951 770
8  724 987
9  698 501
10 248 815

 dist(dat)
           1         2         3         4         5         6         7         8         9
2   88.05680                                                                                
3  455.50082 437.32025                                                                      
4  531.32664 457.77068 360.35122                                                            
5  452.00111 491.48042 304.02960 642.14095                                                  
6  553.92509 489.64171 299.44616  90.44888 595.91442                                        
7  674.80145 624.62549 294.82198 265.37709 592.56223 177.68511                              
8  696.75893 684.72257 248.21362 520.92322 403.45012 435.15744 314.03503                    
9  317.11985 256.90660 250.37971 226.05530 459.98696 236.88394 369.28309 486.69498          
10 479.89270 535.42226 414.45144 743.27451 112.01786 702.03276 704.43878 506.12251 548.72215

The entry in row 2, column 1 is the Euclidean distance between point 1 (412, 364) and point 2 (497, 341).
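As a quick sanity check, that entry can be reproduced by hand from the Euclidean distance formula. This is only a sketch; the names p1, p2 and d12 are made up for illustration and do not appear in the output above.

```r
# Coordinates of points 1 and 2, copied from the printed data frame
p1 <- c(412, 364)
p2 <- c(497, 341)

# Euclidean distance: square root of the sum of squared coordinate differences
d12 <- sqrt(sum((p1 - p2)^2))
print(d12)  # should agree with the 2,1 entry of dist(dat), about 88.0568
```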

The minimum of the distance matrix is the distance between the two points that are closest together.

min(dist(dat))
[1] 88.0568

This is the distance between points 1 (412, 364) and 2 (497, 341). For a larger number of points, you can identify the pair by looking up the row and column indices of the minimum in the distance matrix:

 d <- dist(dat)  # compute the distances once and reuse them
 which(as.matrix(d) == min(d), arr.ind = TRUE)

returns

  row col
2   2   1
1   1   2

Both rows describe the same pair, because the distance matrix is symmetric. The first and second points in your vectors are the two that are closest together.
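If you would rather get each pair only once, one option is to blank out the upper triangle and the diagonal before searching. The sketch below hard-codes the data printed earlier so it does not depend on the random seed; the names m and closest are just illustrative.

```r
# Data copied from the printed data frame above
X <- c(412, 497, 657, 924, 360, 915, 951, 724, 698, 248)
Y <- c(364, 341, 748, 506, 813, 596, 770, 987, 501, 815)
dat <- data.frame(X, Y)

m <- as.matrix(dist(dat))
# Keep only the lower triangle so every pair appears exactly once
m[upper.tri(m, diag = TRUE)] <- NA

# Row/column indices of the smallest remaining distance
closest <- which(m == min(m, na.rm = TRUE), arr.ind = TRUE)
print(closest)

# Coordinates of the two points in the closest pair
print(dat[closest[1, ], ])
```

If several pairs tie for the minimum, closest will have one row per tied pair.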

Upvotes: 3

Jason

Reputation: 1569

I don't know if this is what you want, but I put together a solution that tells you, for each row, which other row is closest to it. There is probably a more elegant package that does this too, but it was a fun problem to solve :).

X<- sample(200:1000,10)
Y<- sample (200:1000, 10)

df<-data.frame(x=X,y=Y)
for(i in 1:nrow(df)){
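    # Euclidean distance from point i to every point (sqrt, since ^1/2 would only halve the sum)
    d <- sqrt((df[i, 'x'] - df[, 'x'])^2 + (df[i, 'y'] - df[, 'y'])^2)
    d[i] <- Inf  # exclude the point's zero distance to itself
    df[i, 'mindistcol'] <- which.min(d)  # row of the nearest other point (first one on ties)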
}
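For what it's worth, the same per-row result can probably be had without an explicit loop by combining dist() with apply(). This is a sketch, not a tested replacement: the column names nearest and nearest_dist are made up, and the data is hard-coded from the other answer's printout so the result is reproducible.

```r
# Data copied from the data frame printed in the other answer
X <- c(412, 497, 657, 924, 360, 915, 951, 724, 698, 248)
Y <- c(364, 341, 748, 506, 813, 596, 770, 987, 501, 815)
df <- data.frame(x = X, y = Y)

# Full symmetric distance matrix; Inf on the diagonal so a point never matches itself
m <- as.matrix(dist(df))
diag(m) <- Inf

# For each row: index of the nearest other point and the distance to it
df$nearest <- apply(m, 1, which.min)
df$nearest_dist <- apply(m, 1, min)
print(df)
```

This builds the full n-by-n matrix, so it is fine for a handful of points but memory grows quadratically with the number of rows.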

Upvotes: 1
