user1313954

Reputation: 921

How to group numbers in a vector based on proximity in R

Let's say I have a vector of numbers. I'm trying to come up with a script that will divide the vector into (not necessarily equal-sized) sets whose numbers are fairly close to each other relative to the other numbers in the vector. You can assume that the numbers in the vector are in ascending order.

my_list <- c(795, 798, 1190, 1191, 2587, 2693, 2796, 3483, 3668)

That is, I need help coming up with a script that will divide these numbers and assign them to sets such as:

set_1 <- c(795, 798)             # these 2 numbers are fairly close to each other
set_2 <- c(1190, 1191)           # these numbers would be another set
set_3 <- c(2587, 2693, 2796)     # these numbers would be another set relative to the other numbers
set_4 <- c(3483, 3668)           # the last set

Any help or suggestions are greatly appreciated.

Upvotes: 4

Views: 440

Answers (2)

Tyler Rinker

Reputation: 109864

Flodel's answer is way better, as I know enough about cluster analysis to fill a small thimble and still have room left for 2 peas, but here's a basic response using fixed-width bins:

split(my_list, cut(my_list, breaks = seq(0, 4000, by = 1000)))
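Fixed bin edges only work when the groups happen to fall inside them. Since the vector is sorted, one alternative (a rough sketch, not part of the approach above) is to start a new set wherever the gap between neighbouring numbers exceeds a threshold. The name `max_gap` and its value of 300 are assumptions chosen for this data, not something the question specifies.

```r
my_list <- c(795, 798, 1190, 1191, 2587, 2693, 2796, 3483, 3668)

# Hypothetical threshold: a gap larger than this starts a new set
max_gap <- 300

# diff() gives the gap between each number and its predecessor;
# cumsum() over the "gap too large" flags yields a running group id
group_id <- cumsum(c(TRUE, diff(my_list) > max_gap))

sets <- split(my_list, group_id)
sets
```

This relies on the input being in ascending order, as the question states; an unsorted vector would need `sort()` first.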

Upvotes: 3

flodel

Reputation: 89057

In general, what you are asking for is called cluster analysis. There are many possible methods and algorithms for it, many of which are already available in the R packages listed here: http://cran.r-project.org/web/views/Cluster.html.

Here is, for example, how you can cluster your data using hierarchical clustering:

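# Build a hierarchical clustering tree from the pairwise distances
# (hclust uses complete linkage by default)
tree <- hclust(dist(my_list))
# Cut the tree at height 300 to get one group label per number
groups <- cutree(tree, h = 300)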
# [1] 1 1 2 2 3 3 3 4 4
split(my_list, groups)
# $`1`
# [1] 795 798
# 
# $`2`
# [1] 1190 1191
# 
# $`3`
# [1] 2587 2693 2796
# 
# $`4`
# [1] 3483 3668
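If picking a cut height feels arbitrary, `cutree` also accepts a target number of groups through its `k` argument. The sketch below assumes you already know you want 4 sets, which is taken from the example in the question rather than derived from the data; the name `groups_k` is only illustrative.

```r
my_list <- c(795, 798, 1190, 1191, 2587, 2693, 2796, 3483, 3668)

tree <- hclust(dist(my_list))

# Ask for a fixed number of clusters instead of a cut height
groups_k <- cutree(tree, k = 4)

sets_k <- split(my_list, groups_k)
sets_k
```

For this vector, `k = 4` and `h = 300` should give the same grouping; on other data the two can differ, so it is worth checking which parameter is easier to justify.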

Upvotes: 5
