stuppie
stuppie

Reputation: 385

Group numbers into closest groups

For example, I have the numbers 46,47,54,58,60, and 66. I want to make group them in such a way as to make the largest possible group sizes. Numbers get grouped if their values are within plus or minus 10 (inclusive). So, depending on which number you start with, for this example there can be three possible outcomes (shown in the image).

What I want is the second possible outcome, which would occur if you started with 54, as the numbers within 44 to 64 would be grouped, leaving 66 by itself, and creating the largest group (5 items).

I realize I could easily brute force this example, but I really have a long list of numbers and it needs to do this across thousands of numbers.. Can anyone tell me about algorithms I should be reading about or give me suggestions?

enter image description here

Upvotes: 3

Views: 2433

Answers (2)

stuppie
stuppie

Reputation: 385

This is what I wanted to do. Sorry I didn't explain myself very well. Each iteration finds the largest possible group, using the numbers that are left after removing the previous largest group. Matlab code:

function out=groupNums(y)
d=10;
out=[];
if length(y)==1
    out=y;
    return
end
group=[];
for i=1:length(y)
    group{i}=find(y<=y(i)+d & y>=y(i)-d);
end
[~,idx]=max(cellfun(@length,group));

out=[out,{y(group{idx})}];
y(group{idx})=[];
out=[out,groupNums(y)];

Upvotes: 0

Fallen
Fallen

Reputation: 4565

You can simply sort the array first. Then for every i th number you can do a binary search to find the right most number that's within ith number + 20 range, let the position of such right most index is X. You have to find the largest (X-i+1) for all ith numbers and we are done :)

Runtime analysis: Runtime for this algorithm will be O(NlgN), where N is the number of items in the original array.

A better solution: Let's assume we have the array ar[] and ar[] has N items.

  1. sort ar[] in non decreasing order
  2. set max_result = 0, set cur_index = 0, i=0
  3. increase i while i
  4. set max_result to max(max_result,i-cur_index+1)
  5. set cur_index=cur_index+1
  6. if cur_index

Runtime Analysis: O(N), where N is the number of items in the array ar[] as cur_index will iterate through the array exactly once and i will iterate just once too.

Correctness: as the array is sorted in non decreasing order, if i < j and j < k and ar[i]+20 > ar[k] then ar[j]+20 > ar[k] too. So we don't need to check for these items those are already checked for previous item.

Upvotes: 2

Related Questions