Reputation: 903
For example, in Java syntax, it will be something like:
List l = new ArrayList();
l.add(new Integer(2));
l.add(new Integer(3));
l.add(new Integer(6));
l.add(new Integer(9));
Random rand = new Random();
for (int i=0; i < n; i++) {
l.add(new Integer(rand.nextInt(1000)));
}
Collections.sort(l);
l.remove(0);
But it seems it's inefficient. Any better algorithm?
Upvotes: 2
Views: 1875
Reputation: 53964
Use a binary insert (works like a binary search) for the new value. Discard the smallest. Should be quite fast.
By the way - this can be implemented as a handy extension method:
private static int GetSortedIndex( this IList list, IComparer comparer, object item, int startIndex, int endIndex )
{
if( startIndex > endIndex )
{
return startIndex;
}
var midIndex = startIndex + ( endIndex - startIndex ) / 2;
return comparer.Compare( list[midIndex], item ) < 0 ?
GetSortedIndex( list, comparer, item, midIndex + 1, endIndex ) :
GetSortedIndex( list, comparer, item, startIndex, midIndex - 1 );
}
public static void InsertSorted( this IList list, IComparer comparer, object item )
{
list.Insert( list.GetSortedIndex( comparer, item ), item );
}
Java Equivalent
public static void main(String[] args)
{
List l = new ArrayList();
l.add(new Integer(2));
l.add(new Integer(3));
l.add(new Integer(6));
l.add(new Integer(9));
Random rand = new Random();
for (int i=0; i < 10; i++) {
Integer rnd = new Integer(rand.nextInt(1000));
int pos = Collections.binarySearch(l,rnd);
if(pos < 0) pos = ~pos;
l.add(pos,rnd);
}
System.out.println(l);
}
Upvotes: 13
Reputation: 533870
Here is another solution which merges the operations into just a search, an array copy and a value set. This avoid the need for sorting or loops.
public static <T extends Comparable<T>>
void insertAndRemoveSmallest(T[] array, T t) {
int pos = Arrays.binarySearch(array, t);
if (pos < 0) pos = ~pos;
// this is the smallest entry so no need to add it and remove it.
if (pos == 0) return;
pos--;
// move all the entries down one.
if (pos > 0) System.arraycopy(array, 1, array, 0, pos);
array[pos] = t;
}
This program
public static void main(String... args) {
Integer[] ints = {2, 3, 7, 6, 9};
System.out.println("Starting with " + Arrays.toString(ints));
for (int i : new int[]{5, 1, 10, 8, 8}) {
insertAndRemoveSmallest(ints, i);
System.out.println("After adding " + i + ": " + Arrays.toString(ints));
}
}
prints
Starting with [2, 3, 7, 6, 9]
After adding 5: [3, 5, 7, 6, 9]
After adding 1: [3, 5, 7, 6, 9]
After adding 10: [5, 7, 6, 9, 10]
After adding 8: [7, 6, 8, 9, 10]
After adding 8: [6, 8, 8, 9, 10]
Upvotes: 0
Reputation: 28069
Use a TreeSet instead of a List
, it'll maintain the order such that such that the largest value will always be at SortedSet#last(). If using 1.6+ you can use NavigableSet methods; pollLast() will return and remove the highest value.
NavigableSet<Integer> set = new TreeSet<Integer>();
//... setup data
Integer highest = set.pollLast();
set.add(rand.nextInt(1000));
Integer newHighest = set.pollLast();
Upvotes: 8
Reputation: 27492
A key question is whether you need to know the top 4 items AFTER EVERY NEW ITEM IS GENERATED, or if you only need the top 4 after all the items are generated. Also, is it literally 4 top items, or is that just an example or illustration?
Because if you really are generating thousands of values and only want the top 4, I'd think that comparing each new value to each of the existing 4 and discarding if less than all of them would be a lot faster than doing many sorts. That's just 4 compares for each new item, rather than the potentially much larger number to do repeated sorts.
Similarly if you only need the top N at the end of the process, it may well be faster to collect them all, sort, and then take the top N. But again, if most of the values are being eliminated, sorting the relative positions of the "losers" could be a big waste of time. If we only want the top 4, then whether an item is #5 or #10,382,842 is irrelevant.
Upvotes: 0
Reputation: 7388
The fastest algorithm I can think of would be to replace the smallest element with the new one, if needed, and push the new one to its proper place by repeatedly swapping with adjacent elements.
EDIT: The code assumes that the array is sorted in descending order, and thus the last element is the smallest.
void Insert(int[] array, int newValue)
{
// If the new value is less than the current smallest, it should be
// discarded
if (new_value <= array[array.length-1])
return;
array[array.length-1] = newValue;
for (int i = array.length-1; i > 0; --i)
{
if (newValue <= array[i-1])
break;
// Swap array[i] with array[i-1]
array[i] = array[i-1];
array[i-1] = newValue;
}
}
Upvotes: 2
Reputation: 87603
Your pseudocode inserts a set of new items N into sorted list A of size S and then discards the smallest item. Use Collections.binarySearch() to find the insertion point. [Read the note the performance impact if your List is does not support RandomAccess. ArrayList does support RandomAccess.]
List<Integer> l = new ArrayList<Integer>();
l.add(new Integer(2));
l.add(new Integer(3));
l.add(new Integer(6));
l.add(new Integer(9));
l.ensureCapacity(l.size()+n);
Random rand = new Random();
for (int i=0; i < n; i++) {
final Integer newInt = Integer.rand.nextInt(1000);
int insertPoint = Collections.binarySearch(l, newInt);
if (insertPoint < 0) insertPoint = -(insertPoint + 1);
l.add(insertPoint, newInt);
}
l.remove(0);
But, are you sure you wanted to discard just 1 item? Or did you mean to insert a set of new items N into sorted list A of size S and keep just the S largest items. In that case, keep track of the min value:
int min = l.get(0);
l.ensureCapacity(l.size()+n);
Random rand = new Random();
for (int i=0; i < n; i++) {
final Integer newInt = Integer.rand.nextInt(1000);
if (newInt > min) {
int insertPoint = Collections.binarySearch(l, newInt);
if (insertPoint < 0) insertPoint = -(insertPoint + 1);
l.add(insertPoint, newInt);
}
}
However, if N is large, you may be better off sorting N into an sorted array by itself, discarding the smaller of N(0) or A(0), and then merging the two sorted arrays together [left as an exercise for the reader].
If you end up using an actual array, see Arrays.binarySearch and System.arraycopy.
Upvotes: 2
Reputation: 147461
I'm quite surprised no-one has mentioned this yet... The data structure you are looking for is a priority queue. It is without doubt the most efficient way of accomplishing this task. A priority queue can be implemented using a number of different methods (see the linked article), but the most common is based on a binary heap. In the self-binary variety (which is quite typical), insertion and deletion both take O(log n)
time.
There seems to be a built-in generic class in the Java library, PriorityQueue<E>
, so it would seem you can use this directly. This type does not surprisingly seemed to be based on a heap data structure, though more specific than that I cannot say. It should be very appropiate for your use, in any case.
Upvotes: 5
Reputation: 23331
I'm not sure if the above example would work, what is n? and if you cycle through adding random #'s from 1 to 1,000 you'll always end up with 1000, 999, 998 and 997 - no? I don't think adding a # and then resorting each time is efficient- it would probably be quicker to check each of the four positions and replacing with a higher #.
A lot depends on how many random #'s you'll add, to few # adds and check each of the 4 positions a lot of # adds just assume you get the highest in the range.
Upvotes: 0
Reputation: 2702
Do you really need an online one item at a time algorithm? Or are you actually parsing a larger collection of data and just want the top n items? If it's the latter, look at partial qsort.
Upvotes: 0
Reputation: 49228
ShellSort
and Natural Mergesort
are very performant (< O(n logn)) on largely pre-sorted data.
Inserting into a sorted list with binary search
needs much more time since one update requires O(n) anyway.
Alternatively, you could use heap-datastructures.
Upvotes: 0
Reputation: 625387
This will keep the size at 4 and do what you want as I understand it.
SortedSet<Integer> set = new TreeSet<Integer>();
set.add(2);
set.add(3);
set.add(6);
set.add(9);
Random rand = new Random();
for (int i=0; i < n; i++) {
int i = rand.nextInt(1000);
set.remove(set.first());
set.add(i);
}
Upvotes: 1
Reputation: 48516
A very simple optimalization would be to compare the lowest value in the sorted array (should thus be the first item) with the new value before inserting it. If the new value is greater than this value, replace the element with the new value and then resort the array.
Upvotes: 3
Reputation: 51541
I don't know whether you can change the data structure, or what other operations you need to support, but a heap would be a better fit for the kind of operations you describe.
Upvotes: 1
Reputation: 24470
If you're working with an ArrayList, you can replace the last number in the array with the new number if the new number is larger before you sort the array.
The Java Collections.sort
uses merge sort which isn't the most efficient way of sorting in this situation. You want to use a binary search to find the insertion point and then shift all the subsequent numbers along by one.
EDIT: This can all be done with just an array like so:
public static int addDiscard(int[] list, int number)
{
if (number > list[list.length - 1])
{
int index = findInsertionIndex(list, number); // use binary search
for (int i = list.length - 1; i > index; i--)
{
list[i] = list[i - 1];
}
list[index] = number;
}
}
Upvotes: 1
Reputation: 8766
Use a min-heap to store the data, and after each insertion of a new random value, delete the min in O(1) time.
After n iterations, perform n extract-min's to get the sorted list.
Upvotes: 5
Reputation: 6128
You can use binary search to insert a value into a sorted array.
Upvotes: 1