Reputation: 748
The pseudocode:
    S = {};
    loop 10000 times:
        u = unsorted_fixed_size_array_producer();
        S = sort(S + u);
I need an efficient implementation of sort that takes a sorted array and an unsorted one and sorts them together. We know that after a few iterations size(S) will be much bigger than size(u); that's a prior.
Update: There's another prior: the size of u is known, say 10 or 20, and the number of loop iterations is also known.
Update: I implemented the algorithm that @Dukelnig advised in C https://gist.github.com/blackball/bd7e5619a1e83bd985a3 which fits my needs. Thanks!
Upvotes: 1
Views: 197
Reputation: 55609
Sort u, then merge S and u.
Merging simply involves iterating through the two sorted arrays at the same time: at each step, pick the smaller of the two current elements and advance that array's iterator.
The running time is O(|u| log |u| + |S|).
This is very similar to the merge step of merge sort, so the argument that it produces a sorted array can be derived from there.
Some Java code for the merge, derived from Wikipedia (the C code wouldn't look all that different):
    static void merge(int S[], int u[], int newS[])
    {
        int iS = 0, iu = 0;
        for (int j = 0; j < S.length + u.length; j++)
            if (iS < S.length && (iu >= u.length || S[iS] <= u[iu]))
                newS[j] = S[iS++]; // Increment iS after using it as an index
            else
                newS[j] = u[iu++]; // Increment iu after using it as an index
    }
This can also be done in-place (in S, assuming it has enough additional space) by going from the back.
Here's some working Java code that does this:
    import java.util.Arrays; // needed for Arrays.toString below

    // Merges the sorted array u into the first SLength elements of S,
    // filling S from the back so that no element is overwritten too early.
    static void mergeInPlace(int S[], int SLength, int u[])
    {
        int iS = SLength-1, iu = u.length-1;
        for (int j = SLength + u.length - 1; j >= 0; j--)
            if (iS >= 0 && (iu < 0 || S[iS] >= u[iu]))
                S[j] = S[iS--];
            else
                S[j] = u[iu--];
    }
    public static void main(String[] args)
    {
        int[] S = {1,5,9,13,22, 0,0,0,0}; // 4 additional spots reserved here
        int[] u = {0,10,11,15};
        mergeInPlace(S, 5, u);
        // prints [0, 1, 5, 9, 10, 11, 13, 15, 22]
        System.out.println(Arrays.toString(S));
    }
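Putting the pieces together for the loop in the question, a minimal sketch might look like the following. It assumes the producer can be modelled as an array of batches (the names BatchMergeSketch, sortAll and batches are hypothetical), and it uses the two priors from the question: since the batch size and the number of iterations are known, S can be allocated once at its final capacity. The merge loop here also stops as soon as u is exhausted, because the rest of S is then already in place.

```java
import java.util.Arrays;

class BatchMergeSketch {
    // Merges the sorted batch u into S[0..sLength), working from the back.
    // S must have room for sLength + u.length elements. Returns the new length.
    static int mergeInPlace(int[] S, int sLength, int[] u) {
        int iS = sLength - 1, iu = u.length - 1;
        // Once iu < 0, the remaining elements of S are already in position.
        for (int j = sLength + u.length - 1; iu >= 0; j--) {
            if (iS >= 0 && S[iS] >= u[iu]) {
                S[j] = S[iS--];
            } else {
                S[j] = u[iu--];
            }
        }
        return sLength + u.length;
    }

    // Hypothetical driver: "batches" stands in for the producer in the question.
    // Assumes every batch holds at most batchSize elements.
    static int[] sortAll(int[][] batches, int batchSize) {
        int[] S = new int[batches.length * batchSize]; // capacity known up front
        int sLength = 0;
        for (int[] batch : batches) {
            int[] u = batch.clone();
            Arrays.sort(u);                        // O(|u| log |u|)
            sLength = mergeInPlace(S, sLength, u); // O(|S| + |u|) in the worst case
        }
        return Arrays.copyOf(S, sLength);
    }

    public static void main(String[] args) {
        int[][] batches = { {5, 1, 9}, {4, 8, 2}, {7, 3, 6} };
        // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(Arrays.toString(sortAll(batches, 3)));
    }
}
```

Allocating S once avoids reallocating and copying it on every iteration, which matters more and more as S grows relative to u.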
To reduce the number of comparisons, we can also use binary search (although the time complexity would remain the same - this can be useful when comparisons are expensive).
    // returns the index of the first element in S before SLength that is
    // greater than value, or returns SLength if no such element exists
    static int binarySearch(int S[], int SLength, int value) { ... }
    static void mergeInPlaceBinarySearch(int S[], int SLength, int u[])
    {
        int iS = SLength-1;
        int iNew = SLength + u.length - 1;
        for (int iu = u.length-1; iu >= 0; iu--)
        {
            if (iS >= 0)
            {
                // move every remaining element of S greater than u[iu] to the back
                int index = binarySearch(S, iS+1, u[iu]);
                for ( ; iS >= index; iS--)
                    S[iNew--] = S[iS];
            }
            S[iNew--] = u[iu];
        }
        // Here iS == iNew: every element of u has been placed, so the
        // remaining elements S[0..iS] are already in their final positions.
    }
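The body of binarySearch is left out above. One possible way to fill it in, matching the contract in its comment, is an "upper bound" style binary search; this is only a sketch, and the class name UpperBoundSketch is made up for the example.

```java
class UpperBoundSketch {
    // Returns the index of the first element in S[0..sLength) that is
    // greater than value, or sLength if no such element exists.
    // Assumes S[0..sLength) is sorted in ascending order.
    static int binarySearch(int[] S, int sLength, int value) {
        int lo = 0, hi = sLength; // the answer always lies in [lo, hi]
        while (lo < hi) {
            int mid = (lo + hi) >>> 1; // unsigned shift avoids int overflow
            if (S[mid] <= value) {
                lo = mid + 1; // S[mid] is not greater, so look to the right
            } else {
                hi = mid;     // S[mid] is greater, so it is a candidate
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        int[] S = {1, 5, 9, 13, 22};
        System.out.println(binarySearch(S, 5, 10)); // prints 3
        System.out.println(binarySearch(S, 5, 22)); // prints 5
    }
}
```

Returning the position after any elements equal to value means equal elements of S stay in front of the newly inserted one.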
If S doesn't have to be an array
The above assumes that S has to be an array. If it doesn't, something like a self-balancing binary search tree might be better, depending on how large u and S are.
The running time per batch would be O(|u| log |S|) - just substitute some values to see which is better.
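As a rough illustration of the tree-based alternative, here is a sketch that uses java.util.TreeMap (a red-black tree) as a multiset by mapping each value to its count. The class TreeMultisetSketch and its method names are invented for this example; whether it beats the array merge depends on the actual sizes and should be measured.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

class TreeMultisetSketch {
    // value -> number of occurrences; TreeMap keeps keys in sorted order
    private final TreeMap<Integer, Integer> counts = new TreeMap<>();
    private int size = 0;

    // Inserts an unsorted batch: O(|u| log |S|), no sorting of u required.
    void addBatch(int[] u) {
        for (int x : u) {
            counts.merge(x, 1, Integer::sum);
            size++;
        }
    }

    // Flattens the tree into a sorted array in O(|S|).
    int[] toSortedArray() {
        int[] out = new int[size];
        int i = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            for (int c = 0; c < e.getValue(); c++) {
                out[i++] = e.getKey();
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMultisetSketch S = new TreeMultisetSketch();
        S.addBatch(new int[] {5, 1, 9});
        S.addBatch(new int[] {1, 4});
        // prints [1, 1, 4, 5, 9]
        System.out.println(Arrays.toString(S.toSortedArray()));
    }
}
```

For example, with |u| = 20 and |S| = 100000, |u| log |S| is roughly 20 * 17 = 340 steps per batch, while the array merge touches on the order of |S| elements in the worst case.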
Upvotes: 3
Reputation: 13259
Say we have a big sorted list of size n and a little sorted list of size k.
Binary search, starting from the end (position n-1, n-2, n-4, etc.) for the insertion point for the largest element of the smaller list. Shift the tail end of the larger list k elements to the right, insert the largest element of the smaller list, then repeat.
So if we have the lists [1,2,4,5,6,8,9] and [3,7], we will do:
[1,2,4,5,6, , ,8,9]
[1,2,4,5,6, ,7,8,9]
[1,2, ,4,5,6,7,8,9]
[1,2,3,4,5,6,7,8,9]
But I would advise you to benchmark just concatenating the lists and sorting the whole thing before resorting to interesting merge procedures.
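The procedure described above could be sketched in Java roughly as follows. The names GallopMergeSketch, gallopFromEnd and mergeFromBack are hypothetical; the search probes positions end-1, end-2, end-4 and so on until it brackets the insertion point, then finishes with an ordinary binary search inside that bracket.

```java
import java.util.Arrays;

class GallopMergeSketch {
    // Returns the index of the first element in S[0..end) greater than value,
    // or end if there is none, probing backwards from the end with doubling steps.
    static int gallopFromEnd(int[] S, int end, int value) {
        long step = 1;  // long so that doubling cannot overflow
        int hi = end;   // every element in S[hi..end) is known to be > value
        while (step <= end && S[(int) (end - step)] > value) {
            hi = (int) (end - step);
            step *= 2;
        }
        // Either we ran off the front, or S[end - step] <= value.
        int lo = step <= end ? (int) (end - step) + 1 : 0;
        while (lo < hi) { // plain binary search inside the bracket [lo, hi]
            int mid = (lo + hi) >>> 1;
            if (S[mid] <= value) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    // Merges sorted u into S[0..n); S needs capacity n + u.length.
    static int mergeFromBack(int[] S, int n, int[] u) {
        int end = n; // S[0..end) has not been shifted yet
        for (int k = u.length; k > 0; k--) {
            int value = u[k - 1]; // largest remaining element of u
            int pos = gallopFromEnd(S, end, value);
            // shift the tail S[pos..end) right by the k elements still to insert
            System.arraycopy(S, pos, S, pos + k, end - pos);
            S[pos + k - 1] = value;
            end = pos;
        }
        return n + u.length;
    }

    public static void main(String[] args) {
        int[] S = {1, 2, 4, 5, 6, 8, 9, 0, 0}; // 2 spare slots at the end
        mergeFromBack(S, 7, new int[] {3, 7});
        // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(Arrays.toString(S));
    }
}
```

Searching from the end is cheap when the new elements tend to land near the tail of the big list; if they are spread uniformly, a plain binary search does about as well.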
Upvotes: 0
Reputation:
So if the size of S is much larger than the size of u, isn't what you want simply an efficient sort for a mostly sorted array? Traditionally this would be insertion sort. But you will only know the real answer by experimentation and measurement - try different algorithms and pick the best one. Without actually running your code (and perhaps more importantly, with your data), you cannot reliably predict performance, even with something as well studied as sorting algorithms.
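When measuring, it helps to have the simplest possible baseline to compare against: append u to S and sort everything. A minimal sketch of that baseline (the class and method names are made up for this example) could be:

```java
import java.util.Arrays;

class ConcatSortBaseline {
    // Returns a new sorted array containing all elements of S and u.
    static int[] concatAndSort(int[] S, int[] u) {
        int[] out = Arrays.copyOf(S, S.length + u.length); // S followed by zeros
        System.arraycopy(u, 0, out, S.length, u.length);   // overwrite zeros with u
        Arrays.sort(out);
        return out;
    }

    public static void main(String[] args) {
        int[] S = {1, 5, 9};
        int[] u = {4, 0};
        // prints [0, 1, 4, 5, 9]
        System.out.println(Arrays.toString(concatAndSort(S, u)));
    }
}
```

Timing this against a hand-written merge on realistic sizes of S and u shows whether the extra code is worth it.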
Upvotes: 0
Reputation: 320481
If you really really have to use a literal array for S at all times, then the best approach would be to individually insert the new elements into the already sorted S. I.e. basically use the classic insertion sort technique for each element in each new batch. This will be expensive in the sense that insertion into an array is expensive (you have to move the elements), but that's the price of having to use an array for S.
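A minimal sketch of this per-element insertion, assuming S has spare capacity at the end for the new batch (the names InsertBatchSketch and insertBatch are hypothetical):

```java
import java.util.Arrays;

class InsertBatchSketch {
    // Inserts each element of the unsorted batch u into the sorted prefix
    // S[0..sLength), one at a time. Returns the new length.
    // S must have room for sLength + u.length elements.
    static int insertBatch(int[] S, int sLength, int[] u) {
        for (int x : u) {
            int i = sLength - 1;
            // shift larger elements one slot to the right to open a gap for x
            while (i >= 0 && S[i] > x) {
                S[i + 1] = S[i];
                i--;
            }
            S[i + 1] = x;
            sLength++;
        }
        return sLength;
    }

    public static void main(String[] args) {
        int[] S = {1, 5, 9, 13, 22, 0, 0, 0, 0}; // 4 spare slots
        insertBatch(S, 5, new int[] {10, 0, 15, 11});
        // prints [0, 1, 5, 9, 10, 11, 13, 15, 22]
        System.out.println(Arrays.toString(S));
    }
}
```

Each insertion can move up to |S| elements, so a batch costs O(|u| * |S|) in the worst case.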
Upvotes: 0