Reputation: 7881
I have many (hundreds of thousands, m) sets of doubles d, ~5-10 (n, constant small) long. These doubles are essentially randomly distributed. I need to get the median of each set: because m is very large, we need to calculate the median pretty quickly...these sets are pretty small though, so I think that is going to play a significant role in choosing how to do the median. I know I can use nth_element to get the median in O(n) with the selection algorithm, which I know I'm not going to beat in complexity. However, because of the small constant n, I am probably looking for the method that simply has the smallest overhead.
I have found a bunch of different ways to do the median (below) but am just curios if anyone knows the "correct" method to use here.
Min max heaps (O(n) build time, constant access, probably too much overhead)
This question from 2010 Which may be out of date (new STL/Boost code may already implement this stuff), also focuses more on time complexity than overhead.
Upvotes: 1
Views: 1199
Reputation: 1276
if it is std::set (you didnt answer to BoBTFish) then it is already sorted. Hence, you will get the median by iterating to n/2 which is always better or equal O(n), typically is should be O(ld n). n-th element will not help here.
Upvotes: 0
Reputation: 26259
This may not scale well to your data sizes, but it's a code snippet I found (can't remember where) and use in my image processing functions to get the median of 9 unsigned char pixels.
// optimised median search on 9 values
#define PIX_SWAP(a, b) { unsigned char uTemp = (a); (a) = (b); (b) = uTemp; }
#define PIX_SORT(a, b) { if ((a) > (b)) PIX_SWAP((a), (b)); }
unsigned char GetMedian9(unsigned char *pNine)
{
// nb - this is theoretically the fastest way to get the median of 9 values
PIX_SORT(pNine[1], pNine[2]); PIX_SORT(pNine[4], pNine[5]); PIX_SORT(pNine[7], pNine[8]);
PIX_SORT(pNine[0], pNine[1]); PIX_SORT(pNine[3], pNine[4]); PIX_SORT(pNine[6], pNine[7]);
PIX_SORT(pNine[1], pNine[2]); PIX_SORT(pNine[4], pNine[5]); PIX_SORT(pNine[7], pNine[8]);
PIX_SORT(pNine[0], pNine[3]); PIX_SORT(pNine[5], pNine[8]); PIX_SORT(pNine[4], pNine[7]);
PIX_SORT(pNine[3], pNine[6]); PIX_SORT(pNine[1], pNine[4]); PIX_SORT(pNine[2], pNine[5]);
PIX_SORT(pNine[4], pNine[7]); PIX_SORT(pNine[4], pNine[2]); PIX_SORT(pNine[6], pNine[4]);
PIX_SORT(pNine[4], pNine[2]); return(pNine[4]);
}
#undef PIX_SWAP
#undef PIX_SORT
EDIT - Ok, it's also referenced in this answer too
Upvotes: 1