Reputation: 16189
Given:
x = [1.23, 2.0, 3.45, 4.1]
then:
middle = numpy.median(x)
Were the size of list x
odd, I could use x[x.index(middle)-1]
and x[x.index(middle)+1]
to get the two numbers either side of the middle. This won't work in the above case, since the median is not present in x
. Is there is standard approach that can handle both even-sized and odd-sized lists?
Upvotes: 3
Views: 1899
Reputation: 5658
You can use the bisect
module:
x = [1.23, 2.0, 3.45, 4.1]
def select_near_median(list_):
# gets two numbers either side of median
# in a sorted list
# bisect finds the index where median number
# can be inserted
import bisect
import statistics
list_ = sorted(list_)
med = statistics.median(list_)
ind = bisect.bisect_left(list_, med)
if med == list_[ind]:
left = ind
right = left + 1
else:
left = ind - 1
right = ind
return list_[left], list_[right]
print(select_near_median([1,2,3,4]))
(2, 3)
print(select_near_median([1,2,3,4,5]))
(3, 4)
print(select_near_med(x))
(2.0, 3.45)
Upvotes: 0
Reputation: 180391
To get the median you have to have a sorted list so this is a simple math problem, if the list length is uneven you want the halfway point - 1
and the halfway point + 1
, if the list length is even the median is the average of the two middle numbers so you want those two middle numbers.
def get_two(l):
ln = len(l)
half = ln // 2
return x[half-1], x[half + ln % 2]
Upvotes: 2
Reputation: 35998
By definition, a median is a value that divides your samples into 2 halves.
The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one (e.g., the median of {3, 3, 5, 9, 11} is 5). If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values (the median of {3, 5, 7, 9} is (5 + 7) / 2 = 6).
So, you need to
Possible approaches include:
numpy.partition
works (which numpy.median
uses internally)So, basically, what you need is (altered code from the linked source for your 1-D case):
if sz % 2 == 0:
part = partition(a, ((sz // 2) - 1, sz // 2))
else:
part = partition(a, (sz - 1) // 2)
then retrieve the corresponding elements.
Note, however, if you strive for efficiency, that there's quite an overhead converting data into ndarray
.
Upvotes: 1
Reputation: 10588
These are the numbers you are looking for:
x[(len(x)-1)/2 - len(x)%2], x[len(x)/2 + len(x)%2]
Upvotes: 3
Reputation: 1230
If the input list is unsorted (say x = [1.23, 3.45, 4.1, 2.0]
) then you want to iterate through and find the two quantities of interest (this would also work for sorted inputs of course). Something like this:
largestSmallerThanMedian = x[0]
smallestLargerThanMedian = x[len(x)-1]
for n in x:
if (n < middle) and (n >= largestSmallerThanMedian):
largestSmallerThanMedian = n
if (n > middle) and (n <= smallestLargerThanMedian):
smallestLargerThanMedian = n
And then largestSmallerThanMedian
and smallestLargerThanMedian
would have the two quantities of interest.
Upvotes: 1
Reputation: 13087
Assuming that the length of the sorted input list is N
then it seems to me that you want access elements N/2-1
and (N+1)/2
(assuming integer division), i.e.,
[1.23, 2.0, 3.45, 4.1] => N = 4 thus N/2-1 = 1 and (N+1)/2 = 2
[1.23, 2.0, 3.45, 4.1, 5.6] => N = 5 thus N/2-1 = 1 and (N+1)/2 = 3
Upvotes: 3