Reputation: 33
I am trying to solve LeetCode problem 295. Find Median from Data Stream:
The median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value and the median is the mean of the two middle values.
- For example, for
arr = [2,3,4]
, the median is3
.- For example, for
arr = [2,3]
, the median is(2 + 3) / 2 = 2.5
.Implement the
MedianFinder
class:
MedianFinder()
initializes theMedianFinder
object.void addNum(int num)
adds the integernum
from the data stream to the data structure.double findMedian()
returns the median of all elements so far.
Answers within 10-5 of the actual answer will be accepted.Example 1:
[...]
MedianFinder medianFinder = new MedianFinder(); medianFinder.addNum(1); // arr = [1] medianFinder.addNum(2); // arr = [1, 2] medianFinder.findMedian(); // return 1.5 (i.e., (1 + 2) / 2) medianFinder.addNum(3); // arr[1, 2, 3] medianFinder.findMedian(); // return 2.0
For this question, I am adding numbers to a heap and then later use the smallest one to do some operations.
When I tried to do the operation, I found out that my heap returns a different value when doing heapq.heappop(self.small)
than when doing self.small[0]
.
Could you please explain this to me? Any hint is much appreciated.
(Every number in self.small
is added using heapq.heappush
)
Here is my code when it works:
class MedianFinder:
def __init__(self):
self.small, self.large = [], []
def addNum(self, num):
heapq.heappush(self.small, -1 * num)
if (self.small and self.large) and -1 * self.small[0] > self.large[0]:
val = -1 * heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.small) > len(self.large) + 1:
val = -1 * heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = -1 * heapq.heappop(self.large)
heapq.heappush(self.small, val)
def findMedian(self):
if len(self.small) > len(self.large):
return -1 * self.small[0]
elif len(self.small) < len(self.large):
return self.large[0]
else:
return (-1 * self.small[0] + self.large[0]) / 2
For the last line, if I change:
-1 * self.small[0] + self.large[0]
into:
-1 * heapq.heappop(self.small) + heapq.heappop(self.large)
then the tests fail.
Why would that be any different?
Upvotes: 0
Views: 62
Reputation: 350242
When you change -1 * self.small[0] + self.large[0]
into -1 * heapq.heappop(self.small) + heapq.heappop(self.large)
then it will still work the first time findMedian
is called, but when it is called again, it will return (in general) a different result. The reason is that with heappop
, you remove a value from the heap. This should not happen, as this changes the data that a next call of findMedian
will have to deal with. findMedian
is supposed to leave the data structure unchanged.
Note how the challenge says this:
double findMedian() returns the median of all elements so far.
I highlight the last two words. These indicate that findMedian
is not (only) called when the whole stream of data has been processed, but will be called several times during the processing of the data stream. That makes it crucial that findMedian
does not modify the data structure, and so heappop
should not be used.
Upvotes: 2