Reputation: 79
I am trying to solve a problem which asks for merging two sorted linked lists. Although my code passes most tests, it does not do well with large inputs, very long linked lists. What can i improve? Thanks! This is my code:
#
# For your reference:
#
# SinglyLinkedListNode:
# int data
# SinglyLinkedListNode next
#
#
def mergeLists(head1, head2):
if head1 is None:
return head2
if head2 is None:
return head1
p1=head1
p2=head2
while head1.data >= p2.data:
nxt = p2.next
p2.next = head1
head1 = p2
p2 = nxt
p1 = head1
while p1 is not None and p2 is not None:
if p1.data <= p2.data and (p1.next is None or p1.next.data >= p2.data):
next2 = p2.next
p2.next = p1.next
p1.next = p2
p2 = next2
p1 = p1.next
return head1
Upvotes: 0
Views: 130
Reputation: 1285
def mergeLists(head1, head2):
if not head1:
return head2
elif not head2:
return head1
if head1.data <= head2.data:
res = head1
res.next = mergeLists(head1.next, head2)
else:
res = head2
res.next = mergeLists(head1, head2.next)
return res
Upvotes: 0
Reputation: 408
It looks like you are do a lot of duplicate assignments, as well as duplicate checks. I tried to write a code with a minimal amount of both. Of course, if the goal was to write easy-to-read code, then a recursive algorithm would be the best solution. But not in our case. Also note that it makes no sense to check the sentinel on two lists at the same time, since the length of the lists is unknown and can be different at any time.
class L(object):
__slots__ = 'data', 'next'
def __repr__(self): # just for tests
return f'{self.data!r}, {self.next!r}'
def __eq__(self, other): # just for tests
return (self.data == other.data) and (self.next == other.next)
def merge0(a: L, b: L, sentinel=None):
if a is sentinel:
return b
if b is sentinel:
return a
if a.data > b.data:
head = last = b
b = b.next
else:
head = last = a
a = a.next
while True:
if a.data > b.data:
last.next = last = b
if (b := b.next) is sentinel:
last.next = a
return head
else:
last.next = last = a
if (a := a.next) is sentinel:
last.next = b
return head
The code above operates with 6 main pointers:
a
- the head of first linked list. a.next
can be sentinel
;b
- the head of second linked list. b.next
can be sentinel
;head
- the head of new merged linked list. It is either a
or b
. Not useful during the main loop;last
- the current last object of new merged linked list. Initially is head
;last.next
- this pointer is undefined during the main loop. It can be a
, b
, a.next
or b.next
;sentinel
- unique singleton object that marks the end of any linked list.Each step of the loop, we find out smaller element and assign it to last.next
. Then we create new last
and check out possible end of the list with this element. Then the loop repeats.
However, this is still not an optimal algorithm. Imagine that we have ll1(0, 2, 3, 5, 5, 7, 8)
, ll2(6, 7, 9, 10)
and merged ll(0, 2, 3, 5, 5, 6, 7, 7, 8, 9, 10)
. In fact, the first 5 elements of ll
are the five elements of the ll1
. According to the merge0
main loop, each of these elements (except the first) has been reassigned to its left neighbor. The inefficiency of this approach is especially notable when merged lists have a large spread in values.
Secondly, it would be nice to replace the explicit check of sentinel of each list. The sentinel
object doesn't have to be None
and may be implemented as a custom object with a native exception throwing mechanism.
Improved version:
@object.__new__
class sentinel(object):
__slots__ = ()
@property
def next(self):
raise self.Signal()
@property
def data(self):
raise self.Signal()
class Signal(Exception):
pass
def __eq__(self, other): # just for tests
return (other is self) or (other is None)
def merge1(a: L, b: L, sentinel=sentinel):
if a is sentinel:
return b
if b is sentinel:
return a
if a.data <= b.data:
head = x = a
y = b
else:
head = x = b
y = a
try:
while True:
while (x := (last := x).next).data <= y.data:
pass
last.next = y
while (y := (last := y).next).data <= x.data:
pass
last.next = x
except sentinel.Signal:
last.next = y if (x is sentinel) else x
return head
Some test-related code:
def random_sorted_ll(length, min_num, span, sentinel=None):
"""creates sorted linked list of random integers"""
assert length >= 1
from random import randint
from itertools import repeat
head = last = L()
head.data = d = randint(min_num, min_num + span)
ll_factory = map(L.__new__, repeat(L, length-1))
for l in ll_factory:
last.next = last = l
l.data = d = randint(d, d+span)
last.next = sentinel
# last.data should be close to (length * span / 2 + min_num)
return head
def copy_ll(head, prev_sentinel=None, new_sentinel=None):
"""creates deepcopy of given linked list head with new sentinel"""
if head is prev_sentinel:
return new_sentinel
head_copy = last = L()
last.data = head.data
head = head.next
while head is not prev_sentinel:
last.next = last = L()
last.data = head.data
head = head.next
last.next = new_sentinel
return head_copy
Some tests:
>>> j0 = random_sorted_ll(length=100, min_num=4, span=5)
>>> j1 = copy_ll(j0)
>>> j2 = copy_ll(j0, prev_sentinel=None, new_sentinel=sentinel)
>>> k0 = random_sorted_ll(length=111, min_num=4, span=5)
>>> k1 = copy_ll(k0)
>>> k2 = copy_ll(k0, prev_sentinel=None, new_sentinel=sentinel)
>>> (j0 == j1 == j2) and (k0 == k1 == k2)
True
>>> m0 = mergeLists(j0, k0)
>>> m1 = merge0(j1, k1)
>>> m2 = merge1(j2, k2)
>>> m0 == m1 == m2
True
>>> (m2 is j2) or (m2 is k2)
True
Comparisons:
def compare_speed(len0, len1, mn0, mn1, span0, span1):
print(f'\nll in an approximate range ({mn0:_d}, {len0*span0//2+mn0:_d}) with {len0=:_d}')
j0 = random_sorted_ll(len0, mn0, span0)
j1 = copy_ll(j0)
j2 = copy_ll(j0, None, sentinel)
print(f'll in an approximate range ({mn1:_d}, {len1*span1//2+mn1:_d}) with {len1=:_d}')
k0 = random_sorted_ll(len1, mn1, span1)
k1 = copy_ll(k0)
k2 = copy_ll(k0, None, sentinel)
from timeit import timeit
print('mergeLists:', timeit((lambda: mergeLists(j0, k0)), number=1), 'sec')
print('merge0:', timeit((lambda: merge0(j1, k1)), number=1), 'sec')
print('merge1:', timeit((lambda: merge1(j2, k2)), number=1), 'sec')
compare_speed(1_111_111, 1_000_000, 0, 0, 5, 5)
compare_speed(1_111_111, 1_000_000, 0, -25_000_000, 5, 100)
compare_speed(9_000_000, 9_111_111, 0, 0, 5, 5)
My results:
ll in an approximate range (0, 2_777_777) with len0=1_111_111
ll in an approximate range (0, 2_500_000) with len1=1_000_000
mergeLists: 0.46664454099999997 sec
merge0: 0.2180013780000003 sec
merge1: 0.16215903499999996 sec
ll in an approximate range (0, 2_777_777) with len0=1_111_111
ll in an approximate range (-25_000_000, 25_000_000) with len1=1_000_000
mergeLists: 0.4489969129999736 sec
merge0: 0.16649408400007815 sec
merge1: 0.1128580580000289 sec
ll in an approximate range (0, 22_500_000) with len0=9_000_000
ll in an approximate range (0, 22_777_777) with len1=9_111_111
mergeLists: 4.245676940000003 sec
merge0: 1.9906823259999982 sec
merge1: 1.548923153000004 sec
Upvotes: 1
Reputation: 27211
Here's how we do a merge operation by constructing a new list rather than modifying existing nodes:
def merge(p1, p2):
while p1 is not None and p2 is not None:
if p1.data < p2.data:
yield p1.data
p1 = p1.next
else:
yield p2.data
p2 = p2.next
while p1 is not None:
yield p1.data
p1 = p1.next
while p2 is not None:
yield p2.data
p2 = p2.next
def mergeLists(p1, p2):
return list(merge(p1, p2))
(Instead of yield
, you can also .append()
to a temporary list.)
If you must modify the nodes, you'll need to be a bit more careful that you don't modify anything you shouldn't be. Try using the method I proposed as a template.
Upvotes: 0