Code Review Asked by Nishil Patel on October 28, 2020
My mergesort implementation currently takes a very long time, for a list with 1 million elements it takes over 130 seconds to get the list sorted.
Could someone kindly have a look at my code and suggest what could I do to improve it?
Is there anything particular in my code that is taking significantly long?
def splitlist(L): #splits the list to a list of individual listed elements (e.g. [1,2] to [[1],[2]])
envelop = lambda x: [x]
return(list(map(envelop,L)))
def merge(L_1,L_2): #merges two (already sorted) lists to one sorted list
N = []
while len(L_1) > 0 and len(L_2) > 0:
if L_1[0] > L_2[0]:
N += [L_2.pop(0)]
else:
N += [L_1.pop(0)]
if len(L_1) == 0:
N += L_2
else:
N += L_1
return(N)
#performs one round of pairwise merges (e.g. [[2],[1],[4],[3]] to [[1,2],[3,4]]), or [[5,10],[1,8],[2,3]] to [[1,2,3,5,8,10]])
def mergelist(L):
N = []
if len(L) % 2 == 0:
for i in range(0,len(L)//2):
N += [merge(L[2*i],L[2*i + 1])]
else:
for i in range(0,len(L)//2 - 1):
N += [merge(L[2*i],L[2*i + 1])]
N += [merge(merge(L[-3],L[-2]),L[-1])]
return(N)
def mergesort(L): #recursively performs mergelist until there is only 1 sorted list remaining
L = splitlist(L)
while len(L) > 1:
L = mergelist(L)
return(L[0])
Here is my code for generating the million elements:
rlist = random.sample(range(0,2000000),1000000)
The pop(0)
takes linear time. Do that differently, in O(1) time. The standard way uses index variables. See this question's answers for some more pythonic ways. Or you could merge from right to left, using pop()
, and then in the end reverse()
the result.
One way to do the latter:
def merge(L1, L2):
"""Merges two (already sorted) lists to one sorted list."""
N = []
while L1 and L2:
L = L1 if L1[-1] > L2[-1] else L2
N.append(L.pop())
N.reverse()
N[:0] = L1 or L2
return N
Other changes I did and which you can apply in the other parts of your code as well:
L
, that's what PEP 8 says. And then I kept N
for consistency. Usually I'd use result
or maybe merged
. Don't know why you chose N
. If you have a meaningful word that starts with "n", then I suggest using that.len(L_1) > 0
with the normal L1
non-emptiness check.N += [x]
with the normal N.append(x)
.Just another way, replacing that one "long" line to determine L
with a clearer but slower way:
def merge(L1, L2):
"""Merges two (already sorted) lists to one sorted list."""
N = []
def last(L):
return L[-1]
while L1 and L2:
L = max(L2, L1, key=last)
N.append(L.pop())
N.reverse()
N[:0] = L1 or L2
return N
For some fun, two list comprehension hacks:
def merge(L1, L2):
"""Merges two (already sorted) lists to one sorted list."""
def end(L):
return L[-1:]
return [max(L2, L1, key=end).pop() for _ in L1 + L2][::-1]
def merge(L, R):
"""Merges two (already sorted) lists to one sorted list."""
return [(R, L)[L[-1:] > R[-1:]].pop() for _ in L + R][::-1]
And I don't want to leave without a much faster way:
def merge(L1, L2):
"""Merges two (already sorted) lists to one sorted list."""
return sorted(L1 + L2)
That's O(n) because of Timsort. And fast O(n) because of C code. If you think using the mighty sorted
inside a mergesort defeats the purpose of writing the mergesort in the first place: Even that can be meaningful, if you're not just doing mergesort. At least once I wrote a mergesort with embedded counting of something, and I indeed used sorted
just for the merging. Because that made my solution faster/shorter/simpler.
Even more efficient (both space and time):
def merge(L1, L2):
"""Merges two (already sorted) lists to one sorted list."""
L1 += L2
L1.sort()
return L1
(If L2
can be longer than L1
, it might be advantageous to insert L1
into L2
instead.)
Correct answer by superb rain on October 28, 2020
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP