Reputation: 1189
I was solving bigSorting() problem from hackerrank:
Consider an array of numeric strings where each string is a positive number with anywhere from
1
to10^6
digits. Sort the array's elements in non-decreasing, or ascending order of their integer values and return the sorted array.
I know it works as follows:
def bigSorting(unsorted):
return sorted(unsorted, key=int)
But I didnt guess this approach earlier. Initially I tried below:
def bigSorting(unsorted):
int_unsorted = [int(i) for i in unsorted]
int_sorted = sorted(int_unsorted)
return [str(i) for i in int_sorted]
However, for some of the test cases, it was showing time limit exceeded. Why is it so?
PS: I dont know exactly what those test cases were as hacker rank does not reveal all test cases.
Update
My code is following three steps:
str
to int
conversionint
to str
convertsionWhy I felt that my code in OP should also run is that its also O(n log n). But seems that mere same time complexity is not enough here? But then I wanted to know which conversion is avoided by sorted(unsorted, key=int)
?
Q1. The third one, int
to str
conversion?
Q2. If Yes is the answer to Q1, then is it the only reason why my code doesnt work and onliner-sorted
-solution works? That is, is there anything else that my code is doing extra above onliner-sorted
-solution?
Q3. Also wont onliner-sorted
-solution do str
to int
conversion multiple times for the same number during sorting? That is, everytime it has to compare a specific number, does it have to convert it from str
to int
? Or sorted
is implemented in such a way that it has to convert each element from str
to int
only once?
Upvotes: 2
Views: 661
Reputation: 11238
Here you are converting list of numerical string to list of integer and vice a versa.
for each conversion of string to integer ie '123456'
to 123456
time is consumed and for every 10x in strig size conversion is increased by 100x time . proofed by @kellybundy in his post
adding code and run time if in case link disappear
from timeit import timeit
print(timeit('int("9"*10**5)', number=1))
print(timeit('int("9"*10**6)', number=1))
has run time
0.0568113109911792
5.422084125995752
so we can say time complexity of your conversion (string to integer)is O(nnm), where n is size of string, m is size of list.
This is where your code is taking time (conversion 2 time apart from sorting).
what you can do apart from orignal solution (sorted(unsorted, key = int)
use python sorting of string numerical, where it sort them in lexical order ie 0,1,2,3,4
def bigSorting(unsorted):
return sorted(unsorted, key= lambda x:(len(x), x))
here code doesn't need to convert the element and then sort, but just sort only. so it will take NlogN time
Now a much much faster solution by @kellybundy (he should add this as solution of his own )
def bigSorting(unsorted):
return sorted(unsorted, key= lambda x:int(x, 16))
Upvotes: 2
Reputation: 13242
In your code you:
list
of ints
converting str -> int
.list
of those ints
.list
of str
, converting int --> str
.That's three completely new (possibly massive) lists with multiple conversions.
Using the key=int
sorting argument performs the necessary conversions internally during the sort process itself, which means there is never the costly conversion from int -> str
.
We can supply Hacker Rank with just this code, and it passes all tests:
import os
if __name__ == '__main__':
fptr = open(os.environ['OUTPUT_PATH'], 'w')
n = int(input())
result = sorted((input() for _ in range(n)), key=int)
fptr.write('\n'.join(result))
fptr.write('\n')
fptr.close()
Still not 'optimal' perhaps, but from reading through the comments, this does seem to be an expected option. Those performing bucket/merge sorts didn't see much performance gains.
Upvotes: 1