User
User

Reputation: 67

Given an array of long numeric strings, sort them in ascending order

I'm trying to solve the HackerRank problem Big Sorting:

Problem Statement:

Consider an array of numeric strings where each string is a positive number with anywhere from 1 to 1000000 digits. Sort the array's elements in non-decreasing, or ascending order of their integer values and print each element of the sorted array on a new line.

Function Description

Complete the bigSorting function in the editor below. It should return the sorted string array.

bigSorting has the following parameter(s):

  • unsorted: an unsorted array of integers as strings

Input Format

The first line contains an integer N, denoting the number of strings in unsorted.

Each of the N subsequent lines contains an integer string unsorted[i].

Constraints

  • 1 <= N <= 200000
  • Each string is guaranteed to represent a positive integer without leading zeros.
  • The total number of digits across all strings in unsorted is between 1 and 1000000 (inclusive).

Output Format

Print each element of the sorted array on a new line.

My code:

n = int(raw_input())

unsorted = []

for _ in xrange(n):
    unsorted_item = raw_input()
    unsorted.append(unsorted_item)

result = sorted(unsorted)
print (result)

But it's giving a completely different result:

Input

6
31415926535897932384626433832795
1
3
10
3
5

Expected Output

1
3
3
5
10
31415926535897932384626433832795

Actual Output

1
10
3
3
31415926535897932384626433832795
5

Upvotes: 2

Views: 2215

Answers (2)

Anatolii
Anatolii

Reputation: 14670

What you’re doing is not enough because you need to sort by text length - shorter strings come first - and in the lexicographical order if strings are of the same length. So, a simple solution could be this:

n = int(input())
unsorted = []
for _ in range(n):
    unsorted_item = input()
    unsorted.append(unsorted_item)

for element in sorted(unsorted, key=int):
    print(element)

Upvotes: 4

kaya3
kaya3

Reputation: 51112

Anatolii's solution is certainly the simplest one, because these strings all represent numbers, and they should be sorted in numerical order.

Here's an alternative solution which doesn't require parsing the strings as ints: because they're all positive, then numbers with more digits are always larger, and numbers with equally many digits can be compared as strings. So, first we compare by string length, then strings of equal length are compared the way strings are usually compared.

We can use a tuple (len(s), s) as the comparison key; they are compared by their first component (string length), then by their second component (the string itself) as a tie-breaker.

result = sorted(unsorted, key=lambda s: (len(s), s))

This is a much more efficient solution on average, because parsing ints is slow, but comparing strings is usually fast because they usually differ within the first few digits. In the worst case, you have to compare every digit, but parsing as an int always requires looking at every digit.

I tried with lists of 1,000 strings representing numbers in the range 0 to 10^(10^4), i.e. numbers up to 10^4 digits long; the solution without int is over a hundred times faster:

>>> import random, timeit
>>> lst = [ str(random.randint(0, 10**10**4)) for i in range(1000) ]
>>> timeit.timeit(lambda: sorted(lst, key=lambda s: (len(s), s)), number=10)
0.01330024599974422
>>> timeit.timeit(lambda: sorted(lst, key=int), number=10)
4.78583431900006

Upvotes: 5

Related Questions