Reputation: 45
I'm trying to sort a list of dollar amounts from lowest to highest using Python's built-in sort, but when I call it, it sorts the numbers in a strange order. It starts at $10,000, then goes up to $19,000 (which is the highest), then jumps down to $2,000 and counts up from there, ostensibly because 2 is bigger than 1. I don't know how to correct for this. The code I've used is below.
numbers=[['$10014.710000000001'], ['$10014.83'],['$11853.300000000001'],
['$19060.010000000006'],['$2159.1099999999997'],['$3411.1400000000003']]
print(sorted(numbers))
Upvotes: 2
Views: 1142
Reputation: 2175
I needed to solve a slightly simpler variant of this problem, and I'm posting it in case it's useful to others.
I had a directory full of files:
filenames = [
'1.dcm', '10.dcm', '11.dcm',
'12.dcm', '13.dcm', '14.dcm',
'15.dcm', '16.dcm', '17.dcm',
'18.dcm', '19.dcm', '2.dcm',
'3.dcm', '4.dcm', '5.dcm',
'6.dcm', '7.dcm', '8.dcm',
'9.dcm'
]
This output from os.listdir() is not uncommon, but I wanted the files sorted in numerical order without needing leading 0s. In Linux, you might achieve this with ls | sort -h
In Python, you can sort files named without leading zeros, without relying on external libraries, using a lambda function in a single line that removes the additional text and casts to an int:
ordered_filenames = sorted(filenames, key=lambda x: int(x.replace('.dcm', '')))
This could be adjusted for the dollar problem (assuming dollar_amounts is a flat list of strings such as '$10014.83'):
ordered_dollar_amounts = sorted(
    dollar_amounts,
    key=lambda x: float(x.replace('$', ''))
)
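In the original question each amount sits inside its own one-element list, so the key has to index into that list before stripping the $. A minimal sketch, assuming that nested shape (the name nested_amounts and the shortened values are made up for illustration):

```python
# Hypothetical sample mirroring the nested shape from the question.
nested_amounts = [['$10014.83'], ['$2159.11'], ['$19060.01'], ['$3411.14']]

# x[0] pulls the string out of the inner list; lstrip('$') drops the
# leading dollar sign before the conversion to float.
ordered_nested = sorted(nested_amounts, key=lambda x: float(x[0].lstrip('$')))

print(ordered_nested)
```

The inner lists and their original strings are kept as they were; only the order changes.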
Upvotes: 0
Reputation: 60957
The key insight here is that the values in your list are actually strings, and strings are compared lexicographically: each character in the string is compared one at a time until the first non-matching character. So "aa" sorts before "ab", but that also means that "a1000" sorts before "a2". If you want to sort in a different way, you need to tell the sort method (or the sorted function) what it is you want to sort by.
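The character-by-character comparison can be checked directly in an interpreter. A quick sketch, with dollar strings shortened from the question's values for illustration:

```python
# Strings compare character by character, left to right.
print("aa" < "ab")      # True: the first difference is 'a' vs 'b'
print("a1000" < "a2")   # True: '1' < '2' decides it; length is ignored

# The same rule puts "$19060" before "$2159" when sorting strings.
by_string = sorted(["$2159", "$19060", "$10014"])
print(by_string)        # ['$10014', '$19060', '$2159']

# Sorting by the numeric value gives the intended order.
by_value = sorted(["$2159", "$19060", "$10014"], key=lambda s: int(s[1:]))
print(by_value)         # ['$2159', '$10014', '$19060']
```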
In this case, you should probably use the decimal module, and you want the key argument of the sort method. This sorts your existing list in place, using the converted values only during the sorting process.
import decimal
def extract_sortable_value(value):
    # value is a list, so take the first element
    first_value = value[0]
    return decimal.Decimal(first_value.lstrip('$'))
numbers.sort(key=extract_sortable_value)
Or, to get a new sorted list and leave the original untouched, you could do:
print(sorted(numbers, key=extract_sortable_value))
Demo: https://repl.it/repls/MiserableDarkPatches
Upvotes: 4
Reputation: 11242
Your numbers are currency values. So, as pointed out in the comments below, it might make sense to use Python's decimal module, which offers several advantages over the float datatype. (See link for further information.)
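One of those advantages, sketched briefly: Decimal values built from strings keep their decimal digits exactly, while binary floats usually cannot. The amounts here are arbitrary examples, not taken from the question:

```python
from decimal import Decimal

# Binary floats cannot represent most decimal fractions exactly.
float_total = 0.1 + 0.2
print(float_total == 0.3)               # False

# Decimal built from strings keeps the decimal digits exactly.
decimal_total = Decimal('0.1') + Decimal('0.2')
print(decimal_total == Decimal('0.3'))  # True
```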
If, however, this is only an exercise for getting to know Python better, as I suspect, you might look for a simpler solution:
Your sorting doesn't work because your numbers are stored as strings, each inside its own list within the outer list. You have to convert them to integers or floats before sorting has the effect you're looking for:
numbers=[
['$10014.710000000001'],
['$10014.83'],
['$11853.300000000001'],
['$19060.010000000006'],
['$2159.1099999999997'],
['$3411.1400000000003']
]
numbers_float = [float(number[0][1:]) for number in numbers]
numbers_float.sort()
print(numbers_float)
Which prints:
[2159.1099999999997, 3411.1400000000003, 10014.710000000001, 10014.83, 11853.300000000001, 19060.010000000006]
When you look at float(number[0][1:]), then [0] takes the first (and only) element of your (inner) number list, [1:] strips the $ sign and finally float does the conversion to a floating point number.
If you want the $ sign back:
for number in numbers_float:
    print("${}".format(number))
Which prints:
$2159.1099999999997
$3411.1400000000003
$10014.710000000001
$10014.83
$11853.300000000001
$19060.010000000006
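The long tails such as .710000000001 look like floating-point rounding artifacts. If the output is meant to read as currency, a format spec can round to two decimal places. A small sketch, assuming two decimals and a thousands separator are wanted (the sample reuses three of the values above):

```python
numbers_float = [2159.1099999999997, 10014.710000000001, 19060.010000000006]

# ',' adds a thousands separator and '.2f' rounds to two decimal places.
formatted = ["${:,.2f}".format(number) for number in numbers_float]
for line in formatted:
    print(line)
```

This only changes how the values are displayed; the floats in numbers_float are left as they were.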
Upvotes: 2
Reputation: 2468
You are not sorting numbers but strings, which explains the "weird" result. Instead, convert the values to float and sort the resulting list:
In [3]: sorted([[float(el[0][1:])] for el in numbers])
Out[3]:
[[2159.1099999999997],
[3411.1400000000003],
[10014.710000000001],
[10014.83],
[11853.300000000001],
[19060.010000000006]]
I need the el[0] because every number is inside its own list, which is not good style, but I guess you have your reasons for this. The [1:] strips away the $ sign.
EDIT: A really good point was made in the comments. A more robust solution:
from decimal import Decimal
import decimal
decimal.getcontext().prec = 4
sorted([Decimal(el[0][1:]) for el in numbers])
Out[8]:
[Decimal('2159.1099999999997'),
Decimal('3411.1400000000003'),
Decimal('10014.710000000001'),
Decimal('10014.83'),
Decimal('11853.300000000001'),
Decimal('19060.010000000006')]
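Note that the values in the output still carry all their digits even though prec is set to 4. The context precision applies to the results of arithmetic, not to constructing a Decimal from a string. A small sketch of the difference, using localcontext so the global context is left alone:

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 4
    constructed = Decimal('2159.1099999999997')  # stored exactly as written
    rounded = +constructed                       # unary plus applies the context

print(constructed)  # 2159.1099999999997
print(rounded)      # 2159
```

So for sorting alone the precision setting is not needed; it only matters once you start doing arithmetic on the values.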
Upvotes: 2