Matt

Reputation: 3652

Why does Python sort put upper case items first?

I'm not looking for a workaround. I'm looking to understand why Python sorts this way.

>>> a = ['aaa','Bbb']
>>> a.sort()
>>> print(a)
['Bbb', 'aaa']

>>> a = ['aaa','bbb']
>>> a.sort()
>>> print(a)
['aaa', 'bbb']

Upvotes: 8

Views: 6508

Answers (3)

ShadowRanger

Reputation: 155536

str is sorted based on the raw byte values (Python 2) or Unicode ordinal values (Python 3); in ASCII and Unicode, the capital letters A-Z all have lower values than the lowercase letters a-z, so they sort before them:

>>> ord('A'), ord('Z')
(65, 90)
>>> ord('a'), ord('z')
(97, 122)
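
To make the comparison rule concrete, here is a small sketch (the word list and variable names are made up for illustration) that sorts the same strings once with the default ordering and once by their explicit lists of code points. Both should agree, since strings are compared character by character using those values:

```python
# Hypothetical sample data, chosen to mix upper and lower case.
words = ['aaa', 'Bbb', 'Zed', 'abc']

# Map each string to its sequence of Unicode code points.
code_points = {w: [ord(c) for c in w] for w in words}

# Default string ordering versus ordering by the code point lists.
by_default = sorted(words)
by_code_points = sorted(words, key=lambda w: code_points[w])

print(by_default)                    # ['Bbb', 'Zed', 'aaa', 'abc']
print(by_default == by_code_points)  # True
```

The first character that differs decides the result, so 'Zed' (90) lands before 'aaa' (97) even though Z is the last letter of the alphabet.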

Some locales (e.g. en_US) will change this sort ordering; if you pass locale.strxfrm as the key function, you'll get case-insensitive sorts on those locales, e.g.

>>> import locale
>>> locale.setlocale(locale.LC_COLLATE, 'en_US.utf-8')
>>> a = ['Bbb', 'aaa']
>>> a.sort(key=locale.strxfrm)
>>> a
['aaa', 'Bbb']

Upvotes: 6

farstop

Reputation: 11

Python treats uppercase letters as lower than lowercase letters. If you want to sort case-insensitively, you can do something like this:

a = ['aaa','Bbb']
a.sort(key=str.lower)
print(a)

Outputs:
['aaa', 'Bbb']

This ignores case. Passing str.lower as the key parameter is what makes this work: each string is compared by its lowercased form, while the original strings are kept in the result. The following documentation should help: https://docs.python.org/3/howto/sorting.html
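
As a rough side-by-side, assuming a made-up list of names, this sketch shows the default ordering next to the key=str.lower ordering, so the effect of the key function is visible on input where the two differ:

```python
# Hypothetical sample data with mixed case.
names = ['bravo', 'Alpha', 'delta', 'Charlie']

# Default: compares code points, so all capitalised names come first.
default_order = sorted(names)

# With a key: compares lowercased copies, but returns the original strings.
ignore_case_order = sorted(names, key=str.lower)

print(default_order)      # ['Alpha', 'Charlie', 'bravo', 'delta']
print(ignore_case_order)  # ['Alpha', 'bravo', 'Charlie', 'delta']
```

Note that sorted() returns a new list and leaves names unchanged, whereas a.sort() in the snippet above sorts in place.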

Upvotes: 1

mrid

Reputation: 5796

This is because uppercase characters have lower ASCII values than lowercase ones. Hence, if we sort them in increasing order, the uppercase characters come before the lowercase ones.

  • ASCII of A is 65
  • ASCII of a is 97

65<97

And hence A < a if you sort in increasing order

Upvotes: 11
