Paul Mathew

Reputation: 25

Sorting a DataFrame column by character code values

Below is the first column (index 0) of a DataFrame. I need to sort it by character code value: digits sort lowest, followed by uppercase letters, then lowercase letters.

I tried sort_values() and sorted(), but had no luck.


USOM042000D89UPC0001
USOM021007D19UPC0001
USOM04200AA0FUPC0001
063899500430
USOM00330AD6DUPC0001
USOM0030828C0UPC0001
USOM043002C05UPC0002
USOM042004F74UPC0001
068542802075

Upvotes: 0

Views: 98

Answers (1)

Andrew Eckart

Reputation: 1726

You can achieve this by sorting with a key that maps the built-in ord function over each character of the string:

>>> vals = """USOM042000D89UPC0001
... USOM021007D19UPC0001
... USOM04200AA0FUPC0001
... 063899500430
... USOM00330AD6DUPC0001
... USOM0030828C0UPC0001
... USOM043002C05UPC0002
... USOM042004F74UPC0001
... 068542802075""".splitlines()
>>> vals
['USOM042000D89UPC0001', 'USOM021007D19UPC0001', 'USOM04200AA0FUPC0001', '063899500430', 'USOM00330AD6DUPC0001', 'USOM0030828C0UPC0001', 'USOM043002C05UPC0002', 'USOM042004F74UPC0001', '068542802075']
>>> sorted(vals, key=lambda x: tuple(ord(c) for c in x))
['063899500430', '068542802075', 'USOM0030828C0UPC0001', 'USOM00330AD6DUPC0001', 'USOM021007D19UPC0001', 'USOM042000D89UPC0001', 'USOM042004F74UPC0001', 'USOM04200AA0FUPC0001', 'USOM043002C05UPC0002']
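As a small self-contained sketch of the same idea, the ord-based key can be wrapped in a helper and compared against Python's default string ordering, which already compares strings code point by code point. The helper name sort_by_code_point and the sample values are made up for illustration and are not from the question.

```python
def sort_by_code_point(values):
    """Return the values sorted by the code point of each character."""
    # Each string becomes a list of integers, and lists compare element by
    # element, so digits (48-57) sort before uppercase (65-90), which sort
    # before lowercase (97-122).
    return sorted(values, key=lambda s: [ord(c) for c in s])


# Hypothetical sample data mixing digits, uppercase and lowercase.
sample = ["b1", "B1", "1b", "a", "A", "10"]

by_ord = sort_by_code_point(sample)
by_default = sorted(sample)

print(by_ord)
print(by_ord == by_default)  # plain sorted() gives the same order here
```

Assuming the DataFrame is named df, the column can be passed in as a plain list with df[0].tolist().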

Upvotes: 1
