thomascrha

Reputation: 323

Sorting sets of non-Latin characters in the order of a string?

I am using the following code for sorting:

letters = '세븐일레븐'
old = [('세븐', 8), ('븐', 2), ('일', 5), ('레', 4)]
new = sorted(old, key=lambda x: letters.index(x[0]))

For non-Latin characters, the sorted output is the same as the input:

[('세븐', 8), ('븐', 2), ('일', 5), ('레', 4)]

What I'm expecting is:

[('세븐', 8), ('일', 5), ('레', 4), ('븐', 2)]

Upvotes: 0

Views: 243

Answers (2)

ShadowRanger

Reputation: 155323

Why do you expect '일' to sort before '븐'? '븐' is the second character in letters (index 1), and letters.index returns the position of the first match it finds, so the later occurrence at the end of the string is never considered.

If the goal is to treat specific sequences differently, you need to define letters as a list of the complete strings you care about, not a single flat str, e.g.:

letters = ['세븐', '일', '레', '븐']

Then the index call will treat '세븐' as separate from '븐', and you get the expected output ordering.
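As a minimal sketch of that change, using the data from the question: the `order` dict and `new_fast` name below are illustrative additions, not part of the original answer, and only avoid repeating the linear `list.index` scan for every element.

```python
letters = ['세븐', '일', '레', '븐']
old = [('세븐', 8), ('븐', 2), ('일', 5), ('레', 4)]

# Sort by each token's position in the list of complete strings,
# so '세븐' and '븐' are distinct entries with distinct positions.
new = sorted(old, key=lambda x: letters.index(x[0]))
print(new)  # [('세븐', 8), ('일', 5), ('레', 4), ('븐', 2)]

# Optional (hypothetical helper): precompute token -> position once.
order = {token: i for i, token in enumerate(letters)}
new_fast = sorted(old, key=lambda x: order[x[0]])
```

Both versions raise an error (ValueError or KeyError respectively) if a token in old is missing from letters, so every token to be sorted needs an entry in the list.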

Upvotes: 1

matanso

Reputation: 1292

There is no problem with the sorting. Notice that the letter '븐' appears twice in your letters string. Since index returns the index of the first occurrence, letters.index('븐') evaluates to 1, so '븐' sorts immediately after '세븐' (index 0) and ahead of '일' (index 2) and '레' (index 3).
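To see this directly, a short sketch that prints the sort key computed for each element of the question's data (the `positions` name is just for illustration):

```python
letters = '세븐일레븐'
old = [('세븐', 8), ('븐', 2), ('일', 5), ('레', 4)]

# str.index returns the offset of the first match of the substring.
positions = [(token, letters.index(token)) for token, _ in old]
print(positions)  # [('세븐', 0), ('븐', 1), ('일', 2), ('레', 3)]
```

The keys are already in ascending order, which is why sorted returns the list unchanged.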

Upvotes: 1
