Jeff Burka

Reputation: 2571

What's the most efficient way to sort a Text value?

If I have a Data.Text value that I want to sort, should I just unpack it to a String and use sort or some other function on it? It seems like it would be tough to write a fast sorting function for Text values when cons and append are both O(n).

Upvotes: 1

Views: 399

Answers (1)

J. Abrahamson

Reputation: 74354

It depends on what you mean by "sorting" Text. Much of the value of the Text type lies in handling the quirks of human-language text correctly. Probably the best way to sort while taking those quirks into account is to use the text-icu package:

import           Data.List     (sortBy)
import           Data.Text.ICU (collate, uca)
import qualified Data.Text     as T

-- | Sorts using the Unicode Collation Algorithm. Other orderings can be chosen
-- by picking something other than `uca` as your Collator.
sortText :: [T.Text] -> [T.Text]
sortText = sortBy (collate uca)  -- sortBy needs a comparison; sortKey only builds a key

If you do what you suggested in your question (unpack to a String, then compare by lexicographic character order), you may well be slower, since String is a much bulkier type than Text. More importantly, you open yourself up to odd sort orders and odd results after re-packing if your Text values contain Unicode.
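
For reference, the unpack-and-sort approach from the question might look like the sketch below. It assumes you really do want to reorder the characters of a single Text by raw code point, with no collation at all; sortChars is just an illustrative name, not something from the text package.

import           Data.List (sort)
import qualified Data.Text as T

-- | Sort the characters of one Text value by code point.
-- Hypothetical helper: it goes through String, so it pays for the
-- unpack/pack round trip, and it knows nothing about locale or collation.
sortChars :: T.Text -> T.Text
sortChars = T.pack . sort . T.unpack

For plain ASCII this does what you would expect: sortChars (T.pack "banana") gives "aaabnn". For text containing combining marks, each mark is sorted as its own Char, so it can end up attached to a different base character after re-packing.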

Upvotes: 6
