Epoch
Epoch

Reputation: 49

Lexicographically sort a list of Strings without regard for case - Haskell

I have been trying to sort this list of strings for ten hours now. The ten hours was mostly spent fleshing out a workaround but fate is cruel and after all of that effort my workaround still requires me to sort a list of strings without regard for case.

An example of what I would like this function to do:

 Input: ["In", "Thats", "pLAIN"]
 Output: ["In", "pLAIN", "Thats"]

Basically I want to sort the strings as if they were all lowercase but maintain the case of the input

My code thus far:

--This function inserts the strings with capitals in them into the list of lower case only strings
--alphabetically
insertCapitals :: [String] -> [String] -> [String]
insertCapitals capitalList lowerList
--Base case
 |capitalList == [] && lowerList == [] = []
--If the capital list is empty then append the rest of the lowerList
 |capitalList == [] = (lowerListHead:insertCapitals capitalList tailedLowerList)
--If the lowerList is empty append the rest of the capitalList
 |lowerList == [] = (capitalListHead:insertCapitals tailedCapitalList lowerList)
--If the current lowercase word is less than the current uppercase word then append the lowercase word
 |compareResult == LT = (lowerListHead:insertCapitals capitalList tailedLowerList)
--If the current uppercase word is less than the current lowercase word then append the uppercase word
 |compareResult == GT = (capitalListHead:insertCapitals tailedCapitalList lowerList)

This code was part of my workaround attempt. It worked by taking the original list, breaking it into a list of the fully lowercase strings and a list of any of the strings that were left (all of these would have capitals). The problem with my code is that it requires both the list of lowercase strings and the list of all other strings to be in alphabetic order without regard to case.

Essentially my problem came full circle.

Thank you for any help in advance

Upvotes: 1

Views: 808

Answers (2)

Hjulle
Hjulle

Reputation: 2615

You could either do

sortBy (comparing (map toLower))

or equivalently, faster and shorter

sortOn (map toLower)

It does exactly what it says, "sort by comparing the lower case version of the strings".

See documentation here: https://hackage.haskell.org/package/base-4.14.1.0/docs/Data-List.html#v:sortOn

Upvotes: 5

DDub
DDub

Reputation: 3924

Basically I want to sort the strings as if they were all lowercase but maintain the case of the input

There's a neat trick for this: you pair up each actual value with the value you want to sort by. Then, you perform your sort, using the sort-by value for comparisons. Finally, once the list is sorted, you remove the sort-by data.

In your case, you can start by pairing each String in the list with a lowercase version of itself.

import Data.Char (toLower)

lexSort :: [String] -> [String]
lexSort xs = ... $ zip (fmap toLower <$> xs) xs

As per your example, if your input were ["In", "Thats", "pLAIN"], you now have a list that looks like [("in","In"),("thats","Thats"),("plain","pLAIN")]. What's next? Do the sort:

import Data.Char (toLower)
import Data.List (sort)

lexSort :: [String] -> [String]
lexSort xs = ... $ sort $ zip (fmap toLower <$> xs) xs

The Ord instance for pairs will compare first by the first element of the pair, and only if they are equal will it look at the second element. This is perfect in your case, because the first element is the lowercase value, which is what you want to sort on. So, at this point, you'll have [("in","In"),("plain","pLAIN"),("thats","Thats")].

Of course, if this is an assignment where you're not allowed to use the built-in sort, you'll have to write this part yourself, but at least you don't have to worry about case anymore.

And at the end, it's just a matter of getting rid of the lowercase stuff so the result is what you want. One more fmap:

lexSort xs = fmap snd $ sort $ zip (fmap toLower <$> xs) xs

Upvotes: 3

Related Questions