Reputation: 491
For fun I am trying to solve the Wordle related problem Matt Parker talks about in the linked video from his channel Standup Maths in Haskell. Basically: find 5, 5-letter words that do not share any letters in common. Thus using 25 different alphabet letters in total (for English).
It took Matt a month of computer time in Python. He freely admits his code was not at all optimized and he shows a viewer submitted program that solved it in 15 minutes on his laptop.
Here is my naive Haskell code, and note I tried to use parallel computation, but I am sure I don't know what I'm doing there because I'm only seeing one core active. I am not sure it will take a month to run, but after a few hours, my laptop was a toaster and I admitted I am not good at optimizing.
I borrowed an idea from https://www.youtube.com/watch?v=947Ewgue4DM where he encodes the words as 26 bit binary. However, I did not use his mechanism for reducing the size of the comparison lists where he skips ahead to the first possible match. I was not able to figure that out in Haskell.
Also, I did as Matt does and remove anagrams from the word list and then at the end, spit out all the anagrams a given binary encoding corresponds to.
Also, all of the impactful inefficiency is in the unique5Combos function. Meaning, the rest of the code is not efficient either, but it doesn't have much impact. I was able to get pairs of 5 letter words with no duplicate letters for the entire corpus in under 2 seconds. Getting 3 words took a lot longer (not sure how long, I went to get lunch and it was done when I returned), and getting 4 and 5 words unknown time to complete.
Lastly, I am running this on a 8 year old laptop with Arch Linux. So I'm not expecting blazing fast, but I am sure my simple code could be better at limiting the size of data to evaluate and evaluating faster.
I welcome any suggestions. Many thanks in advance.
module Main where
import Data.Bits ( Bits(testBit, (.&.), bit) )
import Data.Char ( ord, chr, intToDigit, toUpper )
import Numeric (showIntAtBase)
import Data.List ( nub, sort )
import Control.Monad ( guard )
import Control.Parallel.Strategies ( parListChunk, rdeepseq, using )
main :: IO ()
main = do
let fn = "/usr/share/dict/words"
w5 <- filter (\w -> length w == 5
&& nub w == w
&& all (\c -> c `elem` ['A'..'Z']) w)
<$> lines
<$> map toUpper
<$> readFile fn
let e5 = (sort . nub . map bitEncode) w5
let e25 = unique5Combos e5 `using` parListChunk 1000000 rdeepseq
let answer = (map . map) (\x -> anagrams (bitDecode x) w5) e25
print $ answer
unique5Combos :: [Int] -> [[Int]]
unique5Combos es = do
a <- es
b <- es
c <- es
d <- es
e <- es
let abcde = [a,b,c,d,e]
guard (isSorted abcde)
guard ((foldr1 (.&.) abcde) == 0)
return abcde
isSorted :: [Int] -> Bool
isSorted = and . (zipWith (<) <*> tail)
bitEncode :: [Char] -> Int
bitEncode xs = sum [bit (90 - ord x) | x <- xs]
bitDecode :: Bits a => a -> [Char]
bitDecode xs = [chr (90 - n) | n <- [25,24..0], testBit xs n]
anagrams :: [Char] -> [[Char]] -> [[Char]]
anagrams xs w5 = [w | w <- w5, (sort xs) == (sort w)]
Upvotes: 2
Views: 325