Reputation: 9028

Find 5 letter words with 25 distinct characters

Solving wordle efficiently (for humans and for computers) is all the rage right now.

One particular way of solving a wordle made me curious. The idea is to select 5 words that have distinct letters so you'll end up with 25 characters. If you use these 5 words as your first 5 guesses in the game, you'll have a close to 100% chance of getting the correct word in your last guess (it's essentially an anagram of all the clues and you'll probably have a few green ones). There is a set of words that is suggested (all of the words are valid English words):

brick
glent
jumpy
vozhd
waqfs

But this made me wonder: How many of these 5 word combinations are out there and I started whipping up a recursive algorithm but I am close to giving up.

My initial thought was:

Start with the first word
reduce overlapping words from the word list
pick the next remaining word in the word list
Repeat with the next word

But this only really works if you have a set of five distinct words in order.

For this list:

brick
feast
glent
jumpy
vozhd
waqfs

I will end up with: [brick, feast, jumpy, vozhd] because feast comes before glent and will filter it out but in the end glent would have been the better pick.

I wasn't able to find any algorithms for this specific problem so I was wondering if there is any existing algorithm that can be applied to this?

Upvotes: 12

Answers (3)

TullyYT

Reputation: 19

I'm currently working on the same thing. (Implemented In Python)

Here's My Code (Explanation Below):

import requests,string
wordlist = str(requests.get('https://gist.githubusercontent.com/dracos/dd0668f281e685bad51479e5acaadb93/raw/ca9018b32e963292473841fb55fd5a62176769b5/valid-wordle-words.txt').content).split('\\n');wordlist[0] = 'aahed'
alphabet = [str(_) for _ in string.ascii_lowercase]
for word in wordlist:
    if [char in word for char in alphabet].count(True) != 5:
        wordlist.remove(word)


for i in range(len(wordlist) ** 2):
    alphabet = [str(_) for _ in string.ascii_lowercase]
    currentwords=[]
    for word in wordlist:
        for char in word:
            alphabet.remove(char)
        currentwords.append(word)
        with open("out.txt", "a") as f:
            f.write(";".join(currentwords))
            f.write("\n")
    wordlist.pop(wordlist.index(currentwords[0]))

Basically, we load the wordlist, remove the duplicates:

for word in wordlist:
    if [char in word for char in alphabet].count(True) != 5:
        wordlist.remove(word)

then loop over the entire wordlist len of wordlist amount of times. we reset/initialize the variables. and loop over every word in the wordlist. we remove each character in the current word from the alphabet (aka left over letters) and add that word to the current words. we then output that to the out.txt file. after we finish the nested loop. we remove the word we just got (since we're done with it) and continue. this method will output combinations of any length (ie. 1,2,3,4,5) and is incredibly inefficient and is still in the making.

Please comment if you have any ideas for optimizing this!

Upvotes: 1

Ajay Sathe

Reputation: 27

My choice of 4 words: batch, field, wrong, musky Works very well for all forms of *ordles Can’t find a fifth word with the remaining letters, though.

Upvotes: -1

Paul Hankin

Reputation: 58339

It's possible to brute-force this. For efficiency, one can discard all words with duplicate letters, and pre-process the words to use a bitmask of which letters they have (there are 26 letters, so this fits in a 32-bit unsigned integer).

Then just do a depth-first search, maintaining a list of words (bitmasks) that don't intersect with the words found so far.

I've written some go code that does this. It uses a shortened list of words that just contains the solution words (the full wordlist is too long to include here), but the code runs in a few seconds even with the full list.

Because it uses bitmasks to represent words, it's possible that there's multiple words with the same letters in the solution. The program shows those with a | inbetween. There's just one pair: cylix|xylic in the solution:

bling treck waqfs jumpy vozhd
pling treck waqfs jumby vozhd
brick glent waqfs jumpy vozhd
kreng clipt waqfs jumby vozhd
fjord chunk vibex gymps waltz
fjord gucks vibex nymph waltz
prick glent waqfs jumby vozhd
kempt brung waqfs cylix|xylic vozhd
blunk waqfs cimex grypt vozhd
clunk waqfs bemix grypt vozhd

It can be run here: https://go.dev/play/p/wVEDjx3G1fE

package main

import (
    "fmt"
    "math/bits"
    "sort"
    "strings"
)

var allWords = []string{
    "bemix", "bling", "blunk", "brick", "brung", "chunk", "cimex", "clipt", "clunk", "cylix", "fjord", "glent", "grypt", "gucks", "gymps", "jumby", "jumpy", "kempt", "kreng", "nymph", "pling", "prick", "treck", "vibex", "vozhd", "waltz", "waqfs", "xylic",
}

func printSol(res []uint32, masks map[uint32][]string) {
    var b strings.Builder
    for i, r := range res {
        if i > 0 {
            b.WriteString(" ")
        }
        b.WriteString(strings.Join(masks[r], "|"))
    }
    fmt.Println(b.String())
}

func find5(w []uint32, mask uint32, n int, res []uint32, masks map[uint32][]string) {
    if n == 5 {
        printSol(res, masks)
        return
    }
    sub := []uint32{}
    for _, x := range w {
        if x&mask != 0 {
            continue
        }
        sub = append(sub, x)
    }
    for i, x := range sub {
        res[n] = x
        find5(sub[i+1:], mask|x, n+1, res, masks)
    }
}

func find5clique() {
    masks := map[uint32][]string{}
    for _, x := range allWords {
        m := uint32(0)
        for _, c := range x {
            m |= 1 << (c - 'a')
        }
        if bits.OnesCount32(m) == 5 {
            masks[m] = append(masks[m], x)
        }
    }
    maskSlice := []uint32{}
    for m := range masks {
        maskSlice = append(maskSlice, m)
    }
    sort.Slice(maskSlice, func(i, j int) bool {
        return maskSlice[i] < maskSlice[j]
    })
    find5(maskSlice, uint32(0), 0, make([]uint32, 5, 5), masks)
}

func main() {
    find5clique()
}

Upvotes: 13

Find 5 letter words with 25 distinct characters

Answers (3)

Related Questions