Lizzy
Lizzy

Reputation: 211

All Five letter words with three character repeats?

I created 5 for loops that go through each letter in the alphabet and then add those all into a list. I then take that list and for each letter I check if that letter has 3 character repeats and then break. However, this is obviously extremely inefficient and my computer is taking very long to return a result. Does anyone have any better ideas on how I can do this and improve the runtime? Examples would be : cccde, yxxxz btw i can't import other libraries other than math, random, string

#generate all possible 5 letter combinations without repeats 
    for one in alpha:
        for two in alpha:
            for three in alpha:
                for four in alpha: 
                    for five in alpha:
                        key = str(one+two+three+four+five)
                        if (key in list1) == False:
                            list1.append(key)

    #filter down to ones with three repeated letters:     
    for word in list1:
        for letter in word: 
            if word.count(letter) == 3:
                list2.append(word)
                break```

Upvotes: 2

Views: 6368

Answers (7)

blhsing
blhsing

Reputation: 106995

You can use itertools.combinations to pick 2 out of 5 indices to insert non-repeating alphabets into, and use itertools.permutations to pick 3 alphabets out of alpha for permutations, the first alphabet of which is to be repeated 3 times, and use itertools.product to combine the combinations and permutations into one sequence.

Note that since you have edited your question to not allow import of itertools, you can simply copy and paste the equivalent code (as shown in the documentation) inline to avoid imports.

from itertools import combinations, permutations, product

output = []
for insert_at, (repeated, *inserted) in product(
        combinations(range(5), 2), permutations(alpha, r=3)):
    output.append(''.join(inserted.pop() if i in insert_at else repeated for i in range(5)))

so that given alpha = 'abcd', output would become:

['cbaaa',
 'dbaaa',
 'bcaaa',
 'dcaaa',
 'bdaaa',
 'cdaaa',
 'cabbb',
 'dabbb',
 'acbbb',
 'dcbbb',
 'adbbb',
 'cdbbb',
 'baccc',
 'daccc',
 'abccc',
 'dbccc',
 'adccc',
 'bdccc',
 'baddd',
 'caddd',
 'abddd',
 'cbddd',
 'acddd',
 'bcddd',
 'cabaa',
 'dabaa',
 'bacaa',
 'dacaa',
 'badaa',
 'cadaa',
 'cbabb',
 'dbabb',
 'abcbb',
 'dbcbb',
 'abdbb',
 'cbdbb',
 'bcacc',
 'dcacc',
 'acbcc',
 'dcbcc',
 'acdcc',
 'bcdcc',
 'bdadd',
 'cdadd',
 'adbdd',
 'cdbdd',
 'adcdd',
 'bdcdd',
 'caaba',
 'daaba',
 'baaca',
 'daaca',
 'baada',
 'caada',
 'cbbab',
 'dbbab',
 'abbcb',
 'dbbcb',
 'abbdb',
 'cbbdb',
 'bccac',
 'dccac',
 'accbc',
 'dccbc',
 'accdc',
 'bccdc',
 'bddad',
 'cddad',
 'addbd',
 'cddbd',
 'addcd',
 'bddcd',
 'caaab',
 'daaab',
 'baaac',
 'daaac',
 'baaad',
 'caaad',
 'cbbba',
 'dbbba',
 'abbbc',
 'dbbbc',
 'abbbd',
 'cbbbd',
 'bccca',
 'dccca',
 'acccb',
 'dcccb',
 'acccd',
 'bcccd',
 'bddda',
 'cddda',
 'adddb',
 'cdddb',
 'adddc',
 'bdddc',
 'acbaa',
 'adbaa',
 'abcaa',
 'adcaa',
 'abdaa',
 'acdaa',
 'bcabb',
 'bdabb',
 'bacbb',
 'bdcbb',
 'badbb',
 'bcdbb',
 'cbacc',
 'cdacc',
 'cabcc',
 'cdbcc',
 'cadcc',
 'cbdcc',
 'dbadd',
 'dcadd',
 'dabdd',
 'dcbdd',
 'dacdd',
 'dbcdd',
 'acaba',
 'adaba',
 'abaca',
 'adaca',
 'abada',
 'acada',
 'bcbab',
 'bdbab',
 'babcb',
 'bdbcb',
 'babdb',
 'bcbdb',
 'cbcac',
 'cdcac',
 'cacbc',
 'cdcbc',
 'cacdc',
 'cbcdc',
 'dbdad',
 'dcdad',
 'dadbd',
 'dcdbd',
 'dadcd',
 'dbdcd',
 'acaab',
 'adaab',
 'abaac',
 'adaac',
 'abaad',
 'acaad',
 'bcbba',
 'bdbba',
 'babbc',
 'bdbbc',
 'babbd',
 'bcbbd',
 'cbcca',
 'cdcca',
 'caccb',
 'cdccb',
 'caccd',
 'cbccd',
 'dbdda',
 'dcdda',
 'daddb',
 'dcddb',
 'daddc',
 'dbddc',
 'aacba',
 'aadba',
 'aabca',
 'aadca',
 'aabda',
 'aacda',
 'bbcab',
 'bbdab',
 'bbacb',
 'bbdcb',
 'bbadb',
 'bbcdb',
 'ccbac',
 'ccdac',
 'ccabc',
 'ccdbc',
 'ccadc',
 'ccbdc',
 'ddbad',
 'ddcad',
 'ddabd',
 'ddcbd',
 'ddacd',
 'ddbcd',
 'aacab',
 'aadab',
 'aabac',
 'aadac',
 'aabad',
 'aacad',
 'bbcba',
 'bbdba',
 'bbabc',
 'bbdbc',
 'bbabd',
 'bbcbd',
 'ccbca',
 'ccdca',
 'ccacb',
 'ccdcb',
 'ccacd',
 'ccbcd',
 'ddbda',
 'ddcda',
 'ddadb',
 'ddcdb',
 'ddadc',
 'ddbdc',
 'aaacb',
 'aaadb',
 'aaabc',
 'aaadc',
 'aaabd',
 'aaacd',
 'bbbca',
 'bbbda',
 'bbbac',
 'bbbdc',
 'bbbad',
 'bbbcd',
 'cccba',
 'cccda',
 'cccab',
 'cccdb',
 'cccad',
 'cccbd',
 'dddba',
 'dddca',
 'dddab',
 'dddcb',
 'dddac',
 'dddbc']

Upvotes: 0

JohanC
JohanC

Reputation: 80449

All valid strings only have at most 3 different letters. Either the first is repeated, or the second, or the third. We need some test to prevent that a string that was generated earlier will be added another time:

alpha = list('abcdefghijklmnopqrstuvwxyz')

l = []
for one in alpha:
    for two in alpha:
        for three in alpha:
            l.append(one+one+one+two+three)
            if one != two:
                l.append(one+two+two+two+three)
            if two != three:
                l.append(one+two+three+three+three)
print(len(l), l)

If you interpret the question such that the repeated letters don't need to be consecutive, the program probably would change to:

alpha = list('abcdefghijklmnopqrstuvwxyz')
l = []
for one in alpha:
    for two in alpha:
        for three in alpha:
            l.append(one+one+one+two+three)
            if one != two:
                l.append(one + one + two + one + three)
                l.append(one + two + one + one + three)
                l.append(two + one + one + one + three)
                if one != three:
                    l.append(one + one + two + three + one)
                    l.append(one + two + one + three + one)
                    l.append(one + two + three + one + one)
                    l.append(two + one + one + three + one)
                    l.append(two + one + three + one + one)
                    l.append(two + three + one + one + one)
print(len(l))
l = set(l)
print(len(l))

This prints two times 165776. Of course just leaving out the duplication tests and converting to a set is simpler, but less fun.

The code can be easily adapted to not allow for 4 or 5 times repeated letters just by having one surrounding if-test to check one != two and one != three. In that case there are 162500 solutions.

165776 is also the count of @AlexHall if the code is changed to allow for 3 or four times repetition. And 162500 for the case of exactly 3 repetitions.

For completeness: the case of at least three consecutive letters has 51376 solutions. And the case of exactly three consecutive letters has 50700 solutions.

Upvotes: 2

maurera
maurera

Reputation: 1679

Here's an attempt at it with only 4 lines:

# Start with set of all letters
s = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'}

# Get all combinations of 5 letters
allcombinations = [(a,b,c,d,e) for a in s for b in s for c in s for d in s for e in s]

# Transform to list
biglist = list(allcombinations)

# Filter for only those that have 3 or more repeats and flatten
result = [a for x in s for a in list(filter(lambda y:y.count(x)==3, biglist))]

Here's a timing test (to avoid waiting forever, I've scaled down the number of loops to 3 and matches to 2 in the timing test):

# My version
def v1():
    s = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'}
    allcombinations = [(a,b,c) for a in s for b in s for c in s]
    biglist = list(allcombinations)
    result = [a for x in s for a in list(filter(lambda y:y.count(x)==2, biglist))]
    return result

# OP's version
def v2():
    # generate all possible 5 letter combinations without repeats
    alpha = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
         'w', 'x', 'y', 'z'}
    list1 = []
    for one in alpha:
        for two in alpha:
            for three in alpha:
                key = str(one + two + three)
                if (key in list1) == False:
                    list1.append(key)

    # filter down to ones with three repeated letters:
    list2 = []
    for word in list1:
        for letter in word:
            if word.count(letter) == 2:
                list2.append(word)
                break
    return list2

import time
start_time = time.time()
res1 = v1()
print(f'My version: {len(res1)} entries')
print(f'--- {time.time() - start_time} seconds ---')

start_time = time.time()
res2 = v2()
print(f'OP\'s version: {len(res2)} entries')
print(f'--- {time.time() - start_time} seconds ---')

Which outputs:

My version: 1950 entries
--- 0.06479763984680176 seconds ---
OP's version: 1950 entries
--- 1.5608229637145996 seconds ---

Upvotes: -1

Aliakbar Saleh
Aliakbar Saleh

Reputation: 649

I have a suggestion with only one for loop, which is reasonably fast on my PC. First you need cartesian product of 5 set (mathematical set) for creating all 5 letter words with 3 repeated charachters. After that you can generate all of permutations of these 5 letter words to generate new words. For cleaning repeated words, you can map the output of previous step to set (deletes the repeated items).

from itertools import product, permutations
import string

letters = list(string.ascii_lowercase)
result = []
for repeated in string.ascii_lowercase:
    letters.remove(repeated)

    five_letter_repeated_at_end = product(letters, letters, [repeated], [repeated], [repeated])
    five_letter_words = map(lambda word: set(permutations(word)), five_letter_repeated_at_end)
    current_letter_words = []
    for permutes in five_letter_words:
        permutes = list(map(''.join, permutes))
        current_letter_words.extend(list(permutes))
    result.extend(current_letter_words)

    letters.append(repeated)
print(result)

Upvotes: 0

Alex Hall
Alex Hall

Reputation: 36043

The biggest problem with your current algorithm is checking if (key in list1) == False:. There is no reason to do that since there would never be any duplicates in your loop and it makes your algorithm O(n^2) instead of O(n) where n is the size of the list. This code runs in reasonable time:

import string

alpha = string.ascii_lowercase
list1 = []

for one in alpha:
    for two in alpha:
        for three in alpha:
            for four in alpha:
                for five in alpha:
                    word = one + two + three + four + five
                    for letter in word:
                        if word.count(letter) == 3:
                            list1.append(word)
                            break

print(len(list1))

Upvotes: 1

mathfux
mathfux

Reputation: 5949

Beloved itertools too (based on Prune's method):

from itertools import product, chain
def grow(n):
    '''takes word of three letters and grows another three words'''
    #example: abc -> [aaabc, abbbc, abccc]
    branch1 = n[0]*3+n[1]+n[2]
    branch2 = n[0]+n[1]*3+n[2]
    branch3 = n[0]+n[1]+n[2]*3
    return branch1, branch2, branch3

triple_combinations_generator = product(alpha, repeat=3)
triple_combinations = map(lambda x: x[0]+x[1]+x[2], triple_combinations_generator)
grown_triple_combinations= map(lambda x: grow(x), triple_combinations)
merged_triples = chain(*grown_triple_combinations)
result = set(merged_triples)
print(result)

Upvotes: 0

Prune
Prune

Reputation: 77860

  • Generate all 3-letter strings using itertools.product on the alphabet, with a repetition of 3.
  • Iterate through those permutations; for each one, triple each letter, giving three different 5-letter solutions. For instance, from abc, you will generate aaabc, abbbc, and abccc.
  • Remove the duplicates: 3-letter words with doubled (or tripled) letters will produce duplicates. For instance, aab will generate aaaab in two different ways.

Is that enough hint?

Upvotes: 2

Related Questions