How to create a dictionary of word parts from the words that start and end with the same letter

Question

I am trying to write a function, read_dict(dictionary), that takes the path to a .txt file as its argument and returns a dictionary of word parts for every word in the file. Each key is the first and last letter of a word, and each value is the set of remaining (inner) letters of the words that share that key. For example, if the file is as follows:

===dictionary.txt===
quack  qk
quick qk
going gg
gathering gg
quirk qk
quicken qn

The output should be:

{ 'qk' : {'uac', 'uic'}, 'gg' : {'oin', 'atherin'}, 'qn' : {'uicke' }}

I wrote this:

def outside(word):
    a = word.strip()[0]
    b = word.strip()[-1]
    out_word = a + b
    return out_word


def inside(word):
    a = word.strip()[1:-1]
    return a


def read_dict(dictionary):
    a = {}
    with open(dictionary, 'r') as text:
        data = text.readlines()
        for i in data:
            a[outside(i)] = inside(i)
    return a

But my output is:

{ 'qk' : 'uac', 'gg' : 'oin', 'qn' : 'uicke'}

It only saves one word per key. I also couldn't find a way to gather all the inside(word) results that share the same outer letters into a set and then add that set to the dictionary under the appropriate key, such as 'qk'.

Ma0 · Accepted Answer

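As @Ch3steR says, this can be easily achieved with collections.defaultdict. Your version assigns with a[outside(i)] = inside(i), which replaces the stored value every time a key repeats, so each key ends up with a single string instead of a set. Modify your code to this: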

from collections import defaultdict

def read_dict(dictionary):
    a = defaultdict(set)
    with open(dictionary, 'r') as text:
        data = text.readlines()
        for i in data:
            a[outside(i)].add(inside(i))
    return a

If you would rather not import anything (collections is part of the standard library, not an external dependency), you can do:

def read_dict(dictionary):
    a = {}
    with open(dictionary, 'r') as text:
        data = text.readlines()
        for i in data:
            key = outside(i)
            if key in a:
                a[key].add(inside(i))
            else:
                a[key] = {inside(i)}
    return a

Comparing the two code snippets also shows what collections.defaultdict does: when a key is missing, it creates the default value for you (here an empty set), so the if/else branch disappears and you write less code.
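To try the grouping without a file on disk, here is a minimal self-contained sketch. It assumes each line holds a single word (the two-letter codes in the sample listing are treated as annotations, not file content). The name group_word_parts and the guard that skips blank lines and one-letter words are illustrative additions, not part of the original code; without such a guard, word.strip()[0] raises IndexError on an empty line.

```python
from collections import defaultdict


def group_word_parts(lines):
    """Group the inner letters of each word under its first+last letter key.

    Hypothetical helper: accepts any iterable of lines, so it works with an
    open file object as well as a plain list of strings.
    """
    groups = defaultdict(set)
    for line in lines:
        word = line.strip()
        if len(word) < 2:
            # Assumption: blank lines and one-letter words are skipped,
            # since they have no distinct first/last letter pair.
            continue
        # Accessing a missing key creates an empty set, so .add() always works.
        groups[word[0] + word[-1]].add(word[1:-1])
    # Convert to a plain dict so later lookups of missing keys raise KeyError
    # instead of silently creating empty sets.
    return dict(groups)


sample = ["quack\n", "quick\n", "going\n", "gathering\n", "\n", "quicken\n"]
result = group_word_parts(sample)
print(result)
```

Because the function only iterates over its argument, it could also be called as group_word_parts(text) inside the with open(...) block from the answer.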
