arty

Reputation: 669

How to create a dictionary of word parts from the words that start and end with the same letter

I am trying to write a function, read_dict(dictionary), that takes the path to a .txt file as its argument and returns a dictionary of word parts for every word in the file. Each key is the first and last letter of a word, and each value is the set of remaining (inner) letters of the words that share that key. For example, if the file is as follows:

===dictionary.txt===
quack  qk
quick qk
going gg
gathering gg
quirk qk
quicken qn

the output should be:

{ 'qk' : {'uac', 'uic', 'uir'}, 'gg' : {'oin', 'atherin'}, 'qn' : {'uicke'} }

I wrote this:

def outside(word):
    a = word.strip()[0]
    b = word.strip()[-1]
    out_word = a + b
    return out_word


def inside(word):
    a = word.strip()[1:-1]
    return a


def read_dict(dictionary):
    a = {}
    with open(dictionary, 'r') as text:
        data = text.readlines()
        for i in data:
            a[outside(i)] = inside(i)
    return a

But my output is:

{ 'qk' : 'uac', 'gg' : 'oin', 'qn' : 'uicke'}

It only saves one word per key. I also couldn't find a way to gather all the inside(word) values that share the same outer letters into a set, and then add that set to the dictionary under the appropriate key, such as 'qk'.

Upvotes: 0

Views: 69

Answers (2)

gen_Eric

Reputation: 227310

You need to make a[outside(i)] a list and append each new item to it, instead of just overwriting it each time you find a new one.

Also, why do you grab the 1st and last letters of the word, when you already have those in the file for you?

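def read_dict(dictionary):
    a = {}

    with open(dictionary, 'r') as text:
        # Process the file one line at a time
        for line in text:
            # split() with no argument copes with repeated spaces and the newline
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines

            # Each line holds the word followed by its two-letter key
            value, key = parts

            if key not in a:
                a[key] = []

            # Store the word without its first and last letter
            a[key].append(value[1:-1])

    return a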
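
If the values should be sets, as in the question's expected output, the same grouping idea can be sketched with dict.setdefault. This is only an illustration: group_word_parts is a hypothetical helper name, and it takes an iterable of lines instead of a file path so it can be tried without a file on disk. It derives the key from the word itself and ignores a second column if one is present.

```python
def group_word_parts(lines):
    """Group the inner letters of each word under its first+last letter key."""
    groups = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        word = fields[0]  # ignore a trailing key column if the line has one
        if len(word) < 2:
            continue  # a one-letter word has no distinct first and last letter
        key = word[0] + word[-1]
        # setdefault returns the existing set, or stores and returns a new one
        groups.setdefault(key, set()).add(word[1:-1])
    return groups


sample = ["quack  qk", "quick qk", "going gg", "gathering gg", "quirk qk", "quicken qn"]
result = group_word_parts(sample)
```

With a real file, the open file object can be passed directly, for example group_word_parts(text) inside the with block.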

Upvotes: 2

Ma0

Reputation: 15204

As @Ch3steR says, this can be easily achieved with collections.defaultdict. Modify your code to this:

from collections import defaultdict

def read_dict(dictionary):
    a = defaultdict(set)
    with open(dictionary, 'r') as text:
        data = text.readlines()
        for i in data:
            a[outside(i)].add(inside(i))
    return a

collections is part of the standard library, so nothing needs to be installed. If you would rather not import anything at all, you can do:

def read_dict(dictionary):
    a = {}
    with open(dictionary, 'r') as text:
        data = text.readlines()
        for i in data:
            key = outside(i)
            if key in a:
                a[key].add(inside(i))
            else:
                a[key] = {inside(i)}
    return a

By comparing the two code snippets you also get an idea of what collections.defaultdict does and how it allows you to write less code.
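
To make that difference concrete, here is a small standalone sketch (the variable names plain, auto and as_dict are just for illustration). A regular dict raises KeyError for a missing key, while defaultdict(set) creates an empty set on first access:

```python
from collections import defaultdict

plain = {}
auto = defaultdict(set)

# A plain dict has no entry for 'qk' yet, so the lookup fails
try:
    plain["qk"].add("uac")
except KeyError:
    plain["qk"] = {"uac"}  # the entry has to be created by hand

# defaultdict calls set() for the missing key, then .add() works right away
auto["qk"].add("uac")
auto["qk"].add("uic")

# Convert back to a regular dict if the defaulting behaviour is not wanted later
as_dict = dict(auto)
```

Note that merely reading a missing key from a defaultdict also creates it, which is one reason to convert back with dict() before returning the result.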

Upvotes: 3
