J_lll

Reputation: 55

How to build a dictionary of words in text

How would I return a dictionary in which each key is a word from the given text and each value is a list of the words that come before it in the text?

e.g.

text = "hi my name is"
my_dict = get_previous_words_dict(text)

returns a dictionary:

>>> my_dict['hi']
[]
>>> my_dict['my']
['hi']    
>>> my_dict['name']
['hi', 'my']

Upvotes: 0

Views: 3723

Answers (4)

dawg

Reputation: 103844

>>> t="hi my name is"
>>> li=t.split()

You can use a dict comprehension (this version counts down, so the previous words are listed nearest first):

>>> {w:[li[si] for si in range(i-1,-1,-1)] for i, w in enumerate(li)}
{'is': ['name', 'my', 'hi'], 'hi': [], 'my': ['hi'], 'name': ['my', 'hi']}

Or, counting up, so the previous words keep their original order:

>>> {w:[li[si] for si in range(0,i)] for i, w in enumerate(li)}
{'is': ['hi', 'my', 'name'], 'hi': [], 'my': ['hi'], 'name': ['hi', 'my']}

Or use a slice instead of the nested list comprehension:

>>> {w:li[0:i] for i, w in enumerate(li)}
{'is': ['hi', 'my', 'name'], 'hi': [], 'my': ['hi'], 'name': ['hi', 'my']}
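One caveat that may be worth checking with any of these comprehensions: if a word repeats, the later occurrence overwrites the earlier key, so only the last list of previous words survives. A small sketch, using a sample sentence made up for illustration:

```python
# Sample sentence with a repeated word (an assumption for illustration).
li = "a b a c".split()

# Same slice-based comprehension as above.
d = {w: li[0:i] for i, w in enumerate(li)}

# 'a' occurs at index 0 and index 2; the entry built at index 2 wins.
print(d['a'])  # ['a', 'b']
print(d['c'])  # ['a', 'b', 'a']
```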

Upvotes: 1

Louisiana Hasek

Reputation: 445

This only makes sense if the words in the sentence are unique, as @cjds points out. Also, the value for the first word should surely be an empty list, not a list containing the empty string. The following will fit this specification:

def get_previous_words_dict(text):
    words = []
    dictionary = {}
    for word in text.split():
        dictionary[word] = words[:]
        words.append(word)
    return dictionary

The most important thing to understand is the assignment:

dictionary[word] = words[:]

The slice words[:] makes a shallow copy of the words list. If it were a plain assignment:

dictionary[word] = words

Then every dictionary entry would refer to the same words list, so by the end of the loop every entry would contain all of the words.
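The difference can be seen in a minimal sketch that runs both assignments side by side. The names aliased and copied are made up for this illustration:

```python
words = []
aliased = {}  # stores the same list object on every iteration
copied = {}   # stores a snapshot of the list via words[:]

for word in "hi my name".split():
    aliased[word] = words
    copied[word] = words[:]
    words.append(word)

print(aliased['hi'])                   # ['hi', 'my', 'name']
print(copied['hi'])                    # []
print(aliased['hi'] is aliased['my'])  # True: one shared list
```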

Upvotes: 1

Hesham Attia

Reputation: 977

  1. Split the sentence into words:

    sentence_words = sentence.split(' ')
    
  2. Create a dictionary where the key is the word, and the value is a slice of sentence_words from the beginning to the position of this word.

    d = {w: sentence_words[:i] for i, w in enumerate(sentence_words)}
    

Sample Code

sentence = "Hi my name is John"
sentence_words = sentence.split(' ')
d = {w: sentence_words[:i] for i, w in enumerate(sentence_words)}
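As a quick check of the sample code, the sketch below prints one entry of the result. It also shows one thing to be aware of: split(' ') treats every single space as a separator, so a doubled space produces an empty string as a "word", whereas split() with no argument collapses runs of whitespace. The doubled-space input is made up for illustration.

```python
sentence = "Hi my name is John"
sentence_words = sentence.split(' ')
d = {w: sentence_words[:i] for i, w in enumerate(sentence_words)}
print(d['John'])  # ['Hi', 'my', 'name', 'is']

# A doubled space gives an empty-string entry with split(' ') ...
print("Hi  my".split(' '))  # ['Hi', '', 'my']
# ... but not with split(), which splits on any run of whitespace.
print("Hi  my".split())     # ['Hi', 'my']
```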

Upvotes: 0

BugsBunny

Reputation: 99

If I were to implement from scratch:

Use a hash table (i.e. a dictionary) to store the words. When inserting a word, store it as key => [keys already in the hash]. This relies on the hash keeping its keys in insertion order.
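A rough sketch of that idea in Python, assuming Python 3.7+ where dict preserves insertion order. Skipping repeated words is a choice made here for illustration, not something the description above specifies:

```python
def get_previous_words_dict(text):
    previous = {}
    for word in text.split():
        # Assumption: keep the first occurrence of a repeated word,
        # so a word never ends up listed among its own predecessors.
        if word not in previous:
            # list(previous) returns the keys inserted so far, in order.
            previous[word] = list(previous)
    return previous

my_dict = get_previous_words_dict("hi my name is")
print(my_dict['name'])  # ['hi', 'my']
```

Note that with this choice a repeated word keeps the list from its first occurrence, which differs from the comprehension-based answers, where the last occurrence wins.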

Upvotes: 0
