Reputation: 6568
I have a dictionary of sentences, keyed by the book and the page from which each sentence came:
from collections import defaultdict

# lists to build dictionary - for reproducibility
pages = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']
sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT'
             ]

# Make dictionary: coordinates[book][page] -> sentence
coordinates = defaultdict(dict)
for i in range(len(pages)):
    book = bookCodes[i]
    page = pages[i]
    sentence = sentences[i]
    coordinates[book][page] = sentence

print(coordinates)
defaultdict(<type 'dict'>, {'A': {50: 'ALLFROMDIFFERENTBOOKS', 1021: 'MOSLTYTHESAMELENGTHSS'}, 'P': {41: 'ANDHEREISONEMOREEXAMP'}, 'C': {111: 'ANDFROMDIFFERENTPAGES'}, 'M': {121: 'BUTSOMEWILLBABITSHORT', 12: 'THISISASENTANCE'}})
I also have a pool of vowels stored as a dictionary, so that each vowel starts with a count of 10:
vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)
I want to iterate over the same element of each sentence (sentences[0][0], sentences[1][0], ..., then sentences[0][1], and so on) and, each time I see a vowel (A, E, I, O, or U), reduce the count of that vowel in the vowels dictionary.
Once a pool of vowels hits 0, I return the letter, its position in the sentence, and the sentence, and break the loop.
from collections import defaultdict
import random

def wordStopper(sentences):
    random.shuffle(sentences)
    vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)
    # Bound by the longest sentence; after the shuffle, sentences[1] may be a short one
    for i in range(max(len(s) for s in sentences)):
        for s in sentences:
            try:
                l = s[i:i + 1]
            except IndexError:
                continue
            if l in vowels:
                vowels[l] -= 1
                print("Pos: %s, Letter: %s, Sentence: %s" % (i, l, s))
                print("As = %s, Es = %s, Is = %s, Os = %s, Us = %s" % (vowels['A'], vowels['E'], vowels['I'], vowels['O'], vowels['U']))
                if vowels[l] == 0:
                    return (l, i, s)

letter, location, sentence = wordStopper(sentences)
print("Vowel %s exhausted here %s in sentence: %s" % (letter, location, sentence))
It's important that the sentences list is shuffled (and that I iterate over element 0 in all sentences, and then element 1), so that I don't bias towards earlier entries in the sentences list.
This works as I expect, but I now want to retrieve the book and page number from which the sentence was pulled, which are stored in coordinates.
I can crudely achieve this by iterating over coordinates and finding the sentence that is returned from wordStopper:
print(coordinates)
for book in coordinates.keys():
    # .items() works on both Python 2 and 3 (.iteritems() is Python 2 only)
    for page, s in coordinates[book].items():
        if s == sentence:
            print("Book:%s, page: %s, position: %s, vowel: %s, sentence: %s" % (book, page, location, letter, sentence))
However, this strikes me as a fairly poor way of achieving this.
Normally, I might iterate over the keys of coordinates before the sentences, but I can't see a way to do this that doesn't bias the results towards the first keys that are iterated over.
Any suggestions are very welcome. Note: this is a toy example, so I'm not looking to use any corpus parsing tools.
Upvotes: 0
Views: 76
Reputation: 10797
I think what you need is a better data structure, one that lets you retrieve the book and page from the sentence. There are many possible designs. This is what I would do:
First, create a data structure that holds a sentence along with its book and page:
class SentenceWithMeta(object):
    def __init__(self, sentence):
        self.sentence = sentence
        self.book = None
        self.page = None
Then, hold all your sentences. For example:
sentences_with_meta = [SentenceWithMeta(sentence) for sentence in sentences]
At this point, initialize the book and page fields of each entry in sentences_with_meta:
# Attach the book and page to each sentence
sentences_with_meta = [SentenceWithMeta(sentence) for sentence in sentences]
for i in range(len(pages)):
    book = bookCodes[i]
    page = pages[i]
    sentence_with_meta = sentences_with_meta[i]
    sentence_with_meta.book = book
    sentence_with_meta.page = page
Finally, in the wordStopper function, work with the sentences_with_meta list as follows:
def wordStopper(sentences_with_meta):
    random.shuffle(sentences_with_meta)
    vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)
    # Bound by the longest sentence in the list
    for i in range(max(len(swm.sentence) for swm in sentences_with_meta)):
        for swm in sentences_with_meta:
            try:
                l = swm.sentence[i:i + 1]
            ...
            # the rest of the code is the same. You return swm, which has the book
            # and page already in the structure.
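For reference, here is a minimal end-to-end sketch of this approach, assuming the elided part mirrors the loop in the question. The names word_stopper and pool_size are illustrative and not from the original code, the constructor takes book and page directly for brevity, and the progress prints are left out.

```python
import random


class SentenceWithMeta(object):
    """A sentence together with the book and page it came from."""

    def __init__(self, sentence, book=None, page=None):
        self.sentence = sentence
        self.book = book
        self.page = page


def word_stopper(sentences_with_meta, pool_size=10):
    """Return (vowel, position, record) for the first exhausted vowel, else None."""
    records = list(sentences_with_meta)  # copy, so the caller's list keeps its order
    random.shuffle(records)
    vowels = dict.fromkeys("AEIOU", pool_size)
    longest = max(len(r.sentence) for r in records)
    for i in range(longest):
        for r in records:
            if i >= len(r.sentence):
                continue  # this sentence is shorter than position i
            letter = r.sentence[i]
            if letter in vowels:
                vowels[letter] -= 1
                if vowels[letter] == 0:
                    return letter, i, r
    return None  # no vowel pool was exhausted


pages = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']
sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT']

records = [SentenceWithMeta(s, b, p)
           for s, b, p in zip(sentences, bookCodes, pages)]

result = word_stopper(records)
if result is not None:
    letter, location, swm = result
    print("Book: %s, page: %s, position: %s, vowel: %s, sentence: %s"
          % (swm.book, swm.page, location, letter, swm.sentence))
```

Because the record travels with the sentence through the shuffle, no lookup in coordinates is needed afterwards.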
Side note: to get letter i from a string, you don't need a slice. Just index it:
l = swm.sentence[i]
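The two forms behave differently once i runs past the end of a shorter sentence, which is what makes the except IndexError clause meaningful with indexing. A quick illustration, using one of the sample sentences:

```python
s = "THISISASENTANCE"  # 15 characters, so valid indexes are 0..14

# A slice past the end quietly returns an empty string.
past_end_slice = s[20:21]
print(repr(past_end_slice))

# An index past the end raises IndexError instead.
try:
    s[20]
    raised = False
except IndexError:
    raised = True
print(raised)
```

With the slice, the try/except never fires and the empty string simply fails the vowel test; with the index, the except clause is what skips the short sentence.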
There are many other designs that would work as well.
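One such alternative, sketched under the assumption that every sentence is unique: leave wordStopper as in the question and build a reverse index from sentence to (book, page) once, so the lookup afterwards is a single dictionary access instead of a nested loop. The name sentence_index is illustrative.

```python
from collections import defaultdict

pages = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']
sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT']

coordinates = defaultdict(dict)
for book, page, sentence in zip(bookCodes, pages, sentences):
    coordinates[book][page] = sentence

# Reverse index: sentence -> (book, page). Assumes sentences are unique;
# a duplicated sentence would keep only the last (book, page) seen.
sentence_index = {
    sentence: (book, page)
    for book, by_page in coordinates.items()
    for page, sentence in by_page.items()
}

book, page = sentence_index['MOSLTYTHESAMELENGTHSS']
print("Book: %s, page: %s" % (book, page))  # Book: A, page: 1021
```

This keeps the existing coordinates dictionary as the source of truth, at the cost of breaking down if two pages ever hold the same sentence.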
Upvotes: 1