fugu

Reputation: 6568

Iterate over elements of strings at the same time

I have a dictionary of sentences, keyed by the book and the page each one came from:

# lists to build dictionary - for reproducibility  
pages     = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']

sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT'
             ]

# Make dictionary: coordinates[book][page] = sentence
from collections import defaultdict

coordinates = defaultdict(dict)
for book, page, sentence in zip(bookCodes, pages, sentences):
    coordinates[book][page] = sentence

print coordinates

defaultdict(<type 'dict'>, {'A': {50: 'ALLFROMDIFFERENTBOOKS', 1021: 'MOSLTYTHESAMELENGTHSS'}, 'P': {41: 'ANDHEREISONEMOREEXAMP'}, 'C': {111: 'ANDFROMDIFFERENTPAGES'}, 'M': {121: 'BUTSOMEWILLBABITSHORT', 12: 'THISISASENTANCE'}})

I also have a pool of vowels stored as a dictionary, so that each vowel starts with a count of 10:

vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)

I want to iterate over the same position in every sentence (sentences[0][0], sentences[1][0], ..., then sentences[0][1], and so on) and, each time I see a vowel (A, E, I, O, or U), decrement that vowel's count in the vowels dictionary.

Once any vowel's pool hits 0, I return the letter, its position in the sentence, and the sentence, which ends the loop.

from collections import defaultdict
import random

def wordStopper(sentences):
    random.shuffle(sentences)  # shuffles in place so no sentence is favoured
    vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)
    for i in range(len(sentences[1])):
        for s in sentences:
            try:
                l = s[i:i + 1]
            except IndexError:
                continue

            if l in vowels:
                vowels[l] -= 1
                print("Pos: %s, Letter: %s, Sentence: %s" % (i, l, s))
            print("As = %s, Es = %s, Is = %s, Os = %s, Us = %s" %(vowels['A'], vowels['E'], vowels['I'], vowels['O'],  vowels['U']))
            # Only vowels are keys of the pool; looking up a consonant would raise KeyError
            if l in vowels and vowels[l] == 0:
                return(l, i, s)

letter, location, sentence = wordStopper(sentences)
print("Vowel %s exhausted here %s in sentence: %s" % (letter, location, sentence))

It's important that the sentences list is shuffled (and that I iterate over element 0 in all sentences, and then element 1), so that I don't bias towards earlier entries in the sentences list.

This works as I expect, but I now want to retrieve the book and page number the sentence was pulled from, which are stored in coordinates.

I can crudely achieve this by iterating over coordinates and finding the sentence that is returned from wordStopper:

print coordinates

for book in coordinates.keys():
    for page, s in coordinates[book].iteritems():
        if s == sentence:
            print("Book:%s, page: %s, position: %s, vowel: %s, sentence: %s" % (book, page, location, letter, sentence))

However, this strikes me as a fairly poor way of achieving this.

Normally, I might iterate over the keys of coordinates before the sentences, but I can't see a way to do this so that it doesn't bias the results towards the first keys that are iterated over.

Any suggestions are very welcome. Note: this is a toy example, so I'm not looking to use any corpus parsing tools.

Upvotes: 0

Views: 76

Answers (1)

Uri London

Reputation: 10797

I think what you need is a better data structure, one that lets you retrieve the book/page from the sentence. There are many possible designs. This is what I would do:

First, create a data structure that holds a sentence along with its book/page:

class SentenceWithMeta(object):
    def __init__(self, sentence):
        self.sentence = sentence
        self.book = None
        self.page = None

Then, hold all your sentences. For example:

sentences_with_meta = [SentenceWithMeta(sentence) for sentence in sentences]

At this point, fill in the book and page fields of each item in sentences_with_meta:

# Attach the book and page to each sentence
sentences_with_meta = [SentenceWithMeta(sentence) for sentence in sentences]
for i in range(len(pages)):
    book = bookCodes[i]
    page = pages[i]
    sentence_with_meta = sentences_with_meta[i]
    sentence_with_meta.book = book
    sentence_with_meta.page = page

Finally, in the wordStopper function, work with the sentences_with_meta list, like this:

def wordStopper(sentences_with_meta):
    random.shuffle(sentences_with_meta)
    vowels = dict.fromkeys(['A', 'E', 'I', 'O', 'U'], 10)
    for i in range(len(sentences_with_meta[1].sentence)):
        for swm in sentences_with_meta:
            try:
                l = swm.sentence[i:i + 1]
    ...
    # the rest of the code is the same. You return swm, which has the book
    # and page already in the structure.
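
Putting the pieces together, a minimal runnable sketch might look like the following. It is one possible way to complete the elided part, not the only one. The book and page constructor arguments, the start_count and rng parameters, and the choice to shuffle a copy and scan up to the longest sentence are assumptions added for illustration; they are not in the original code.

```python
import random


class SentenceWithMeta(object):
    """A sentence bundled with the book and page it came from."""

    def __init__(self, sentence, book=None, page=None):
        self.sentence = sentence
        self.book = book
        self.page = page


def wordStopper(sentences_with_meta, start_count=10, rng=random):
    """Return (letter, position, swm) for the first vowel pool to reach 0,
    or None if no pool is exhausted. start_count and rng are assumed extras."""
    items = list(sentences_with_meta)  # copy, so the caller's list keeps its order
    rng.shuffle(items)                 # random order avoids favouring early entries
    vowels = dict.fromkeys('AEIOU', start_count)
    longest = max(len(swm.sentence) for swm in items) if items else 0
    for i in range(longest):           # position 0 of every sentence, then 1, ...
        for swm in items:
            if i >= len(swm.sentence):  # this sentence is shorter; skip it
                continue
            letter = swm.sentence[i]
            if letter in vowels:
                vowels[letter] -= 1
                if vowels[letter] == 0:
                    return letter, i, swm
    return None


pages = [12, 41, 50, 111, 1021, 121]
bookCodes = ['M', 'P', 'A', 'C', 'A', 'M']
sentences = ['THISISASENTANCE',
             'ANDHEREISONEMOREEXAMP',
             'ALLFROMDIFFERENTBOOKS',
             'ANDFROMDIFFERENTPAGES',
             'MOSLTYTHESAMELENGTHSS',
             'BUTSOMEWILLBABITSHORT']

sentences_with_meta = [SentenceWithMeta(s, b, p)
                       for s, b, p in zip(sentences, bookCodes, pages)]

result = wordStopper(sentences_with_meta)
if result is not None:
    letter, location, swm = result
    print("Book: %s, page: %s, position: %s, vowel: %s, sentence: %s"
          % (swm.book, swm.page, location, letter, swm.sentence))
```

Because the returned object carries its own book and page, no second pass over coordinates is needed.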

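Side note: to get letter i from a string, you don't need a slice. Just index it directly. Keep in mind that, unlike a slice, indexing past the end of a shorter sentence raises IndexError, which is the case the try/except is there to handle: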

l = swm.sentence[i]
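
As a small illustration of how the two forms differ at the end of a short sentence, here is a sketch using one of the sample sentences. The variable names are only for this example.

```python
sentence = 'THISISASENTANCE'  # 15 characters, valid indices 0..14

in_range_index = sentence[3]      # a single character
in_range_slice = sentence[3:4]    # the same character, as a one-letter string
past_end_slice = sentence[20:21]  # a slice past the end is empty, no error

try:
    sentence[20]                  # indexing past the end raises
    raised = False
except IndexError:
    raised = True

print(repr(in_range_index), repr(in_range_slice), repr(past_end_slice), raised)
```

So with the slice, the except IndexError branch is never reached; with plain indexing it is what skips the shorter sentences.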

Many other designs would work as well.
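
One such alternative, sketched here under the assumption that you want to keep the existing coordinates dictionary, is to flatten it into (book, page, sentence) records and shuffle those. Shuffling the flattened list means the order in which the dictionary keys are visited no longer matters. Record and flatten are hypothetical names chosen for this example.

```python
import random
from collections import defaultdict, namedtuple

# Hypothetical record type: one entry per sentence, with its origin attached
Record = namedtuple('Record', ['book', 'page', 'sentence'])


def flatten(coordinates):
    """Turn coordinates[book][page] = sentence into a flat list of Records."""
    return [Record(book, page, sentence)
            for book, by_page in coordinates.items()
            for page, sentence in by_page.items()]


coordinates = defaultdict(dict)
coordinates['A'][50] = 'ALLFROMDIFFERENTBOOKS'
coordinates['A'][1021] = 'MOSLTYTHESAMELENGTHSS'
coordinates['M'][12] = 'THISISASENTANCE'

records = flatten(coordinates)
random.shuffle(records)  # key order of the dictionary no longer biases the scan

for rec in records:
    print("%s %s %s" % (rec.book, rec.page, rec.sentence))
```

The loop in wordStopper can then read rec.sentence where it used s, and return rec so that rec.book and rec.page come back with the result.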

Upvotes: 1
