Samir

Reputation: 3197

Word count with pattern in Python

So this is the question:

Write a program to read in multiple lines of text and count the number of words in which the rule i before e, except after c is broken, and number of words which contain either ei or ie and which don't break the rule.

For this question, we only care about the c if it is the character immediately before the ie or the ei. So science counts as breaking the rule, but mischievous doesn't. If a word breaks the rule twice (like obeisancies), then it should still only be counted once.

Example given:

Line: The science heist succeeded
Line: challenge accepted
Line: 
Number of times the rule helped: 0
Number of times the rule was broken: 2

and my code:

rule = []
broken = []
line = None
while line != '':
    line = input('Line: ')

    line.replace('cie', 'broken')
    line.replace('cei', 'rule')
    line.replace('ie', 'rule')
    line.replace('ei', 'broken')

    a = line.count('rule')
    b = line.count('broken')

    rule.append(a)
    broken.append(b)

print(sum(a)); print(sum(b))

How do I fix my code so that it works the way the question wants?

Upvotes: 1

Views: 1614

Answers (5)

Alexis

Reputation: 705

If I understand correctly, your main problem is getting a single result per word. Is this what you are trying to achieve?

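rule_count = 0
break_count = 0
line = None
while line != '':
    line = input('Line: ')

    for word in line.split():
        # Reset the flags for every word so each word is counted at most once.
        rule_found = False
        break_found = False

        # Handle the 'c' cases first and mask them out ('_' keeps the
        # remaining letters from joining into a new 'ie' or 'ei').
        if 'cie' in word:
            word = word.replace('cie', '_')
            break_found = True
        if 'cei' in word:
            word = word.replace('cei', '_')
            rule_found = True
        if 'ie' in word:
            rule_found = True
        if 'ei' in word:
            break_found = True

        # A word that breaks the rule anywhere counts as broken, once.
        if break_found:
            break_count += 1
        elif rule_found:
            rule_count += 1

print(rule_count); print(break_count)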

Upvotes: 1

flornquake

Reputation: 3396

Let's split the logic up into functions; that should help us reason about the code and get it right. To loop over the input lines, we can use the two-argument form of iter, which keeps calling input until it returns the empty string:

def rule_applies(word):
    return 'ei' in word or 'ie' in word

def complies_with_rule(word):
    if 'cie' in word:
        return False
    if word.count('ei') > word.count('cei'):
        return False
    return True

helped_count = 0
broken_count = 0
lines = iter(lambda: input("Line: "), '')
for line in lines:
    for word in line.split():
        if rule_applies(word):
            if complies_with_rule(word):
                helped_count += 1
            else:
                broken_count += 1

print("Number of times the rule helped:", helped_count)
print("Number of times the rule was broken:", broken_count)

We can make the code more concise by shortening the complies_with_rule function and by using generator expressions and Counter:

from collections import Counter

def rule_applies(word):
    return 'ei' in word or 'ie' in word

def complies_with_rule(word):
    return 'cie' not in word and word.count('ei') == word.count('cei')

lines = iter(lambda: input("Line: "), '')
words = (word for line in lines for word in line.split())
words_considered = (word for word in words if rule_applies(word))
did_rule_help_count = Counter(complies_with_rule(word) for word in words_considered)

print("Number of times the rule helped:", did_rule_help_count[True])
print("Number of times the rule was broken:", did_rule_help_count[False])
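To try the approach without typing at a prompt, here is a minimal sketch that wraps the same two helpers in a function and checks it against the example from the question. The name count_rule and the canned list of lines are my own additions for illustration, not part of the code above.

```python
def rule_applies(word):
    return 'ei' in word or 'ie' in word

def complies_with_rule(word):
    return 'cie' not in word and word.count('ei') == word.count('cei')

def count_rule(lines):
    """Return (helped, broken) for an iterable of text lines (hypothetical helper)."""
    helped = broken = 0
    for line in lines:
        for word in line.split():
            if rule_applies(word):
                if complies_with_rule(word):
                    helped += 1
                else:
                    broken += 1
    return helped, broken

# Simulate the interactive session: next(canned) plays the role of input(),
# and iter(callable, '') stops as soon as the callable returns ''.
canned = iter(['The science heist succeeded', 'challenge accepted', ''])
lines = iter(lambda: next(canned), '')
print(count_rule(lines))  # (0, 2)
```

Like the code above, this matches lowercase letters only, so a capitalised word such as "Ceiling" would need word.lower() first.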

Upvotes: 1

Samir

Reputation: 3197

tb = 0  # words that break the rule
tr = 0  # words that follow the rule
line = ' '
while line:
    lines = input('Line: ')
    line = lines.split()

    for word in line:
        # A word breaks the rule if it contains 'cie',
        # or an 'ei' that is not directly preceded by 'c'.
        if 'cie' in word or word.count('ei') > word.count('cei'):
            tb += 1
        # Otherwise it follows the rule if it contains 'ie' or 'ei' at all.
        elif 'ie' in word or 'ei' in word:
            tr += 1

print('Number of times the rule helped: {0}'.format(tr))
print('Number of times the rule was broken: {0}'.format(tb))

Done.

Upvotes: 0

Sheena

Reputation: 16242

Firstly, replace does not change the string in place (Python strings are immutable). What you need is the return value:

line = 'hello there'                     # line = 'hello there'
line.replace('there','bob')              # line = 'hello there'
line = line.replace('there','bob')       # line = 'hello bob'

Also I would assume you want actual totals so:

print('Number of times the rule helped: {0}'.format(sum(rule)))
print('Number of times the rule was broken: {0}'.format(sum(broken)))

You are calling sum on a and b, but those hold only the counts for the last line processed (and, being plain integers, they make sum raise a TypeError). You want the totals of the rule and broken lists.

As a side note: regular expressions are good for things like this. re.findall would make this a lot more sturdy and pretty:

import re

line = 'foo moo goo loo foobar cheese is great '
foo_matches = len(re.findall('foo', line))   # = 2
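Applied to the i-before-e question itself, a regex version might look roughly like the sketch below. The names BROKEN, RELEVANT, classify and count_words are made up for illustration, and it assumes lowercase matching and "broken wins" when a word both follows and breaks the rule, as the question's obeisancies example suggests.

```python
import re

# A word breaks the rule if it has 'cie', or an 'ei' that is not
# directly preceded by 'c' (negative lookbehind).
BROKEN = re.compile(r'cie|(?<!c)ei')
# The rule is only relevant to words containing 'ie' or 'ei'.
RELEVANT = re.compile(r'ie|ei')

def classify(word):
    """Return 'broken', 'helped', or None if the rule does not apply."""
    if BROKEN.search(word):
        return 'broken'
    if RELEVANT.search(word):
        return 'helped'
    return None

def count_words(lines):
    """Return (helped, broken), counting each word at most once."""
    helped = broken = 0
    for line in lines:
        for word in line.split():
            result = classify(word)
            if result == 'helped':
                helped += 1
            elif result == 'broken':
                broken += 1
    return helped, broken

print(count_words(['The science heist succeeded', 'challenge accepted']))  # (0, 2)
```

Using search rather than findall is a deliberate choice here: it only matters whether a word matches at all, which gives the "count each word once" behaviour for free.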

Upvotes: 1

Chris Seymour

Reputation: 85873

I'm not going to write the code to your exact specification, as it sounds like homework, but this should help:

import pprint

words = ['science', 'believe', 'die', 'friend', 'ceiling',
         'receipt', 'seize', 'weird', 'vein', 'foreign']

rule = {}
rule['ie'] = []
rule['ei'] = []
rule['cei'] = []
rule['cie'] = []

for word in words:
    if 'ie' in word:
        if 'cie' in word:
            rule['cie'].append(word)
        else:
            rule['ie'].append(word)
    if 'ei' in word:
        if 'cei' in word:
            rule['cei'].append(word)
        else:
            rule['ei'].append(word)

pprint.pprint(rule)

Save it to a file like i_before_e.py and run python i_before_e.py:

{'cei': ['ceiling', 'receipt'],
 'cie': ['science'],
 'ei': ['seize', 'weird', 'vein', 'foreign'],
 'ie': ['believe', 'die', 'friend']}

You can easily count the occurrences with:

for key in rule.keys():
    print("%s occurred %d times." % (key, len(rule[key])))

Output (on Python 3.7+, where dicts keep insertion order):

ie occurred 3 times.
ei occurred 4 times.
cei occurred 2 times.
cie occurred 1 times.

Upvotes: 1
