BRose

Reputation: 3

Python altering list item in iteration

I am trying to get this Python code to strip punctuation marks from the ends of words and then count the unique words. For some reason it still counts "hello." and "hello" as separate words. Any help would be most appreciated.

def word_distribution(words):
    word_dict = {}
    words = words.lower()
    words = words.split()
    for word in words:
        if ord('a') <= ord(word[-1]) <= ord('z'):
            pass
        elif ord('A') <= ord(word[-1]) <= ord('Z'):
            pass
        else:
            word[:-1]
    word_dict = {word: words.count(word)+1 for word in set(words)}
    return(word_dict)

Upvotes: 0

Views: 86

Answers (3)

tgikal

Reputation: 1680

I don't know why you're adding 1 to the count; that inflates every total by one. Without it:

def word_distribution(words):
    word_dict = {}
    words = words.lower().split()
    for word in words:
        if ord('a') <= ord(word[-1]) <= ord('z'):
            pass
        elif ord('A') <= ord(word[-1]) <= ord('Z'):
            pass
    word_dict = {word: words.count(word) for word in set(words)}
    return(word_dict)

{'hello': 2, 'my': 1, 'name': 1, 'is': 1}

Edit:

As brianpck points out, the loop never changes anything, so it can be dropped:

def word_distribution(words):
    word_dict = {}
    words = words.lower().split()
    word_dict = {word: words.count(word) for word in set(words)}
    return(word_dict)

This gives the same result.
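Note that neither version above removes punctuation, so an input ending in "hello." would still count "hello." separately from "hello". Here is a minimal sketch of one way to handle that, assuming it is enough to strip trailing punctuation from each word. The use of str.rstrip with string.punctuation is my own choice here, not something from the question:

```python
import string

def word_distribution(words):
    # Lowercase, split on whitespace, then strip trailing punctuation from each token.
    cleaned = [w.rstrip(string.punctuation) for w in words.lower().split()]
    # Drop tokens that consisted only of punctuation (they become empty strings).
    cleaned = [w for w in cleaned if w]
    return {w: cleaned.count(w) for w in set(cleaned)}

result = word_distribution("Hello my name is hello.")
print(result["hello"])  # 2
```

Because the dict is built from a set, the order of the keys is not guaranteed, only the counts.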

Upvotes: 1

saurabh baid

Reputation: 1877

There are certainly better ways of achieving what you are trying to do, but this answer fixes your code.

Strings are immutable and lists are mutable. Nowhere in your code were you modifying the list, and word[:-1] has no effect on its own: it builds a new string that is never assigned to anything, so the original word stays exactly as it was.

def word_distribution(words):
    words = words.lower().split()
    # enumerate gives the position directly, so no words.index() lookup is needed.
    for index, word in enumerate(words):
        # After lower(), only the range 'a'..'z' has to be checked.
        if not ('a' <= word[-1] <= 'z'):
            # Strings are immutable: store the sliced copy back into the list.
            words[index] = word[:-1]

    word_dict = {word: words.count(word) for word in set(words)}
    return word_dict
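The difference between slicing, rebinding the loop variable, and writing back into the list can be seen in isolation. This is only an illustrative sketch; the sample list and the use of str.isalpha are my own assumptions, not part of the original code:

```python
words = ["hello.", "world"]

for word in words:
    word[:-1]          # builds a new string and immediately discards it
    word = word[:-1]   # rebinds the local name only; the list is untouched

unchanged = list(words)

# Assigning through the index is what actually changes the list.
for i, word in enumerate(words):
    if not word[-1].isalpha():
        words[i] = word[:-1]

print(unchanged)  # ['hello.', 'world']
print(words)      # ['hello', 'world']
```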

Upvotes: 1

coder

Reputation: 12972

You are making it too complicated. As Sohier Dane mentioned in the comments, you can use the approach from the other post to remove punctuation and simplify the script to:

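import string

def word_distribution(words):
    # Python 3: str.maketrans builds a table that deletes every punctuation character.
    table = str.maketrans('', '', string.punctuation)
    words = words.translate(table).lower()
    d = {}
    for w in words.split():
        # Start at 0 for unseen words, then add one for this occurrence.
        d[w] = d.get(w, 0) + 1
    return d

Results:

>>> x = 'Hello My Name Is hello.'
>>> print(word_distribution(x))
{'hello': 2, 'my': 1, 'name': 1, 'is': 1}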
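If the standard library is acceptable, the tally itself can also be left to collections.Counter. The following is a rough sketch rather than a drop-in replacement: it assumes punctuation should only be stripped from the ends of each word (so the apostrophe in "don't" survives), which differs from deleting every punctuation character as above.

```python
import string
from collections import Counter

def word_distribution(text):
    # Strip punctuation from both ends of each token, keeping inner characters
    # such as the apostrophe in "don't".
    tokens = (w.strip(string.punctuation) for w in text.lower().split())
    # Counter tallies in a single pass; skip tokens that were punctuation only.
    return Counter(w for w in tokens if w)

counts = word_distribution("Hello My Name Is hello. Don't stop!")
print(counts["hello"])   # 2
print(counts["don't"])   # 1
```

Counter is a dict subclass, so the result can be used anywhere the plain dict was, and missing words simply report a count of 0.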

Upvotes: 1
