Jia Long Liu

Reputation: 29

Average word length - punctuation (",", ".", ";") and newline characters {"\n"}

I want to write a Python script that reads a text from the console and outputs the average number of characters per word, but I have a problem with punctuation and newline characters. Here is my code:

def main():
    allwords = []
    while True:
         words = input()
         if words == "Amen.":
           break
         allwords.extend(words.split())
    txt = " ".join( allwords)
    n_pismen = len([c for c in txt if c.isalpha()])
    n_slov = len([i for i in range(len(txt) - 1) if txt[i].isalpha() and not txt[i + 1].isalpha()])
    for char in (',', '.', ';'):
        txt = txt.replace(char, '')
    txt.replace('\n', ' ')
    words = txt.split()
    print(sum(len(word) for word in words) / len(words))
    if words:
        average = sum(len(words) for words in  allwords) / len( allwords) 

if __name__ == '__main__':
    main() 

Our Father, which art in heaven,
hallowed be thy name;
thy kingdom come;
thy will be done,
in earth as it is in heaven.
Give us this day our daily bread.
And forgive us our trespasses,
as we forgive them that trespass against us.
And lead us not into temptation,
but deliver us from evil.
For thine is the kingdom,
the power, and the glory,
For ever and ever.
Amen.

The expected output is 4.00, but I just get 1.00.

Upvotes: 1

Views: 133

Answers (4)

match

Reputation: 11060

You can do this as follows (where strng is the passage of text):

# Remove all of the 'bad' characters
for char in (',', '.', ';'):
    strng = strng.replace(char, '')
strng = strng.replace('\n', ' ')  # str.replace returns a new string, so reassign it
# Split on spaces
words = strng.split()
# Calculate the average length
print(sum(len(word) for word in words) / len(words))

Upvotes: 1

defladamouse

Reputation: 625

Not sure what's wrong in your example, but this will work. I used "test" as the string name, you can modify that as desired:

counts = [] #List to store number of characters per word
for t in test.split(): #Split to substrings at whitespace
    counts.append(len([c for c in t if c.isalpha()])) #Calculate the length of each word ignoring non-letters
print(sum(counts)/len(counts)) #Compute the average
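One edge case in this approach: a token with no letters at all (a standalone "-" or a number, for example) is still counted as a word of length 0, which pulls the average down. A minimal sketch of one way to handle that, assuming such tokens should simply be skipped; the function name `average_letters_per_word` is made up for illustration:

```python
def average_letters_per_word(text):
    """Average number of letters per whitespace-separated token.

    Assumption: tokens that contain no letters are not words and are skipped.
    """
    # Count only alphabetic characters in each token
    counts = [sum(c.isalpha() for c in token) for token in text.split()]
    # Drop tokens that had no letters at all
    counts = [n for n in counts if n > 0]
    # Avoid dividing by zero on empty input
    return sum(counts) / len(counts) if counts else 0.0
```

Returning 0.0 for empty input is also a choice made here; raising an error would be equally reasonable.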

Upvotes: 2

Ratery

Reputation: 2917

Try this:

import string

s = input()
s = s.translate(str.maketrans('', '', string.punctuation))  # strip all punctuation
s = s.replace('\n', ' ')  # str.replace returns a new string, so reassign it
words = s.split()
print(sum(map(len, words)) / len(words))
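Note that `input()` returns only one line, so for the multi-line text in the question the lines have to be collected first. A minimal sketch, assuming the text ends with the line "Amen." and that this line is not counted, as in the question's own loop; the names `read_until` and `average_word_length` are illustrative, not from any library:

```python
import string


def read_until(sentinel="Amen.", read_line=input):
    """Collect lines until the sentinel line is seen; the sentinel is dropped."""
    lines = []
    while True:
        line = read_line()
        if line.strip() == sentinel:
            break
        lines.append(line)
    return "\n".join(lines)


def average_word_length(text):
    """Average length of whitespace-separated words after removing punctuation."""
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    words = cleaned.split()  # split() treats newlines as whitespace too
    return sum(map(len, words)) / len(words) if words else 0.0
```

Calling `print(f"{average_word_length(read_until()):.2f}")` would then read from the console and print the result with two decimals.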

Upvotes: 1

nikeros

Reputation: 3379

I would match every word using a regex, then keep track of the number of words and the total number of characters:

import re

total_number = 0
n_words = 0

pattern = re.compile("[a-z]+", re.IGNORECASE)

with open({PATH_TO_YOUR_FILE}, "r") as f:
    for line in f:
        words = pattern.findall(line)
        n_words += len(words)
        total_number += sum([len(x) for x in words])

print(total_number/n_words)

OUTPUT

4.0    
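If the text is already in a string rather than a file, the same regex idea can be applied directly. This is only a sketch: the helper name `average_word_length_regex` is hypothetical, and `[a-z]+` matches ASCII letters only, so a word like "don't" is counted as two words ("don" and "t").

```python
import re

# Runs of ASCII letters, case-insensitive
WORD = re.compile(r"[a-z]+", re.IGNORECASE)


def average_word_length_regex(text):
    """Average length of the letter runs found in text (0.0 if there are none)."""
    words = WORD.findall(text)
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


sample = "Our Father, which art in heaven,\nhallowed be thy name;"
print(average_word_length_regex(sample))  # 42 letters / 10 words -> 4.2
```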

Upvotes: 1
