Reputation: 13
I just wrote a function that prints the percentage of each character in a text file. However, I have a problem: my program counts uppercase characters as different characters and also counts spaces, so the result is wrong. How can I fix this?
def count_char(text, char):
    count = 0
    for character in text:
        if character == char:
            count += 1
    return count

filename = input("Enter the file name: ")
with open(filename) as file:
    text = file.read()

for char in "abcdefghijklmnopqrstuvwxyz":
    perc = 100 * count_char(text, char) / len(text)
    print("{0} - {1}%".format(char, round(perc, 2)))
Upvotes: 1
Views: 1619
Reputation: 104082
You can use a counter and a generator expression to count all letters like so:
from collections import Counter

with open(fn) as f:
    c = Counter(c.lower() for line in f for c in line if c.isalpha())
Explanation of the generator expression:
Counter(...)      creates a counter
c.lower()         takes each character and makes it lower case
for line in f     reads one line at a time from the file
for c in line     iterates over the line one character at a time
if c.isalpha()    only adds the character if it is a letter
Then get the total letter counts:
total_letters=float(sum(c.values()))
Then the percentage of any letter is c[letter] / total_letters * 100.
Note that the Counter c only has letters, not spaces, so the calculated percentage for each letter is its share of all letters.
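As a quick check of the arithmetic, here is a minimal sketch that applies the same Counter idea to an in-memory string instead of a file. The helper name letter_percentages and the sample text are made up for illustration and are not part of the answer's code.

```python
from collections import Counter


def letter_percentages(text):
    """Return {letter: percent of all letters} for the letters found in text."""
    # Hypothetical helper: lower-case every character and keep letters only.
    counts = Counter(ch.lower() for ch in text if ch.isalpha())
    total_letters = sum(counts.values())
    if total_letters == 0:
        return {}  # no letters at all, so avoid dividing by zero
    return {letter: 100 * n / total_letters for letter, n in counts.items()}


# "Aa b!" contains three letters: two a's (case-insensitive) and one b.
percentages = letter_percentages("Aa b!")
print(percentages)
```

The space and the exclamation mark never enter the counter, so 'a' comes out at two thirds of the letters and 'b' at one third.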
The advantage here: a Counter returns 0 for letters not in the file, so every letter can be looked up safely. So your entire program becomes:
from collections import Counter

fn = input("Enter the file name: ")
with open(fn) as f:
    c = Counter(c.lower() for line in f for c in line if c.isalpha())

total_letters = float(sum(c.values()))
for char in "abcdefghijklmnopqrstuvwxyz":
    print("{} - {:.2%}".format(char, c[char] / total_letters))
Upvotes: 1
Reputation: 6518
You should try making the text lower case using text.lower() and then, to avoid counting spaces, split the string into a list of words using text.lower().split(). This should do:
def count_char(text, char):
    count = 0
    for word in text.lower().split():  # this iterates returning every word in the text
        for character in word:  # this iterates returning every character in each word
            if character == char:
                count += 1
    return count

filename = input("Enter the file name: ")
with open(filename) as file:
    text = file.read()

totalChars = sum([len(i) for i in text.lower().split()])
for char in "abcdefghijklmnopqrstuvwxyz":
    perc = 100 * count_char(text, char) / totalChars
    print("{0} - {1}%".format(char, round(perc, 2)))
Notice the change in the definition of perc: sum([len(i) for i in text.lower().split()]) returns the number of characters in the list of words, whereas len(text) also counts spaces.
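To make the difference between the two totals concrete, here is a small sketch on a made-up sample string. The variable names are illustrative only.

```python
text = "Hello World\n"

# len() counts every character, including the space and the trailing newline.
total_with_whitespace = len(text)

# split() with no arguments splits on any run of whitespace, so only the
# characters inside the words are counted.
total_without_whitespace = sum(len(word) for word in text.lower().split())

print(total_with_whitespace, total_without_whitespace)  # prints: 12 10
```

Punctuation and digits are not whitespace, so they would still be included in the second total.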
Upvotes: 1
Reputation: 14321
You can use the built-in .count method to count the characters after converting everything to lowercase via .lower. Additionally, your current program doesn't work properly because len(text) does not exclude spaces and punctuation.
import string

filename = input("Enter the file name: ")
with open(filename) as file:
    text = file.read().lower()

chars = {char: text.count(char) for char in string.ascii_lowercase}
allLetters = float(sum(chars.values()))
for char in chars:
    print("{} - {}%".format(char, round(chars[char] / allLetters * 100, 2)))
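The same approach can be tried on a short in-memory string to see that punctuation stays out of the denominator. This is only a sketch with a made-up sample text and illustrative variable names.

```python
import string

text = "Hello, World!".lower()

# Count each ASCII letter; characters such as ',' and '!' are never looked up,
# so they do not contribute to the total.
counts = {char: text.count(char) for char in string.ascii_lowercase}
all_letters = sum(counts.values())

# 'l' appears 3 times among the 10 letters of "hello, world!".
percent_l = round(counts["l"] / all_letters * 100, 2)
print(all_letters, percent_l)  # prints: 10 30.0
```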
Upvotes: 0
Reputation: 1611
You want to make the text lower case before counting the char:
def count_char(text, char):
    count = 0
    for character in text.lower():
        if character == char:
            count += 1
    return count
Upvotes: 0