Reputation: 673
I've seen a few people ask how to count how many times a word appears in a text file, but their questions were 'too broad', so I decided to find out how to do it myself. I've posted how below.
Upvotes: 0
Views: 5738
Reputation: 11
def words_frequency_counter(filename):
    """Print how many times a word appears in the text."""
    try:
        with open(filename) as file_object:
            contents = file_object.read()
    except FileNotFoundError:
        # Silently skip files that do not exist.
        pass
    else:
        word = input("Give me a word: ")
        # str.count matches substrings, so 'the' is also counted inside 'there'.
        print("'" + word + "'" + ' appears ' +
              str(contents.lower().count(word.lower())) + ' times.\n')
Upvotes: 1
Reputation: 41905
Splitting on whitespace isn't sufficient -- split on everything you're not counting and get your case under control:
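import re
import sys

word = sys.argv[2]
# Use `with` so the file is closed once it has been read.
with open(sys.argv[1]) as file:
    text = file.read().casefold()
# Split on every run of non-letters, then count exact matches.
print(re.split(r"[^a-z]+", text).count(word.casefold()))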
You can add apostrophes to the inverted pattern, [^a-z'], or whatever else you want to include in your count.
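To make the effect of the apostrophe concrete, here is a minimal sketch of the same split-and-count idea wrapped in a function. The name count_word, the keep parameter, and the sample sentence are my own illustration, not part of the answer above.

```python
import re

def count_word(text, word, keep="a-z'"):
    """Count exact, case-insensitive matches of `word` in `text`.

    `keep` (a hypothetical parameter) lists the characters that may
    appear inside a word; everything else acts as a separator.
    """
    # Split on every run of characters that is NOT in `keep`.
    tokens = re.split(rf"[^{keep}]+", text.casefold())
    return tokens.count(word.casefold())

sample = "Don't stop. I said DON'T! Don is not don't."
print(count_word(sample, "don't"))            # 3: apostrophes stay inside words
print(count_word(sample, "don", keep="a-z"))  # 4: apostrophes now split "don't"
```

Note that with the plain [a-z] pattern every "don't" becomes "don" plus "t", which is why the second count is higher.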
Hogan: Colonel, you're asking and answering your own questions. That's tops in German efficiency.
Upvotes: 1
Reputation: 77407
Word counts can be tricky. At a minimum, one would like to ignore differences in capitalization and punctuation. A simple next step is to extract the words with a regular expression and convert them to lower case before doing the count. We could even use collections.Counter and count all of the words.
import re

# `word_finder(somestring)` returns all words in the string as a list
word_finder = re.compile(r'\w+').findall

filename = input('filename: ')
word = input('word: ')
# remove case for compare
lword = word.lower()
# `word_finder` emits all of the words excluding punctuation
# `filter` keeps only the words that match, ignoring case
# `len` counts the result
with open(filename) as f:
    count = len(list(filter(lambda w: w.lower() == lword,
                            word_finder(f.read()))))
print(count)

# we could go crazy and count all of the words in the file
# and do it line by line to reduce memory footprint.
import collections
import itertools
from pprint import pprint

with open(filename) as f:
    word_counts = collections.Counter(itertools.chain.from_iterable(
        word_finder(line.lower()) for line in f))
# `pprint` prints by itself and returns None, so it is not wrapped in print()
pprint(word_counts)
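Once you have a Counter, looking up a single word is just an index operation. Here is a rough sketch of that, using a small in-memory list of lines instead of a real file so it runs as-is; the count_words helper and the sample lines are made up for illustration.

```python
import collections
import re

def count_words(lines):
    """Return a Counter of lower-cased words from an iterable of lines."""
    counts = collections.Counter()
    for line in lines:
        # Update the counter one line at a time to keep memory use low.
        counts.update(re.findall(r"\w+", line.lower()))
    return counts

lines = ["The cat sat.", "the CAT ran!", "A dog sat."]
counts = count_words(lines)
print(counts["the"])           # 2
print(counts["bird"])          # 0: missing words count as zero, no KeyError
print(counts.most_common(2))   # the two most frequent words with their counts
```

An open file object can be passed as lines as well, since iterating over it yields one line at a time.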
Upvotes: 1
Reputation: 65
First, you want to open the file. Do this with:
your_file = open('file.txt', 'r')
Next, you want to count the word. Let's store the word 'brian' in a variable called life. No reason.
life = 'brian'
print(your_file.read().split().count(life))
That reads the file, splits it on whitespace into individual words, and counts the instances of the word 'brian'. Hope this helps!
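One thing worth knowing about this approach: split() only breaks on whitespace, and count() only matches exact strings, so capitalization and attached punctuation both matter. A quick sketch on a made-up sentence (no file needed) shows what that means:

```python
text = "Brian saw brian. Then brian, again: brian"

# split() with no arguments breaks on runs of whitespace only,
# so punctuation stays attached to the neighbouring word.
words = text.split()
print(words)

# count() needs an exact match: 'Brian', 'brian.' and 'brian,' do not count.
print(words.count("brian"))  # 1
```

If your text is plain lower-case words separated by spaces this is fine; otherwise you may want to lower-case the text and strip punctuation first.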
Upvotes: 0
Reputation: 673
So to do this, first you must open the file (assuming you have a text file called 'text.txt'). We do this by calling the open function.
file = open('text.txt', 'r')
The open function uses the syntax open(file, mode), where file is the path to the text document and mode is how it's opened ('r' means read-only). The read method returns the file's contents as a string, then split separates that string into a list of words. Lastly, we use the count method to find how many times the word appears in that list.
word = input('word: ')
print(file.read().split().count(word))
And there you have it, counting words in a text file!
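If you want to reuse this, the same three steps (open, read and split, count) can be put in a function. This is only a sketch: count_in_file is a name I picked, and the demo writes its own temporary file so it runs without 'text.txt' existing.

```python
import os
import tempfile

def count_in_file(path, word):
    """Count exact, whitespace-separated occurrences of `word` in a file."""
    # `with` closes the file automatically, even if reading fails.
    with open(path, "r") as f:
        return f.read().split().count(word)

# Demo on a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("spam eggs spam\nham spam")
    demo_path = tmp.name

print(count_in_file(demo_path, "spam"))  # 3
os.remove(demo_path)
```

Using with instead of a bare open call means you never have to remember to call close yourself.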
Upvotes: 1