Reputation: 837
For a programming lab my assignment is to write a program that checks the spelling of words. I am doing this all on my own, so this is basically my last resort. The program should work like this: iterate through all lines of the document you want to check. If a word is not in the dictionary, print the word and the line where you found it.
I have to use a dictionary file in which all words are capitalized. The file that I'm checking for correct spelling isn't. So somewhere I have to capitalize the words, but I cannot figure out where. Every time I run this code it just prints every line in AliceInWonderLand200.txt.
import re

def split_line(line):
    return re.findall('[A-Za-z]+9(?:\'[A-Za-z]+)',line)

file = open("dictionary.txt")
dictionary = []
for line in file:
    line = line.strip()
    dictionary.append(line)
file.close()

print("----Linear search-----")
file2 = open("AliceInWonderLand200.txt")
i = 0
for line in file2:
    words = []
    words.append(split_line(line))
    for word in line:
        i+= 1
        word = word.upper()
        if word not in dictionary:
            print("Line ",i,": probably misspelled: ", word)
file.close()
I have tried to use words.append(split_line(line.upper())), but that didn't work. I have tried assigning word.upper() to word, but that didn't work either. Every time I run this code it just prints every line in AliceInWonderLand200.txt.
I have looked everywhere to find a satisfying answer. I have found the same question here on Stack Overflow, "Python Spell Checker Linear Search", but I didn't really understand the answer.
I have added the task and the output that I should have to make it easier for you guys.
--- Linear Search ---
Line 3 possible misspelled word: Lewis
Line 3 possible misspelled word: Carroll
Line 46 possible misspelled word: labelled
Line 46 possible misspelled word: MARMALADE
Line 58 possible misspelled word: centre
Line 59 possible misspelled word: learnt
Line 69 possible misspelled word: Antipathies
Line 73 possible misspelled word: curtsey
Line 73 possible misspelled word: CURTSEYING
Line 79 possible misspelled word: Dinah'll
Line 80 possible misspelled word: Dinah
Line 81 possible misspelled word: Dinah
Line 89 possible misspelled word: Dinah
Line 89 possible misspelled word: Dinah
Line 149 possible misspelled word: flavour
Line 150 possible misspelled word: toffee
Line 186 possible misspelled word: croquet
the task: http://programarcadegames.com/index.php?chapter=lab_spell_check
Upvotes: 1
Views: 339
Reputation: 114035
First of all, you're better off using a set to hold your dictionary words, for better lookup speeds. Also, it would help to lowercase all the words in your dictionary to make comparisons more uniform.
with open('dictionary.txt') as infile:
    dictionary = {line.strip().lower() for line in infile}

print("----Linear search-----")
with open('AliceInWonderLand200.txt') as infile:
    for i,line in enumerate(infile, 1):
        line = line.strip()
        words = split_line(line)  # your split_line function
        for word in words:
            if word.lower() not in dictionary:
                print("Line ", i, ": probably misspelled: ", word)
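If you want to try the idea without the two files, here is a self-contained sketch of the same approach. Note the assumptions: the regex in split_line is my guess at what the lab intended (the one in the question has a stray 9 and no trailing ?), find_misspelled is a name I made up, and the word lists are made-up stand-ins for dictionary.txt and the Alice text.

```python
import re

def split_line(line):
    # Assumed pattern: a run of letters, optionally followed by
    # an apostrophe and more letters (e.g. "Dinah'll").
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", line)

def find_misspelled(lines, dictionary):
    """Yield (line_number, word) for every word not found in the dictionary set."""
    for i, line in enumerate(lines, 1):
        for word in split_line(line):
            # Compare in lowercase, since the dictionary was lowercased too.
            if word.lower() not in dictionary:
                yield i, word

# Hypothetical in-memory data standing in for the two files.
dictionary = {w.lower() for w in ["ALICE", "WAS", "TIRED", "OF", "SITTING"]}
sample = ["Alice was tired", "of sittting"]

for i, word in find_misspelled(sample, dictionary):
    print("Line", i, "possible misspelled word:", word)
```

With real files you would pass the open file object as lines, since iterating a file yields its lines one at a time.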
Hope this helps
Upvotes: 1
Reputation: 27812
You can lowercase the words in the dictionary:
for line in file:
    line = line.strip().lower()
    dictionary.append(line)
and lowercase the word that you are checking for:
for word in line:
    i += 1
    word = word.lower()
    ...
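One more thing that may be worth checking: in the question, the loop runs over line, which is a string, so each "word" is really a single character, and words.append(split_line(line)) stores the whole list as one element. That would explain why almost everything gets flagged regardless of case. A small illustration, assuming a simplified split_line (the pattern and the sample text are my own, not taken from the lab):

```python
import re

def split_line(line):
    # Assumed pattern, not necessarily the one given in the lab.
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", line)

line = "Down the hole"

chars = [w for w in line]        # iterating a string yields single characters
words = split_line(line)         # iterating this list yields whole words

nested = []
nested.append(split_line(line))  # append adds the whole list as ONE element

print(chars[:4])  # ['D', 'o', 'w', 'n']
print(words)      # ['Down', 'the', 'hole']
print(nested)     # [['Down', 'the', 'hole']]
```

So looping with for word in split_line(line) is probably what you want, and the line counter should go up once per line rather than once per word.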
Upvotes: 0