Reputation: 11
The code is meant to take a file as input, convert all the letters to lowercase, and remove any non-alphabetic characters. Then it should print how many times each word occurs in the file.
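#!/usr/bin/python
import sys


def main(argv):
    # Use the path in the error message: tf is unbound if open() fails.
    try:
        tf = open(argv[1], "r")
    except IOError:
        print("The file " + argv[1] + " was not found")
        sys.exit(1)
    data = tf.read()
    tf.close()
    # String methods return new strings, so assign the results back.
    data = data.lower()
    data = data.replace("-", " ")
    validLetters = " abcdefghijklmnopqrstuvwxyz"
    # Keep whitespace so that words on adjacent lines are not joined together.
    cleanData = ''.join([i for i in data if i in validLetters or i.isspace()])
    frequency = {}
    words = cleanData.split()
    for x in words:
        # Start from 0 the first time a word is seen.
        frequency[x] = frequency.get(x, 0) + 1
    # Print each word with its count, in alphabetical order.
    for word in sorted(frequency):
        print(word + " " + str(frequency[word]))


if __name__ == "__main__":
    main(sys.argv)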
This is what I get on the command line:
$ python -m py_compile q1_word_count.py drake.txt
File "drake.txt", line 1
I Was A Teenage Hacker
^
SyntaxError: invalid syntax
"I Was A Teenage Hacker" is the first line of the text file.
Upvotes: 1
Views: 633
Reputation: 140168
The error isn't coming from your script; it comes from the way you're running it.
With -m py_compile you're running the py_compile module instead of your script. From its documentation:
"The py_compile module provides a function to generate a byte-code file from a source file"
When run this way, the module treats every command-line argument as a Python source file to compile, including your text file, so it reports a syntax error in drake.txt.
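To see this behavior in isolation, here is a minimal Python 3 sketch that byte-compiles two temporary files with py_compile.compile: one holding the first line of the text file, one holding a trivial script. The helper name try_compile and the sample script contents are made up for illustration; only the file names come from the question.

```python
import os
import py_compile
import tempfile


def try_compile(source_text, filename):
    """Write source_text to a temporary file and try to byte-compile it.

    Returns None on success, or the compiler's error message on failure.
    (try_compile is a hypothetical helper, not part of py_compile.)
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, filename)
        with open(path, "w") as f:
            f.write(source_text)
        try:
            # doraise=True raises PyCompileError instead of printing the error.
            py_compile.compile(path, cfile=os.path.join(tmp, "out.pyc"), doraise=True)
        except py_compile.PyCompileError as exc:
            return exc.msg
    return None


# Plain English text is not valid Python, so compilation fails.
text_error = try_compile("I Was A Teenage Hacker\n", "drake.txt")
# A real (if trivial) script compiles without complaint.
script_error = try_compile("print('hello')\n", "q1_word_count.py")

print(text_error)
print(script_error)
```

The first call returns a message mentioning SyntaxError, while the second returns None, which matches what you saw: py_compile does not care about the file extension, it just tries to compile whatever it is given.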
Just run it like this:
python q1_word_count.py drake.txt
(You can compile your module first with python -m py_compile q1_word_count.py, in which case you can run the generated .pyc bytecode file. However, every change to your .py file would then require a recompilation, all for a very small speed gain at startup and none during execution: this is bytecode compilation, not dynamic compilation. If you want dynamic compilation, use PyPy.)
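As a rough illustration of the compile-then-run workflow, the Python 3 sketch below compiles a throwaway script to a .pyc file, deletes the source, and runs the bytecode file directly with the same interpreter. The file name hello.py and its contents are made up for the example.

```python
import os
import py_compile
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.py")    # hypothetical script name
    pyc = os.path.join(tmp, "hello.pyc")
    with open(src, "w") as f:
        f.write("print('hello from bytecode')\n")

    # Byte-compile the source to an explicit .pyc path.
    py_compile.compile(src, cfile=pyc, doraise=True)

    # Remove the source to show that the .pyc runs on its own.
    os.remove(src)
    output = subprocess.check_output([sys.executable, pyc]).decode().strip()

print(output)
```

Note that a .pyc file is tied to the interpreter version that produced it, which is one more reason running the .py file directly is usually the simpler choice.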
Upvotes: 1