Reputation: 93
So , I am having two files , so to checks its validity I am performing try
and except
two times . But I don't thinks this is a good method, can you suggest a better way?
Here is my code:
def form_density_dictionary(self,word_file,fp_exclude):
self.freq_dictionary={}
try:
with open(fp_exclude,'r')as fp2:
words_excluded=fp2.read().split() #words to be excluded stored in a list
print("**Read file successfully :" + fp_exclude + "**")
words_excluded=[words.lower() for words in words_excluded] # converted to lowercase
except IOError:
print("**Could not read file:", fp_exclude, " :Please check file name**")
sys.exit()
try:
with open(word_file,'r') as file:
print("**Read file successfully :" + word_file + "**")
words_list=file.read()
if not words_list:
print("**No data in file:",word_file +":**")
sys.exit()
words_list=words_list.split()
words_list=[words.lower() for words in words_list] # lowercasing entire list
unique_words=list((set(words_list)-set(words_excluded)))
self.freq_dictionary= {word:("%6.2f"%(float((words_list.count(word))/len(words_list))*100)) for word in unique_words}
#print((len(self.freq_dictionary)))
except IOError:
print("**Could not read file:", word_file, " :Please check file name**")
sys.exit()
Any other suggestion is also welcomed to make it more pythonic.
Upvotes: 0
Views: 356
Reputation: 2274
The first thing that jumps out is the lack of consistency and readability: in some lines you indent with 4 spaces, on others you only use two; in some places you put a space after a comma, in others you don't, in most places you don't have spaces around the assignment operator (=
)...
Be consistent and make your code readable. The most commonly used formatting is to use four spaces for indenting and to always have a space after a comma but even more important than that is to be consistent, meaning that whatever you choose, stick with it throughout your code. It makes it much easier to read for everyone, including yourself.
Here are a few other things I think you could improve:
Have a single exception handling block instead of two.
You can also open both files in a single line.
Even better, combine both previous suggestions and have a separate method to read data from the files, thus eliminating code repetition and making the main method easier to read.
For string formatting it's preferred to use .format()
instead of %
. Check this out: https://pyformat.info/
Overall try to avoid repetition in your code. If there's something you're doing more than once, extract it to a separate function or method and use that instead.
Here's your code quickly modified to how I'd probably write it, and taking these things into account:
import sys
class AtifImam:
def __init__(self):
self.freq_dictionary = {}
def form_density_dictionary(self, word_file, exclude_file):
words_excluded = self.read_words_list(exclude_file)
words_excluded = self.lowercase(words_excluded)
words_list = self.read_words_list(word_file)
if len(words_list) == 0:
print("** No data in file: {} **".format(word_file))
sys.exit()
words_list = self.lowercase(words_list)
unique_words = list((set(words_list) - set(words_excluded)))
self.freq_dictionary = {
word: ("{:6.2f}".format(
float((words_list.count(word)) / len(words_list)) * 100))
for word in unique_words
}
@staticmethod
def read_words_list(file_name):
try:
with open(file_name, 'r') as file:
data = file.read()
print("** Read file successfully: {} **".format(file_name))
return data.split()
except IOError as e:
print("** Could not read file: {0.filename} **".format(e))
sys.exit()
@staticmethod
def lowercase(word_list):
return [word.lower() for word in word_list]
Upvotes: 1
Reputation: 1410
To be more 'pythonic' you could use something what is callec Counter, from collections library.
from collections import Counter
def form_density_dictionary(self, word_file, fp_exclude):
success_msg = '*Read file succesfully : {filename}'
fail_msg = '**Could not read file: {filename}: Please check filename'
empty_file_msg = '*No data in file :{filename}:**'
exclude_read = self._file_open(fp_exclude, success_msg, fail_msg, '')
exclude = Counter([word.lower() for word in exclude_read.split()])
word_file_read = self._file_open(word_file, success_msg, fail_msg, empty_file_msg)
words = Counter([word.lower() for word in word_file_read.split()])
unique_words = words - excluded
self.freq_dictionary = {word: '{.2f}'.format(count / len(unique_words))
for word, count in unique_words.items()}
Also it would be better if you would just create the open_file method, like:
def _open_file(self, filename, success_msg, fails_msg, empty_file_msg):
try:
with open(filename, 'r') as file:
if success_msg:
print(success_msg.format(filename= filename))
data = file.read()
if empty_file_msg:
print(empty_file_msg.format(filename= filename))
return data
except IOError:
if fail_msg:
print(fail_msg.format(filename= filename))
sys.exit()
Upvotes: 1
Reputation: 160377
Exceptions thrown that involve a file system path have a filename
attribute that can be used instead of explicit attributes word_file
and fp_exclude
as you do.
This means you can wrap these IO operations in the same try-except
and use the exception_instance.filename
which will indicate in which file the operation couldn't be performed.
For example:
try:
with open('unknown_file1.py') as f1, open('known_file.py') as f2:
f1.read()
f2.read()
except IOError as e:
print("No such file: {0.filename}".format(e))
Eventually prints out:
No such file: unknown_file1.py
While the opposite:
try:
with open('known_file.py') as f1, open('unknown_file2.py') as f2:
f1.read()
f2.read()
except IOError as e:
print("No such file: {0.filename}".format(e))
Prints out:
No such file: unknown_file2.py
Upvotes: 1