Geni Allaine

Reputation: 45

How can I modify my functions to use list comprehension?

Specifically, I want to use one when removing the stop letters in this getwords function:

def getwords(fileName):
  file = open(fileName, 'r')
  text = file.read()
  stopletters = [".", ",", ";", ":", "'s", '"', "!", "?", "(", ")", '“', '”']
  text = text.lower()
  for letter in stopletters:
    text = text.replace(letter, "")
  words = text.split()
  return words 

And for the loop in this bigrams function:

def compute_bigrams(fileName):
  input_list = getwords(fileName)
  bigram_list = {}
  for i in range(len(input_list) - 1):
    if input_list[i] in bigram_list:
      bigram_list[input_list[i]] = bigram_list[input_list[i]] + [input_list[i + 1]]
    else:
      bigram_list[input_list[i]] = [input_list[i + 1]]
  return bigram_list

Upvotes: 1

Views: 73

Answers (2)

BrainDead

Reputation: 795

You could rewrite it in this way:

def getwords(file_name):
    with open(file_name, 'r') as file:
        text = file.read().lower()

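    # "'s" is two characters long, so the per-character test below can never
    # match it; remove it with str.replace first.
    text = text.replace("'s", "")
    stop_letters = (".", ",", ";", ":", '"', "!", "?", "(", ")", '“', '”')
    text = ''.join([letter for letter in text if letter not in stop_letters])

    words = text.split()
    return words

I used a context manager to open the file, merged some lines (there is no need for a separate line for .lower()), and used a list comprehension to go through the text, keeping each character only if it is not in stop_letters. Because that test looks at one character at a time, the two-character entry "'s" is removed separately with str.replace. After joining that list you get the same results.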
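A quick way to check the character filter is to run it on an in-memory string rather than a file. The sketch below is only an illustration: strip_stop_letters and STOP_LETTERS are hypothetical names, not part of the original code. Since membership is tested one character at a time, the two-character token "'s" is handled with str.replace here.

```python
# Hypothetical helper for illustration; the names are not from the original code.
STOP_LETTERS = (".", ",", ";", ":", '"', "!", "?", "(", ")", "“", "”")


def strip_stop_letters(text):
    # "'s" is two characters long, so a per-character membership test
    # can never match it; remove it with str.replace first.
    text = text.lower().replace("'s", "")
    # Keep only the characters that are not stop letters.
    return ''.join([letter for letter in text if letter not in STOP_LETTERS])


print(strip_stop_letters('Anna\'s dog said: "Woof!"').split())
# ['anna', 'dog', 'said', 'woof']
```

Working on a string keeps the filtering logic separate from the file handling, which makes it easier to test.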

Note that you can use a generator expression as well, which avoids writing out the list explicitly (str.join still collects the items internally, so this is mainly a matter of style):

text = ''.join(letter for letter in text if letter not in stop_letters)

And if you really want to save that one line you could just do:

return text.split()

Upvotes: 2

Lukas Thaler

Reputation: 2720

You can do the first replacement without a for loop at all by incorporating a little bit of regex:

import re

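# No "*" quantifiers: each alternative matches exactly one stop token,
# so ordinary apostrophes (as in "don't") are left alone.
pattern = re.compile(r'''[.,;:"!?()“”]|'s''')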
pattern.sub('', 'this is a test string (it proves that the replacements work!).')


>>> 'this is a test string it proves that the replacements work'
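Plugged into getwords, the regex approach might look like the sketch below. It is an illustration under a few assumptions rather than a definitive version: the pattern matches exactly one stop token at a time (no quantifiers), so ordinary apostrophes as in "don't" are kept, as in the original loop; STOP_PATTERN is a hypothetical name; and the file is assumed to be UTF-8 encoded.

```python
import re

# Each alternative matches exactly one stop token, so the pattern
# never matches the empty string. STOP_PATTERN is a hypothetical name.
STOP_PATTERN = re.compile(r'''[.,;:"!?()“”]|'s''')


def getwords(file_name):
    # Assumption: the file is UTF-8 encoded (the curly quotes need a
    # Unicode-capable encoding).
    with open(file_name, 'r', encoding='utf-8') as file:
        text = file.read().lower()
    # Remove every stop token in one pass, then split on whitespace.
    return STOP_PATTERN.sub('', text).split()
```

Compiling the pattern once at module level means repeated calls to getwords reuse the same compiled object.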

Though it is theoretically possible to turn your second loop into a comprehension, I strongly recommend you don't. People, including yourself in a few months' time, will have severe problems understanding what it does. As @Alexander Cécile noted in the comments, you can refactor the second loop utilizing for input in input_list, adding to the performance and readability of your code.
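One possible refactoring of the second loop, offered as a sketch and not necessarily what the comment had in mind, pairs each word with its successor using zip and collects the successors with dict.setdefault. As an assumption for the sake of a runnable example, the function takes the word list directly instead of a file name. The dict comprehension version is included only for comparison and is named hypothetically.

```python
def compute_bigrams(words):
    # Assumption: takes a list of words (e.g. the result of getwords)
    # instead of a file name, so the sketch runs without a file on disk.
    bigrams = {}
    # zip pairs every word with its successor: (w0, w1), (w1, w2), ...
    for word, next_word in zip(words, words[1:]):
        # setdefault creates the empty list the first time a word is seen.
        bigrams.setdefault(word, []).append(next_word)
    return bigrams


def compute_bigrams_comprehension(words):
    # Same mapping as a single dict comprehension, for comparison only:
    # it rescans all pairs for every word and is harder to read.
    return {
        word: [nxt for cur, nxt in zip(words, words[1:]) if cur == word]
        for word in words[:-1]
    }


words = "the cat saw the dog".split()
print(compute_bigrams(words))
# {'the': ['cat', 'dog'], 'cat': ['saw'], 'saw': ['the']}
```

Both functions return the same dictionary, but the plain loop does a single pass over the list, which is why it is the version to prefer.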

Upvotes: 2
