gargi jha

Reputation: 11

Could someone tell me why this code is not working as expected?

def ethos(file):
    """Open a local file, convert its content into tokens.Match tokens with list provided, return matching list."""
    f = open(file)
    raw = f.read()
    tokens = nltk.word_tokenize(raw)
    list = [ 'perfect' ,'companion' , 'good' , 'brilliant', 'good']
    for tokens in list:
        return tokens

I wrote this code with the idea that it should return all the tokens in the text that match the list defined, but it returns only one token, and that is the first one in the list. I also tried adding an empty list and appending the matching words to it, but that doesn't seem to work either. If anybody has any ideas, please let me know.

Upvotes: 1

Views: 57

Answers (2)

Henry Keiter

Reputation: 17168

There are a few issues here, but the main point is that a function will only execute the first return it comes across. So you loop through each item in the list, and return the first one--at which point the function stops executing, because it returned.
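To make the early exit concrete, here is a minimal sketch. The function names `first_only` and `collect_all` are made up for illustration and are not part of the original code; the point is only to contrast a `return` inside the loop with a `return` after it.

```python
def first_only(items):
    """Return inside the loop: the function exits on the first iteration."""
    for item in items:
        return item  # the loop never reaches the second item


def collect_all(items):
    """Return after the loop: every item is visited before the function exits."""
    collected = []
    for item in items:
        collected.append(item)
    return collected


words = ['perfect', 'companion', 'good', 'brilliant']
print(first_only(words))   # only the first element
print(collect_all(words))  # all four elements
```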

I think what you want is to check each word in the text to see whether it's in your list, and then return all the matching words. To do that, you need to actually perform a comparison somewhere, which you're not doing at the moment. You might rewrite your loop to look something like this:

# Don't use "list" as a variable name! Also, there's no need for two "good" entries.
words_to_match = ['perfect' ,'companion' , 'good' , 'brilliant']

matching_tokens = []
for token in tokens:
    if token in words_to_match:
        matching_tokens.append(token) # add the matching token to a list
return matching_tokens # finally, return all the tokens that matched

Once you understand what it is that you're doing with the explicit loop above, note that you can rewrite the whole thing as a simple list comprehension:

words_to_match = {'perfect' ,'companion' , 'good' , 'brilliant'} # using a set instead of a list will make the matching faster
return [t for t in tokens if t in words_to_match]
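Put together as a complete function, the idea might look like the sketch below. To keep it runnable without NLTK, it assumes the text has already been tokenized and uses `str.split` as a stand-in for `nltk.word_tokenize`; the name `matching_tokens` and the sample sentence are hypothetical.

```python
def matching_tokens(tokens, words_to_match):
    """Return every token that appears in words_to_match, in text order."""
    wanted = set(words_to_match)  # set membership tests are O(1) on average
    return [t for t in tokens if t in wanted]


# str.split is only a stand-in for nltk.word_tokenize in this sketch
sample_tokens = "a good book is a perfect companion".split()
print(matching_tokens(sample_tokens, ['perfect', 'companion', 'good', 'brilliant']))
# prints ['good', 'perfect', 'companion']
```

Note that a word occurring several times in the text appears several times in the result, since the comprehension walks the tokens rather than the word list.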

Upvotes: 1

Cory Kramer

Reputation: 117876

I think you meant to do

return [i for i in tokens if i in list]

The way you wrote it, it will iterate through each word in list. But the first thing it does in the loop is return. So all it will do is return the word 'perfect' every time regardless of what comes back in tokens. So the modified code (assuming everything else functions correctly) would be

import nltk

def ethos(file):
    """Open a local file, convert its content into tokens. Match tokens with list
    provided, return matching list."""
    f = open(file)
    raw = f.read()
    tokens = nltk.word_tokenize(raw)
    list = [ 'perfect' ,'companion' , 'good' , 'brilliant', 'good']
    return [i for i in tokens if i in list]

Also, some miscellaneous tips:

  1. Don't name that variable list, because doing so shadows the built-in list type
  2. Your variable list could be a set; then each membership test would take O(1) time on average instead of O(N)
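As a rough illustration of both tips, the sketch below shows what goes wrong once the name `list` is rebound, and how a set can replace the list of words. The function names `shadowing_demo` and `is_match` are invented for this example.

```python
def shadowing_demo():
    """Show that rebinding the name list hides the built-in type in this scope."""
    list = ['perfect', 'companion']  # shadows the built-in list
    try:
        list("abc")  # would normally build ['a', 'b', 'c']
    except TypeError as exc:
        return str(exc)  # the name now refers to a list object, which is not callable
    return "no error"


# A set drops the duplicate 'good' and gives average O(1) membership tests
WORDS_TO_MATCH = {'perfect', 'companion', 'good', 'brilliant'}


def is_match(token):
    """Check one token against the set of words."""
    return token in WORDS_TO_MATCH


print(shadowing_demo())
print(is_match('good'), is_match('awful'))
```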

Upvotes: 1
