Oliver Amundsen

Reputation: 1511

Compute the similarity of every element in a list to a single sentence

  1. I need to quantify how similar each sentence in a long list is to a single reference sentence, perhaps using Levenshtein distance or difflib.
  2. Then I have to remove from the list every sentence whose similarity exceeds a given threshold, say 90% as measured by difflib.

Could you guys help? Thanks!

Upvotes: 2

Views: 382

Answers (1)

wim

Reputation: 363516

>>> mylist = ['ham and eggs', 'spam and legs', "it's time to die, mr bond!"]
>>> import difflib
>>> close_matches = difflib.get_close_matches('spam and eggs', mylist)
>>> close_matches
['spam and legs', 'ham and eggs']
>>> set(mylist) - set(close_matches)
set(["it's time to die, mr bond!"])
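
Note that get_close_matches uses a default cutoff of 0.6 and returns at most n=3 matches, and that the set difference loses the original order and any duplicates. If an explicit threshold such as the 90% in the question is needed, a minimal sketch along these lines may help. It scores each sentence with difflib.SequenceMatcher.ratio() and keeps only those below the threshold. The function name drop_similar and its parameters are made up for illustration and are not part of difflib.

```python
import difflib


def drop_similar(sentences, reference, threshold=0.9):
    """Return the sentences whose similarity to `reference` is below `threshold`.

    `drop_similar` is a hypothetical helper, not a difflib function.
    Similarity is SequenceMatcher.ratio(), a float between 0.0 and 1.0.
    """
    kept = []
    for sentence in sentences:
        ratio = difflib.SequenceMatcher(None, sentence, reference).ratio()
        if ratio < threshold:
            # Not similar enough to count as a near-duplicate, so keep it.
            kept.append(sentence)
    return kept


mylist = ['ham and eggs', 'spam and legs', "it's time to die, mr bond!"]
remaining = drop_similar(mylist, 'spam and eggs', threshold=0.9)
print(remaining)
```

Unlike the set difference, this keeps the list order and has no limit on how many sentences it can remove. For a very long list, SequenceMatcher can be slow; a dedicated Levenshtein library may be faster, but that is outside what this sketch covers.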

Upvotes: 4
