Henk Straten

Reputation: 1445

Find tuples with certain keywords

I have a list of 3-gram tuples that is built like this:

from nltk import ngrams
test_data = ["this is all test data", "this not"]

three_gram_list = []
for data in test_data:
 three_grams = ngrams(data.split(" "), 3)
 for gram in three_grams:
  three_gram_list.append(gram)

I would like to create a function that checks, for each 3-gram, whether words from two keyword lists appear in the same tuple. I wrote the following:

def create_specific_trigram(three_grams, parameters1, parameters2):

    condition1 = False
    condition2 = False

    for three in three_grams:
        for num in range(1, 3):
            if three[num] in parameters1:
                condition1 = True

        for num in range(1, 3):
            if three[num] in parameters2:
                condition2 = True

        if condition1 and condition2:
            print(three)

However, when I run it with some parameters:

parameters1 = ("test", "testing")
parameters2 = ("data", "datas")

for sentence in test_data:
  create_specific_trigram(three_grams, parameters1, parameters2)

I get the following output.

('all', 'test', 'data')
('all', 'test', 'data')

However, I only want one output per sentence. So in this case:

('all', 'test', 'data')

Any thoughts on what changes I should apply?

Upvotes: 1

Views: 77

Answers (1)

Jaroslaw Matlak

Reputation: 574

Your loop calls create_specific_trigram with the same value of three_grams on every iteration, regardless of the current sentence, so the match is printed once per sentence.

Try this:

test_data = ["this is all test data", "this not"]
parameters1 = ("test", "testing")
parameters2 = ("data", "datas")

#============================================
#implementation of create_specific_trigram
# ...
#============================================

for sentence in test_data:
  three_grams = ngrams(sentence.split(" "), 3)
  create_specific_trigram(three_grams, parameters1, parameters2)
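
For reference, here is a minimal self-contained sketch of the same idea: build the trigrams per sentence and evaluate both conditions freshly for each trigram. The helper names trigrams and find_matching_trigrams are hypothetical, and trigrams is only a plain-Python stand-in for ngrams(tokens, 3) so that the snippet runs without nltk. Note that, unlike the function in the question, this version checks all three positions of each trigram rather than only the last two.

```python
def trigrams(tokens):
    """Plain-Python stand-in for nltk's ngrams(tokens, 3) (assumption: same output order)."""
    return list(zip(tokens, tokens[1:], tokens[2:]))


def find_matching_trigrams(sentence, parameters1, parameters2):
    """Return the trigrams of `sentence` that contain a word from each keyword tuple."""
    matches = []
    for gram in trigrams(sentence.split(" ")):
        # Both conditions are computed per trigram, so an earlier match
        # cannot leak into the check for a later trigram.
        has_first = any(word in parameters1 for word in gram)
        has_second = any(word in parameters2 for word in gram)
        if has_first and has_second:
            matches.append(gram)
    return matches


test_data = ["this is all test data", "this not"]
parameters1 = ("test", "testing")
parameters2 = ("data", "datas")

# Trigrams are built inside the call, once per sentence.
results = {s: find_matching_trigrams(s, parameters1, parameters2) for s in test_data}
for sentence, matches in results.items():
    print(sentence, "->", matches)
```

Returning a list instead of printing inside the function makes it easier to keep only the first match per sentence if that is what you need.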

Upvotes: 1
