Waheeb Al-Abyadh

Reputation: 379

Extract a sentence that contains a list of keywords or a phrase using Python

I have used the following code to extract sentences from a file (a sentence should contain some or all of the search keywords):

search_keywords=['mother','sing','song']
with open('text.txt', 'r') as in_file:
    text = in_file.read()
    sentences = text.split(".")

for sentence in sentences:
    if (all(map(lambda word: word in sentence, search_keywords))):
        print(sentence)

The problem with the above code is that it does not print a sentence if any one of the search keywords is missing from it. I want code that prints every sentence containing some or all of the search keywords. It would be great if the code could also search for a phrase and extract the corresponding sentence.

Upvotes: 2

Views: 7964

Answers (3)

Chris_Rands

Reputation: 41168

It seems like you want to count the number of search_keywords in each sentence. You can do this as follows:

sentences = "My name is sing song. I am a mother. I am happy. You sing like my mother".split(".")
search_keywords=['mother','sing','song']

for sentence in sentences:
    print("{} key words in sentence:".format(sum(1 for word in search_keywords if word in sentence)))
    print(sentence + "\n")

# Outputs:
#2 key words in sentence:
#My name is sing song
#
#1 key words in sentence:
# I am a mother
#
#0 key words in sentence:
# I am happy
#
#2 key words in sentence:
# You sing like my mother

Or if you only want the sentence(s) that have the most matching search_keywords, you can make a dictionary and find the maximum values:

dct = {}
for sentence in sentences:
    dct[sentence] = sum(1 for word in search_keywords if word in sentence)

max_count = max(dct.values())  # compute the maximum once, not once per item
best_sentences = [key for key, value in dct.items() if value == max_count]

print("\n".join(best_sentences))

# Outputs:
#My name is sing song
# You sing like my mother

Upvotes: 5

MaSdra

Reputation: 352

So you want to find sentences that contain at least one keyword. You can use any() instead of all().
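
A minimal sketch of that change, assuming the same substring test as in the question. The sample text stands in for text.txt and the `matching_sentences` helper name is made up for illustration:

```python
def matching_sentences(text, keywords):
    """Return the sentences that contain at least one keyword (substring match)."""
    # Split on "." as in the question, dropping empty pieces and stray spaces.
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    return [s for s in sentences if any(word in s for word in keywords)]


# Illustrative sample text; in the question this would be read from text.txt.
text = "My name is sing song. I am a mother. I am happy. You sing like my mother."
search_keywords = ['mother', 'sing', 'song']

for sentence in matching_sentences(text, search_keywords):
    print(sentence)
```

Only "I am happy" is left out, because it contains none of the keywords.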

EDIT: If you want to find the sentence which contains the most keywords:

sent_words = []
for sentence in sentences:
    sent_words.append(set(sentence.split()))
num_keywords = [len(sent & set(search_keywords)) for sent in sent_words]

max_count = max(num_keywords)
# Index of the first sentence with the most keywords
ind = num_keywords.index(max_count)
# Indices of all sentences with that number of keywords
inds = [i for i, x in enumerate(num_keywords) if x == max_count]
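
For reference, a self-contained sketch of the same idea that goes from the counts back to the sentences. The sample sentences are assumed for illustration, and `keyword_set`, `most` and `best_sentences` are names introduced here:

```python
sentences = "My name is sing song. I am a mother. I am happy. You sing like my mother".split(".")
search_keywords = ['mother', 'sing', 'song']

keyword_set = set(search_keywords)
# Whole-word matching: each sentence becomes a set of whitespace-separated tokens,
# and the intersection with the keyword set gives the distinct keywords it contains.
num_keywords = [len(set(sentence.split()) & keyword_set) for sentence in sentences]

most = max(num_keywords)
# Keep every sentence that reaches the maximum count.
best_sentences = [s.strip() for s, n in zip(sentences, num_keywords) if n == most]
print(best_sentences)
```

Because this compares whole tokens, a word such as "singing" does not count as a match for "sing", unlike the `word in sentence` substring test. A token with punctuation attached (for example "song,") would not match either.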

Upvotes: 0

EvensF

Reputation: 1610

If I understand correctly, you should be using any() instead of all().

if (any(map(lambda word: word in sentence, search_keywords))):
    print(sentence)
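
As for the phrase part of the question: `word in sentence` is a substring test, so a multi-word phrase can simply be added to the list. The sketch below is one possible variant, assuming you also want case-insensitive matching on word boundaries (neither is stated in the question). The `contains_any` helper and the sample text are made up for illustration:

```python
import re


def contains_any(sentence, terms):
    """Return True if any term (a single word or a phrase) occurs as whole words."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", sentence, flags=re.IGNORECASE)
        for term in terms
    )


# Illustrative sample text; in the question this would be read from text.txt.
text = "She was singing loudly. My mother can sing a song. I am happy."
search_terms = ['sing a song', 'mother']

sentences = [s.strip() for s in text.split(".") if s.strip()]
matches = [s for s in sentences if contains_any(s, search_terms)]
print(matches)
```

With the word boundaries, "singing" in the first sentence does not trigger a match, so only the second sentence is kept.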

Upvotes: 0
