user11082875

How can I get the entries of a list (one or two words each) that match part of a text?

I have a list containing all the company names:

organizations={'mahindra & mahindra','atametica','cognizant Technology','Tata Cosultancy Services'}

I have a text that contains one or two company names, and I want to extract the names that also appear in organizations. Example:

text = 'XXX has worked in Tata Consultancy Services and currently working in cognizant technology.He has experience in Java Technology as well'

How can I fetch the company names from the text?

Upvotes: 0

Views: 49

Answers (1)

DirtyBit

Reputation: 16792

OP: How can I fetch the company names from the text?

Pulling arbitrary company names out of free text would be complex. The other way around is easier and faster:

You can iterate over the organisations and check whether each one occurs in the text using the in operator:

organizations = ['mahindra & mahindra','atametica','cognizant Technology','Tata Cosultancy Services']

text = 'XXX has worked in Tata Cosultancy Services and currently working in cognizant technology.He has experience in Java Technology as well'

for org in organizations:
    if org.lower() in text.lower():
        print(org)

EDIT:

To get all the organisations regardless of letter case, call .lower() on both the name and the text before comparing them, as in the loop above.
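
The loop above prints each match. If the matches are needed as a list, it can be wrapped in a small helper. This is only a minimal sketch: the function name find_organizations is hypothetical, and str.casefold() is swapped in for .lower() as a slightly more thorough way to ignore case.

```python
def find_organizations(text, organizations):
    """Return the organizations that occur in text, ignoring case."""
    # Fold the text once instead of once per organization.
    folded_text = text.casefold()
    return [org for org in organizations if org.casefold() in folded_text]


organizations = ['mahindra & mahindra', 'atametica', 'cognizant Technology', 'Tata Cosultancy Services']
text = ('XXX has worked in Tata Cosultancy Services and currently working in '
        'cognizant technology.He has experience in Java Technology as well')

matches = find_organizations(text, organizations)
print(matches)
```

The result keeps the order of the organizations list, not the order in which the names appear in the text.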

EDIT 2:

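Using re (case-insensitive search with re.IGNORECASE):

import re

for org in organizations:
    # re.escape() stops characters such as '&' or '.' in a name from being read as regex syntax
    if re.search(re.escape(org), text, re.IGNORECASE):
        print(org)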

OUTPUT:

cognizant Technology
Tata Cosultancy Services
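
As a variation on the loop above, all names can be combined into one alternation pattern so the text is scanned once. This is a sketch, not the only way to do it: re.escape() is added here so that the names are treated as literal text, and the names are sorted longest first as a precaution in case one name is a prefix of another.

```python
import re

organizations = ['mahindra & mahindra', 'atametica', 'cognizant Technology', 'Tata Cosultancy Services']
text = ('XXX has worked in Tata Cosultancy Services and currently working in '
        'cognizant technology.He has experience in Java Technology as well')

# Longest names first, so a longer name is tried before any shorter prefix of it.
alternatives = sorted(organizations, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(org) for org in alternatives), re.IGNORECASE)

# findall() returns the matched substrings in the order they occur in the text.
found = pattern.findall(text)
print(found)
```

Note that this returns the spelling found in the text (for example 'cognizant technology'), not the spelling stored in the list.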

EDIT 3:

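Consider the case where a list element occurs in the text only as part of a longer word, e.g. 'nitor' inside 'monitor'. A plain substring search would report it as a match. To match whole words only, wrap the pattern in regex word boundaries (\b):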

organizations = ['mahindra & mahindra','atametica','cognizant Technology','Tata Cosultancy Services', 'nitor']

text = 'XXX has worked in Tata Cosultancy Services and currently working in cognizant technology.He has experience in Java Technology as well as monitor'


import re

for org in organizations:
    # \b matches only at a word boundary, so 'nitor' no longer matches inside 'monitor'
    if re.search(r'\b' + re.escape(org) + r'\b', text, re.IGNORECASE):
        print(org)

OUTPUT:

cognizant Technology
Tata Cosultancy Services
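
One caveat: \b only behaves as expected when a name starts and ends with a word character (a letter, digit or underscore). For a name that ends in punctuation, a lookaround-based boundary may be more robust. The sketch below assumes that approach; the function name find_whole_names and the sample name 'Yahoo!' are made up for illustration.

```python
import re


def find_whole_names(text, organizations):
    """Return organizations that occur in text as whole names, ignoring case."""
    matches = []
    for org in organizations:
        # No word character may directly precede or follow the name.
        pattern = r'(?<!\w)' + re.escape(org) + r'(?!\w)'
        if re.search(pattern, text, re.IGNORECASE):
            matches.append(org)
    return matches


organizations = ['mahindra & mahindra', 'atametica', 'cognizant Technology',
                 'Tata Cosultancy Services', 'nitor']
text = ('XXX has worked in Tata Cosultancy Services and currently working in '
        'cognizant technology.He has experience in Java Technology as well as monitor')

whole = find_whole_names(text, organizations)
print(whole)
```

With these inputs 'nitor' is still rejected, because the 'o' of 'monitor' comes directly before it.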

Upvotes: 1
