I have a list containing all the company names:
organizations={'mahindra & mahindra','atametica','cognizant Technology','Tata Cosultancy Services'}
I have a text that contains one or two company names, and I want to extract the names that appear in organizations. Example:
text = 'XXX has worked in Tata Consultancy Services and currently working in cognizant technology.He has experience in Java Technology as well'
How can I fetch the company names from the text?
Upvotes: 0
Views: 49
Reputation: 16792
OP: How can I fetch the company names from the text?
That would be complex; the other way around would be easier and faster.
You could iterate over the organisations and check whether each one exists in the text using the in operator:
organizations = ['mahindra & mahindra','atametica','cognizant Technology','Tata Cosultancy Services']
text = 'XXX has worked in Tata Cosultancy Services and currently working in cognizant technology.He has experience in Java Technology as well'
for org in organizations:
    if org.lower() in text.lower():
        print(org)
EDIT:
To get all the organisations regardless of case, call .lower() on both strings before comparing them, which makes the match case-insensitive.
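The loop above can also be wrapped in a small helper that returns the matches instead of printing them. This is only a sketch: the function name find_organizations is made up for illustration, and it uses str.casefold() in place of .lower(), which behaves the same for plain ASCII names.

```python
def find_organizations(text, organizations):
    """Return the organizations whose names occur in text, ignoring case."""
    # Normalise the text once instead of on every loop iteration.
    haystack = text.casefold()
    return [org for org in organizations if org.casefold() in haystack]


organizations = ['mahindra & mahindra', 'atametica', 'cognizant Technology', 'Tata Cosultancy Services']
text = ('XXX has worked in Tata Cosultancy Services and currently working in '
        'cognizant technology.He has experience in Java Technology as well')

matches = find_organizations(text, organizations)
print(matches)
```

The result keeps the order of the organizations list, not the order in which the names appear in the text.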
EDIT 2:
Using re:
import re
for org in organizations:
    # re.escape makes the name match literally, even if it contains regex metacharacters
    if re.search(re.escape(org), text, re.IGNORECASE):
        print(org)
OUTPUT:
cognizant Technology
Tata Cosultancy Services
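One thing to keep in mind with re.search: the first argument is treated as a regular expression, so a name containing characters such as . or ( is not matched literally unless it is passed through re.escape. A small sketch of the difference follows; the name 'Amazon.com', the sample sentence, and the helper search_literal are made up for illustration.

```python
import re


def search_literal(name, text):
    """Case-insensitive search that treats name as plain text, not as a pattern."""
    return re.search(re.escape(name), text, re.IGNORECASE) is not None


name = 'Amazon.com'  # hypothetical name containing a regex metacharacter
sample = 'She reviewed the amazonXcom dataset'

# Unescaped, the dot matches any character, so this is a false positive.
unescaped_hit = re.search(name, sample, re.IGNORECASE) is not None
# Escaped, the dot only matches a literal dot, so nothing is found.
escaped_hit = search_literal(name, sample)

print(unescaped_hit, escaped_hit)
```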
EDIT 3:
Consider a situation where an element of the list appears in the text only as part of a longer word (for example, 'nitor' inside 'monitor'). In that case you could do a whole-word search using regex word boundaries (\b), i.e.
organizations = ['mahindra & mahindra','atametica','cognizant Technology','Tata Cosultancy Services', 'nitor']
text = 'XXX has worked in Tata Cosultancy Services and currently working in cognizant technology.He has experience in Java Technology as well as monitor'
import re
for org in organizations:
    # \b anchors the match at word boundaries; re.escape keeps the name literal
    if re.search(r'\b' + re.escape(org) + r'\b', text, re.IGNORECASE):
        print(org)
OUTPUT:
cognizant Technology
Tata Cosultancy Services
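If the list of organisations is long, the same whole-word idea can be expressed as one compiled pattern that is run over the text once. The following is a rough sketch rather than a drop-in solution: the names build_matcher and find_whole_word_orgs are hypothetical, and it assumes every name starts and ends with a letter or digit, since \b does not behave as expected next to punctuation.

```python
import re


def build_matcher(organizations):
    """Compile one case-insensitive, whole-word pattern for all names."""
    # Longest names first, so a longer name wins over a shorter one it starts with.
    ordered = sorted(organizations, key=len, reverse=True)
    alternation = '|'.join(re.escape(org) for org in ordered)
    return re.compile(r'\b(?:' + alternation + r')\b', re.IGNORECASE)


def find_whole_word_orgs(text, organizations):
    """Return the listed names found in text as whole words, in order of appearance."""
    if not organizations:
        return []
    # Map the lower-cased form back to the spelling used in the list.
    canonical = {org.lower(): org for org in organizations}
    found = []
    for match in build_matcher(organizations).finditer(text):
        org = canonical[match.group(0).lower()]
        if org not in found:
            found.append(org)
    return found


organizations = ['mahindra & mahindra', 'atametica', 'cognizant Technology',
                 'Tata Cosultancy Services', 'nitor']
text = ('XXX has worked in Tata Cosultancy Services and currently working in '
        'cognizant technology.He has experience in Java Technology as well as monitor')

result = find_whole_word_orgs(text, organizations)
print(result)
```

Unlike the loop, this returns names in the order they occur in the text, so 'Tata Cosultancy Services' comes before 'cognizant Technology' here.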
Upvotes: 1