Ayush

Reputation: 21

How to find words at a specific place inside the given string

I need to extract the words inside ( ) for a particular category.

I need to find the words inside ( ) in
Technology (Pyrolysis, Gasification)

Also the same for words inside ( ) in
Application (Agriculture, Animal Feed, Health & Beauty Products)

This is my string:
string = ''' U.S. Biochar Market Size, Share & Trends Analysis Report By Technology (Pyrolysis, Gasification), By Application (Agriculture, Animal Feed, Health & Beauty Products), By State, And Segment Forecasts'''

I've written a function, but it doesn't work in every case: for some strings it also fetches
"By State, And Segment Forecasts", which is not needed.


Expected output, e.g.:

Application = 'Agriculture, Animal Feed, Health & Beauty Products'

Technology = 'Pyrolysis, Gasification'

This has to work for a large data set of similar sentences, using Python.

def check_para(args):
    s = args
    start = s.find('Technology (')
    end = s.find('),', start)
    techno = s[start:end][len('Technology ('):]

    s = args
    start = s.find('Material (')
    end = s.find('),', start)
    mat = s[start:end][len('Material ('):]

    s = args
    start = s.find('Product (')
    end = s.find('),', start)
    prod = s[start:end][len('Product ('):]

    s = args
    start = s.find('Service (')
    end = s.find('),', start)
    serv = s[start:end][len('Service ('):]

    s = args
    start = s.find('Type (')
    end = s.find('),', start)
    typ = s[start:end][len('Type ('):]

    s = args
    start = s.find('Form (')
    end = s.find('),', start)
    form = s[start:end][len('Form ('):]

    s = args
    start = s.find('Application (')
    end = s.find('),', start)
    appli = s[start:end][len('Application ('):]

    s = args
    start = s.find('End Use (')
    end = s.find('),', start)
    enduse = s[start:end][len('End Use ('):]

    s = args
    start = s.find('Derivative Grades (')
    end = s.find('),', start)
    deriv = s[start:end][len('Derivative Grades ('):]

    type1 = deriv + form + typ + serv + prod + techno + mat

    application = appli + enduse

    if len(application) > 0 :
        application = application.replace(', ', '\n')
    else:
        application = 'Application I\nApplication II\nApplication III\n'

    if len(type1) > 0:
        type1 = type1.replace(', ', '\n')
    else:
        type1 = 'Type I\nTypeII\nType III\n'

    return application, type1

Upvotes: 0

Views: 45

Answers (1)

Tim Biegeleisen

Reputation: 522712

Using re.findall we can try:

import re

string = ''' U.S. Biochar Market Size, Share & Trends Analysis Report By Technology (Pyrolysis, Gasification), By Application (Agriculture, Animal Feed, Health & Beauty Products), By State, And Segment Forecasts'''
# Capture the single word before each "( ... )" and the text inside the parentheses
matches = re.findall(r'(\w+) \((.*?)\)', string)
for match in matches:
    print(match[0] + ' = ' + "'" + match[1] + "'")

This prints:

Technology = 'Pyrolysis, Gasification'
Application = 'Agriculture, Animal Feed, Health & Beauty Products'
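
Note that `(\w+)` captures only the single word directly before the parenthesis, so a two-word category such as "End Use" or "Derivative Grades" from the question's function would come back as "Use" or "Grades". One possible variation, offered as a sketch rather than a definitive solution, is to list the category names explicitly and return a dictionary per title. The names `CATEGORIES`, `PATTERN` and `extract_categories` are made up for this illustration, and the sketch assumes the parentheses are never nested:

```python
import re

# Category names copied from the question's function; adjust for your data.
CATEGORIES = [
    "Technology", "Material", "Product", "Service", "Type",
    "Form", "Application", "End Use", "Derivative Grades",
]

# Match "<category> (<contents>)", stopping at the first closing parenthesis.
# Assumption: the contents never contain a nested ")".
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CATEGORIES) + r") \(([^)]*)\)"
)

def extract_categories(title):
    """Return {category: contents} for each known category found in title."""
    return {name: contents for name, contents in PATTERN.findall(title)}

title = ("U.S. Biochar Market Size, Share & Trends Analysis Report "
         "By Technology (Pyrolysis, Gasification), "
         "By Application (Agriculture, Animal Feed, Health & Beauty Products), "
         "By State, And Segment Forecasts")

result = extract_categories(title)
print(result["Technology"])   # Pyrolysis, Gasification
print(result["Application"])  # Agriculture, Animal Feed, Health & Beauty Products
```

Because the pattern stops at the closing parenthesis itself instead of searching for `'),'`, it does not depend on a comma following the group. That may be what goes wrong in the original function: `str.find` returns -1 when `'),'` is not found after the category, and slicing up to -1 then keeps the rest of the sentence. For a large data set, the function can be applied to each title in turn, for example `[extract_categories(t) for t in titles]`.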

Upvotes: 1
