Reputation: 88
This is the data I'm working with:
Topic | About | Group Discussion
microwave is not working | i tried turning on the microwave and it wont turn on | [[person1 yeah the microwave wont turn on i tested it], [person2 okay send it over to warranty], ...]
the light of the oven wont turn on | i have tried to press the light on the oven and nothing | [[person3 did you power on the oven], [person4 it was powered on], ...]
water will not come out of sink | i turn the valve and nothing comes out of the sink | [[person5 okay it looks like water is not coming out of the sink], [person6 okay send it over to this department to take a look], ...]
What I would like is this:
Topic | About | Group Discussion | Topic_Extract | About_Extract | Group_Discussion_Extract
microwave is not working | i tried turning on the microwave and it wont turn on | [[person1 yeah the microwave wont turn on i tested it], [person2 okay send it over to warranty], ...] | microwave | microwave | microwave
the light of the oven wont turn on | i have tried to press the light on the oven and nothing | [[person3 did you power on the oven], [person4 it was powered on], ...] | oven | oven | oven
water will not come out of sink | i turn the valve and nothing comes out of the sink | [[person5 okay it looks like water is not coming out of the sink], [person6 okay send it over to this department to take a look], ...] | sink | sink | sink
EDIT: Okay, now it's classifying everything as 'unclassified', and I'm not sure how to fix it:
df['Title_Extract'] = ''
def loop(data):
    for i,j in data['Topic'].iteritems():
        if (re.search(r'microwave|microwave will not turn on|microwave is not working|microwave wont work|microwave will not work|microwave is broken', j) == True):
            return(data['Topic_Extract'].str.replace('', 'microwave'))
        elif (re.search(r'oven|oven will not turn on|oven is not working|oven wont work|oven will not work|oven is broken|oven wont turn on', j) == True):
            return(data['Topic_Extract'].str.replace('', 'oven'))
        elif (re.search(r'sink|sink will not turn on|sink is not working|sink wont work|sink will not work|sink is broken|sink wont turn on', j) == True):
            return(data['Topic_Extract'].str.replace('', 'sink'))
        else:
            return 'unclassified'
loop(df)
The problem I'm running into when trying to extract a word is that nothing is classified correctly:
0 unclassified
...
2 unclassified
Upvotes: 0
Views: 64
Reputation: 88
I figured it out, thanks for the help everyone. This is what my solution looks like:
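import re

def loop(data):
    # Series.iteritems() was removed in pandas 2.0; items() is the equivalent.
    for i, j in data['Topic'].items():
        # re.search returns a match object or None, so test its truthiness directly.
        if re.search(r'\bmicrowave\b', j):
            # Assumes every cell of 'Topic Extract' already holds a list.
            data['Topic Extract'][i].append('microwave')
        elif re.search(r'\boven\b', j):
            data['Topic Extract'][i].append('oven')
        elif re.search(r'\bsink\b', j):
            data['Topic Extract'][i].append('sink')
        else:
            data['Topic Extract'][i].append('unclassified')
    return data

df = loop(df)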
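For anyone hitting the same 'unclassified' result: as far as I can tell, the first version failed because re.search returns a match object or None, never the literal True, so the `== True` comparison is always False and the else branch runs on the first row. Below is a minimal sketch of a per-row alternative that returns a plain string instead of appending to a list. The classify helper, the TERMS list, and the fourth sample row are made up for illustration and are not part of my real data.

```python
import re

import pandas as pd

# Hypothetical term list, mirroring the three appliances in the question.
TERMS = ["microwave", "oven", "sink"]
PATTERN = re.compile(r"\b(" + "|".join(TERMS) + r")\b")


def classify(text):
    """Return the first matching term, or 'unclassified' if none matches."""
    match = PATTERN.search(text)
    # A match object is truthy and None is falsy, so no `== True` is needed.
    return match.group(1) if match else "unclassified"


df = pd.DataFrame({"Topic": [
    "microwave is not working",
    "the light of the oven wont turn on",
    "water will not come out of sink",
    "the door is stuck",  # made-up row to exercise the fallback
]})

# apply calls classify once per row and builds the new column in one step.
df["Topic_Extract"] = df["Topic"].apply(classify)
```

Returning a value per row and assigning the whole column at once also sidesteps the need to pre-fill the column with empty lists.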
Upvotes: 0
Reputation: 11650
Create a list of terms to search for, then use str.findall to return the terms that are found in each row of the df column.
topic_terms = ['microwave','sink', 'oven']
df['term'] = df['Topic'].str.findall("|".join(topic_terms))
df
Data used:
import pandas as pd

data = {'Topic': {0: 'microwave is not working ',
                  1: 'the light of the oven wont turn on ',
                  2: 'water will not come out of sink '},
        'About': {0: 'i tried turning on the microwave and it wont turn on ',
                  1: 'i have tried to press the light on the oven and nothing ',
                  2: 'i turn the valve and nothing comes out of the sink '},
        'Group Discussion': {0: '[[person1 yeah the microwave wont turn on i tested it], [person2 okay send it over to warranty], ...]',
                             1: '[[person3 did you power on the oven], [person4 it was powered on], ...]',
                             2: '[[person5 okay it looks like water is not coming out of the sink], [person6 okay send it over to this department to take a look], ...]'}}
df = pd.DataFrame(data)
df
   Topic                               About                                              Group Discussion                                   term
0  microwave is not working            i tried turning on the microwave and it wont t...  [[person1 yeah the microwave wont turn on i te...  microwave
1  the light of the oven wont turn on  i have tried to press the light on the oven an...  [[person3 did you power on the oven], [person4...  oven
2  water will not come out of sink     i turn the valve and nothing comes out of the ...  [[person5 okay it looks like water is not comi...  sink
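Note that str.findall returns a list of every match per row. If a single string per row is wanted, and for all three columns as in the question, one option is str.extract with a capture group. This is only a sketch under a few assumptions: the word boundaries, the 'unclassified' fallback, and the way the new column names are derived are my own choices, picked to match the layout the question asks for.

```python
import pandas as pd

topic_terms = ["microwave", "sink", "oven"]
# One capture group, so str.extract(..., expand=False) returns a Series.
pattern = r"\b(" + "|".join(topic_terms) + r")\b"

df = pd.DataFrame({
    "Topic": [
        "microwave is not working ",
        "the light of the oven wont turn on ",
        "water will not come out of sink ",
    ],
    "About": [
        "i tried turning on the microwave and it wont turn on ",
        "i have tried to press the light on the oven and nothing ",
        "i turn the valve and nothing comes out of the sink ",
    ],
    "Group Discussion": [
        "[[person1 yeah the microwave wont turn on i tested it], [person2 okay send it over to warranty], ...]",
        "[[person3 did you power on the oven], [person4 it was powered on], ...]",
        "[[person5 okay it looks like water is not coming out of the sink], [person6 okay send it over to this department to take a look], ...]",
    ],
})

for col in ["Topic", "About", "Group Discussion"]:
    # e.g. "Group Discussion" -> "Group_Discussion_Extract"
    new_col = col.replace(" ", "_") + "_Extract"
    # Take the first match in each row; rows with no match become NaN,
    # which is then replaced by the fallback label.
    df[new_col] = df[col].str.extract(pattern, expand=False).fillna("unclassified")
```

str.extract only keeps the first match in each row, so a row that mentions two appliances would be labelled with whichever appears first in the text.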
Upvotes: 1