Reputation: 27
So, I'm making my own home assistant and I'm trying to make a multi-intent classification system. However, I cannot find a way to split the query said by the user into the multiple different intents in the query.
For example:
I have my data for one of my intents (same format for all)
{"intent_name": "music.off" , "examples": ["turn off the music" , "kill
the music" , "cut the music"]}
and the query said by the user would be:
'dim the lights, cut the music and play Black Mirror on tv'
I want to split the sentence into its individual intents, such as:
['dim the lights', 'cut the music', 'play black mirror on tv']
however, I can't just use re.split on the sentence with "and" and "," as delimiters, because if the user asks:
'turn the lights off in the living room, dining room, kitchen and bedroom'
this will be split into
['turn the lights off in the living room', 'dining room', 'kitchen', 'bedroom']
which would not be usable with my intent detection.
this is my problem, thank you in advance
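To make the failure concrete, here is roughly what the naive delimiter split described above does (a minimal sketch):

```python
import re

query = "turn the lights off in the living room, dining room, kitchen and bedroom"
# naive split on commas and the word "and" -- the approach that fails
parts = [p.strip() for p in re.split(r",|\band\b", query) if p.strip()]
print(parts)
# → ['turn the lights off in the living room', 'dining room', 'kitchen', 'bedroom']
```

Only the first piece is a complete command; the rest are bare slot values with no intent attached.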
Okay, so I've got this far with my code: it can read the examples from my data and identify the different intents inside the query as I wished; however, it is only matching whole examples rather than splitting the parts of the original query into their individual intents.
import os
import json
from fuzzywuzzy import process  # provides process.extract used below

text = "dim the lights, shut down the music and play White Collar"
choices = []   # example lists loaded from the intent files
commands = []

def get_matches():
    # load the example sentences from every intent file in ./data
    for root, dirs, files in os.walk("./data"):
        for filename in files:
            with open(os.path.join("./data", filename), "r") as f:
                data = json.load(f)
            choices.append(data["examples"])
    # for each intent, keep the single best fuzzy match against the query
    for set_ in choices:
        command = process.extract(text, set_, limit=1)
        commands.append(command)
    print(f"all commands : {commands}")

get_matches()
This returns [('dim the lights'), ('turn off the music'), ('play Black Mirror')], which are the correct intents, but I have no way of knowing which part of the query relates to each intent; this is the main problem.
My data is as follows (very simple for now, until I figure out a method):
play.json
{"intent_name": "play.device" , "examples" : ["play Black Mirror" , "play Netflix on tv" , "can you please stream Stranger Things"]}
music.json
{"intent_name": "music.off" , "examples": ["turn off the music" , "cut the music" , "kill the music"]}
lights.json
{"intent_name": "lights.dim" , "examples" : ["dim the lights" , "turn down the lights" , "lower the brightness"]}
Upvotes: 1
Views: 2179
Reputation: 11424
It seems that you are mixing two problems in your question:

1. splitting a query that contains several intents, e.g. "shut down the music and play White Collar"
2. splitting the arguments (slots) of a single intent, e.g. "turn the lights off in the living room bedroom and kitchen"

These problems are quite different. Both, however, can be formulated as a word-tagging problem (similar to POS tagging) and solved with machine learning (e.g. a CRF or a bi-LSTM over pretrained word embeddings, predicting a label for each word).
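For the tagging route, each word is typically turned into a feature dict before being fed to a CRF. A minimal sketch (the feature names are illustrative; libraries such as sklearn-crfsuite accept this one-dict-per-token format):

```python
def word_features(tokens, i):
    # Features for token i, in the dict-per-token form that CRF
    # libraries such as sklearn-crfsuite accept (names are illustrative).
    word = tokens[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),  # capitalized words often start titles like "White Collar"
        "prev.lower": tokens[i - 1].lower() if i > 0 else "<s>",
        "next.lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
    }

tokens = "play White Collar".split()
print(word_features(tokens, 1))
# → {'word.lower': 'white', 'word.istitle': True, 'prev.lower': 'play', 'next.lower': 'collar'}
```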
The intent labels for each word can be created using BIO notation, e.g.
shut B-music_off
down I-music_off
the I-music_off
music I-music_off
and O
play B-tv_on
White I-tv_on
Collar I-tv_on
turn B-light_off
the I-light_off
lights I-light_off
off I-light_off
in I-light_off
the I-light_off
living I-light_off
room I-light_off
bedroom I-light_off
and I-light_off
kitchen I-light_off
The model would read the sentence and predict the labels. It should be trained on at least hundreds of examples; you will have to generate or mine them.
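Once a tagger produces such labels, turning them back into per-intent phrases is straightforward. A minimal stdlib-only sketch of the BIO decoding step (the labels are hand-written here, standing in for model predictions):

```python
def bio_to_spans(tokens, labels):
    """Group (token, BIO-label) pairs into (intent, phrase) spans."""
    spans = []
    current_intent, current_tokens = None, []
    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            if current_tokens:  # close the previous span
                spans.append((current_intent, " ".join(current_tokens)))
            current_intent, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_tokens:
            current_tokens.append(token)
        else:  # an "O" label closes any open span
            if current_tokens:
                spans.append((current_intent, " ".join(current_tokens)))
            current_intent, current_tokens = None, []
    if current_tokens:
        spans.append((current_intent, " ".join(current_tokens)))
    return spans

tokens = "shut down the music and play White Collar".split()
labels = ["B-music_off", "I-music_off", "I-music_off", "I-music_off",
          "O", "B-tv_on", "I-tv_on", "I-tv_on"]
print(bio_to_spans(tokens, labels))
# → [('music_off', 'shut down the music'), ('tv_on', 'play White Collar')]
```

Each span is now a short text with a known intent, ready for the second (slot) segmentation.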
After splitting intents with a model trained on such labels, you will have short texts, each corresponding to a unique intent. Then for each short text you need to run a second segmentation, looking for slots. E.g. the sentence about the lights can be represented as
turn B-action
the I-action
lights I-action
off I-action
in O
the B-place
living I-place
room I-place
bedroom B-place
and O
kitchen B-place
Now the BIO markup helps a lot: the B-place tag separates "bedroom" from "the living room".
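Decoding those slot labels back into separate places can be sketched like this (the labels are hand-written, standing in for the slot tagger's predictions):

```python
tokens = "turn the lights off in the living room bedroom and kitchen".split()
labels = ["B-action", "I-action", "I-action", "I-action", "O",
          "B-place", "I-place", "I-place", "B-place", "O", "B-place"]

places = []
for tok, lab in zip(tokens, labels):
    if lab == "B-place":       # B- starts a new place
        places.append(tok)
    elif lab == "I-place" and places:  # I- continues the current one
        places[-1] += " " + tok
print(places)
# → ['the living room', 'bedroom', 'kitchen']
```

Without the B/I distinction, "room bedroom" would merge into one place; the B-place tag is what keeps the adjacent slots apart.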
Both segmentations can in principle be performed by one hierarchical end-to-end model (google "semantic parsing" if you want to go that way), but I feel that two simpler taggers can work as well.
Upvotes: 2