Reputation: 1413
I am trying to make a chatbot, and to do that I have to perform two main tasks: the first is intent classification and the other is entity recognition, but I am stuck at intent classification. Basically, I am developing a chatbot for an e-commerce site, and my chatbot has a very specific use case: it has to negotiate with customers on the price of products, that's it. To keep things simple and easy, I am considering just 5 intents.
To train a classifier on these intents, I have trained a Naive Bayes classifier on my small hand-written corpus, but that data is far too little to train a good classifier. I have searched the internet extensively and looked into every machine learning data repository (Kaggle, UCI, etc.) but cannot find any data for such a specific use case. Can you guide me on what I should do in this case? If I can get a big enough dataset, I will try a deep learning classifier, which should work far better for me. Any help would be highly appreciated.
from textblob.classifiers import NaiveBayesClassifier
import joblib  # used to save the trained classifier to disk

# Hand-written corpus: (utterance, intent) pairs covering the 5 intents
training_data = [
    ('i want to buy a pair of jeans', 'Buy_a_product'),
    ('i want to purchase a pair of shoes', 'Buy_a_product'),
    ('are you selling laptops', 'Buy_a_product'),
    ('i need an apple jam', 'Buy_a_product'),
    ('can you please tell me the price of this product', 'Buy_a_product'),
    ('please give me some discount.', 'negotiation'),
    ("i cannot afford such a price", 'negotiation'),
    ("could you negotiate", 'negotiation'),
    ("i agree on your offer", 'success'),
    ("yes i accepted your offer", 'success'),
    ("offer accepted", 'success'),
    ("agreed", 'success'),
    ("what is the price of this watch", 'ask_for_price'),
    ("how much does it cost", 'ask_for_price'),
    ("i will only give you 3000 for this product", 'counter_offer'),
    ("its too costly, i can only pay 1500 for it", 'counter_offer'),
]

clf = NaiveBayesClassifier(training_data)
joblib.dump(clf, 'intentClassifier.pkl')
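For reference, the saved model can then be loaded back and queried like this (a small sketch using the same textblob/joblib APIs as above):

clf = joblib.load('intentClassifier.pkl')
print(clf.classify('can you lower the price a little'))  # e.g. 'negotiation'
print(clf.prob_classify('agreed').max())  # most probable intent label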
Upvotes: 3
Views: 2162
Reputation: 126
This is actually a great problem to try deep learning on. As you probably already know, language models are few-shot learners (https://arxiv.org/abs/2005.14165).
If you are not familiar with language models, I can explain a little here; otherwise, you can skip this section. Basically, the area of NLP has made great progress by doing generative pre-training on unlabeled data. A popular example is BERT. The idea is that you train a model on a language modeling task (e.g., predicting a masked or next word). By training on such tasks, the model learns "world knowledge" well. Then, when you want to use the model for other tasks, you do not need much labeled training data. You can take a look at this video (https://www.youtube.com/watch?v=SY5PvZrJhLE) if you are interested in knowing more.
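As a quick illustration of that "world knowledge" (my own example, not from the paper or the video; it assumes the Hugging Face transformers package is installed), a pre-trained BERT can already fill in a masked word sensibly with no task-specific training at all:

from transformers import pipeline

# Download a pre-trained BERT and use it as a masked language model
fill = pipeline('fill-mask', model='bert-base-uncased')
for pred in fill('the price of this watch is too [MASK].'):
    print(pred['token_str'], round(pred['score'], 3))
# Words like "high" typically rank near the top, knowledge the model
# picked up purely from unlabeled pre-training text.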
For your problem specifically, I have adapted a colab (that I prepared for my UC class) for your application: https://colab.research.google.com/drive/1dKCqwNwPCsLfLHw9KkScghBJkOrU9PAs?usp=sharing In this colab, we use a pre-trained BERT provided by Google Research and fine-tune it on your labeled data. The fine-tuning process is very fast, taking about 1 minute. The colab should work out of the box for you, as Colab provides GPU support to train the model. Practically, I think you may need to hand-generate a more diverse set of training data, but I do not think you need a huge dataset.
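If you want the same idea outside Colab, here is a minimal fine-tuning sketch with the Hugging Face transformers library (my own illustration, not the colab's exact code; the model name and hyperparameters are just reasonable defaults):

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

labels = ['Buy_a_product', 'negotiation', 'success', 'ask_for_price', 'counter_offer']
label2id = {l: i for i, l in enumerate(labels)}

# A few examples from your corpus; in practice, pass in all of your training data
texts = ['i want to buy a pair of jeans', 'please give me some discount',
         'offer accepted', 'what is the price of this watch']
y = torch.tensor([label2id['Buy_a_product'], label2id['negotiation'],
                  label2id['success'], label2id['ask_for_price']])

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForSequenceClassification.from_pretrained(
    'bert-base-uncased', num_labels=len(labels))

enc = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(10):  # a handful of epochs is enough for tiny data
    optimizer.zero_grad()
    loss = model(**enc, labels=y).loss
    loss.backward()
    optimizer.step()

# Classify a new utterance
model.eval()
with torch.no_grad():
    logits = model(**tokenizer('how much does it cost', return_tensors='pt')).logits
print(labels[int(logits.argmax(dim=-1))])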
Upvotes: 2