Reputation: 317
I am trying to build a chatbot for the Sinhala language using Rasa NLU.
My config.yml:
pipeline:
- name: "WhitespaceTokenizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
In data.json I have added the sample data below. When I train the NLU model and test it with an input from which "සිංහලෙන්" should be extracted as medium, the output contains only the intent and the entity value, not the entity name (it comes back as '-'). What am I doing wrong?
{
  "text": "සිංහලෙන් දේශන පවත්වන්නේ නැද්ද?",
  "intent": "ask_medium",
  "entities": [{
    "start": 0,
    "end": 8,
    "value": "සිංහලෙන්",
    "entity": "medium"
  }]
},
{
  "text": "සිංහලෙන් lectures කරන්නේ නැද්ද?",
  "intent": "ask_medium",
  "entities": [{
    "start": 0,
    "end": 8,
    "value": "සිංහලෙන්",
    "entity": "medium"
  }]
}
The response I get when testing the NLU model is:
{'intent': {'name': 'ask_langmedium', 'confidence': 0.9747527837753296},
 'entities': [{'start': 10,
               'end': 18,
               'value': 'සිංහලෙන්',
               'entity': '-',
               'confidence': 0.5970129041418675,
               'extractor': 'CRFEntityExtractor'}],
 'intent_ranking': [
     {'name': 'ask_langmedium', 'confidence': 0.9747527837753296},
     {'name': 'ask_langmedium_request_possibility', 'confidence': 0.07433460652828217}],
 'text': 'උගන්නන්නේ සිංහලෙන් ද ?'}
Upvotes: 0
Views: 156
Reputation: 91
If this is your complete dataset, I am not sure how you were able to train a model at all, because Rasa requires at least two intents. I added a second intent, hello, replicated the rest of your data in my own code, and it worked. This is the output I got:
Enter a message: උගන්නන්නේ සිංහලෙන් ද?
{
  "intent": {
    "name": "ask_medium",
    "confidence": 0.9638749361038208
  },
  "entities": [
    {
      "start": 10,
      "end": 18,
      "value": "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd\u0dd9\u0db1\u0dca",
      "entity": "medium",
      "confidence": 0.7177257810884379,
      "extractor": "CRFEntityExtractor"
    }
  ]
}
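A note on the output above: the value shows up as \u escapes only because json.dumps escapes non-ASCII characters by default. It is the same string as සිංහලෙන්. A minimal sketch, assuming you would rather see the Sinhala text directly (the variable names here are just for illustration):

```python
import json

# The entity value exactly as it appears (escaped) in the output above.
value = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd\u0dd9\u0db1\u0dca"

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps({"value": value})

# ensure_ascii=False keeps the original characters in the output string.
readable = json.dumps({"value": value}, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same data, so this only changes how the result is displayed.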
This is my full code:
DataSet.json
{
  "rasa_nlu_data": {
    "common_examples": [
      {
        "text": "හෙලෝ",
        "intent": "hello",
        "entities": []
      },
      {
        "text": "සිංහලෙන් දේශන පවත්වන්නේ නැද්ද?",
        "intent": "ask_medium",
        "entities": [{
          "start": 0,
          "end": 8,
          "value": "සිංහලෙන්",
          "entity": "medium"
        }]
      },
      {
        "text": "සිංහලෙන් lectures කරන්නේ නැද්ද?",
        "intent": "ask_medium",
        "entities": [{
          "start": 0,
          "end": 8,
          "value": "සිංහලෙන්",
          "entity": "medium"
        }]
      }
    ],
    "regex_features": [],
    "lookup_tables": [],
    "entity_synonyms": []
  }
}
nlu_config.yml
pipeline: "supervised_embeddings"
Training command:
python -m rasa_nlu.train -c ./config/nlu_config.yml --data ./data/sh_data.json -o models --fixed_model_name nlu --project current --verbose
And testing.py:
from rasa_nlu.model import Interpreter
import json

# Load the model written by the training command above
interpreter = Interpreter.load('./models/current/nlu')


def predict_intent(text):
    # Parse the message and print only the intent and the entities
    results = interpreter.parse(text)
    print(json.dumps({
        "intent": results["intent"],
        "entities": results["entities"]
    }, indent=2))


# Keep prompting until the user types 'exit'
while True:
    text = input('Enter a message: ')
    if text == 'exit':
        break
    predict_intent(text)
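If you want to sanity-check the training data before training, something along these lines might help. It is only a sketch: check_examples is a hypothetical helper of mine, not part of Rasa, and it assumes the start/end offsets are plain character indexes into the text. It counts the distinct intents and reports any entity whose annotated span does not match its value. (A mismatch is not always an error, for example when the value is deliberately a synonym of the span.)

```python
# Inline copy of the examples from DataSet.json above; in practice the file
# could be read with json.load instead.
DATA = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": "හෙලෝ", "intent": "hello", "entities": []},
            {
                "text": "සිංහලෙන් දේශන පවත්වන්නේ නැද්ද?",
                "intent": "ask_medium",
                "entities": [
                    {"start": 0, "end": 8, "value": "සිංහලෙන්", "entity": "medium"}
                ],
            },
            {
                "text": "සිංහලෙන් lectures කරන්නේ නැද්ද?",
                "intent": "ask_medium",
                "entities": [
                    {"start": 0, "end": 8, "value": "සිංහලෙන්", "entity": "medium"}
                ],
            },
        ]
    }
}


def check_examples(data):
    """Hypothetical helper: return (set of intents, list of span mismatches)."""
    examples = data["rasa_nlu_data"]["common_examples"]
    intents = {ex["intent"] for ex in examples}
    mismatches = []
    for ex in examples:
        for ent in ex.get("entities", []):
            # Slice the text with the annotated character offsets.
            span = ex["text"][ent["start"]:ent["end"]]
            if span != ent["value"]:
                mismatches.append((ex["text"], ent["entity"], span, ent["value"]))
    return intents, mismatches


intents, mismatches = check_examples(DATA)
print("intents:", sorted(intents))
print("number of span mismatches:", len(mismatches))
```

With the dataset above this finds two intents and no mismatched spans, which is consistent with the model training and extracting medium correctly.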
Upvotes: 1