Reputation: 207
I am trying to build a simple chatbot application using Rasa, but my bot returns a confidence of 0 if there is an underscore in the word.
Below is my config.yml configuration:
language: en
pipeline: supervised_embeddings
policies:
- name: KerasPolicy
#- name: MappingPolicy
#- name: MemoizationPolicy
#- name: FallbackPolicy
nlu.md configuration:
## intent:name
- name
- nmae
- nme
- what is my name?
## intent: firstname
- firstName
- FName
- first name
## intent: gender
- gender
- sex
- gnder
- gendr
- sx
## intent: lastname
- lastName
- lname
- surname
- lstnme
- lstname
## intent: username
- userName
- uname
- usrnme
- usernme
- userid
If I pass firstname, I get the correct intent and confidence, but if I try _firstname or first_name, I get the result below:
first_name
{
"intent": {
"name": null,
"confidence": 0.0
},
"entities": [],
"intent_ranking": [],
"text": "first_name"
}
Upvotes: 0
Views: 220
Reputation: 1274
You're getting a confidence of 0 precisely because of the underscore. The word first_name never appears in your training data, so it is foreign to your model, and the model has nothing to base a prediction on. (By default, the pipeline uses a whitespace tokenizer, so text is split only on whitespace and first_name stays a single token.)
To fix this, either avoid underscores in the input, or modify the whitespace tokenizer so that it splits on both whitespace and underscores.
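To make the difference concrete, here is a minimal standalone sketch in plain Python. It is not Rasa's tokenizer class and does not use any Rasa API; the function names are hypothetical, and the lowercasing step is an assumption about how the featurizer normalizes text. It only illustrates which tokens of a message would be known from the firstname training examples under each splitting rule.

```python
import re

# Training examples for the firstname intent, copied from the question's nlu.md.
TRAINING_EXAMPLES = ["firstName", "FName", "first name"]


def whitespace_tokenize(text):
    """Lowercase the text and split on whitespace only."""
    return text.lower().split()


def whitespace_underscore_tokenize(text):
    """Lowercase the text and split on runs of whitespace and/or underscores."""
    return [tok for tok in re.split(r"[\s_]+", text.lower()) if tok]


def known_tokens(text, tokenize):
    """Return the tokens of `text` that also occur in the training examples."""
    vocabulary = {tok for example in TRAINING_EXAMPLES for tok in tokenize(example)}
    return [tok for tok in tokenize(text) if tok in vocabulary]


for message in ["firstname", "first_name", "_firstname"]:
    print(
        message,
        known_tokens(message, whitespace_tokenize),
        known_tokens(message, whitespace_underscore_tokenize),
    )
```

With whitespace-only splitting, first_name and _firstname yield no known tokens, which matches the empty prediction in the question. Splitting on underscores as well turns them into first, name and firstname, all of which appear in the training data. If you would rather not touch the tokenizer, applying the same normalization to the message before it reaches the model should have a similar effect.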
Hope that helps.
Upvotes: 1