How to remove LUIS entity marker from utterance

Question

I am using LUIS to determine which state a customer lives in. I have set up a list entity called "state" that contains the 50 states, with their two-letter abbreviations as synonyms, as described in the documentation. LUIS is returning certain two-letter words, such as "hi" or "in", as state entities.

I have set up an intent with phrases such as "My state is Oregon", "I am from WA", etc. When an utterance in that intent contains the word "in", for example "I live in Kentucky", LUIS automatically marks "in" as a state entity, and I am unable to remove that marker.

Below is a snippet of the LUIS JSON response to the utterance "I live in Kentucky". As you can see, the response includes both Indiana and Kentucky as entities when there should only be Kentucky.

 "query": "I live in Kentucky",
  "topScoringIntent": {
    "intent": "STATE_INQUIRY",
    "score": 0.9338141
  },
....
    "entities": [
....
    {
      "entity": "in",
      "type": "state",
      "startIndex": 7,
      "endIndex": 8,
      "resolution": {
        "values": [
          "indiana"
        ]
      }
    },
    {
      "entity": "kentucky",
      "type": "state",
      "startIndex": 10,
      "endIndex": 17,
      "resolution": {
        "values": [
          "kentucky"
        ]
      }
    }
  ], ....

How do I train LUIS not to mark the words "in" and "hi" in this context as states if I can't remove the entity marker from the utterance?

hengist · Accepted Answer

@StevenKanberg's answer was very helpful but unfortunately not complete for my situation. I tried implementing geographyV2 and Places.AbsoluteLocation (separately). Neither one works entirely the way I need it to, which is recognizing states and their two-letter abbreviations in a way that can be queried from the entities in the response.

So my choices are:

1. Create my own list of states, using the state name and the two-letter abbreviation as synonyms, as described in the list description itself. This works except for two-letter abbreviations that are also words, such as "in", "hi" and "me".
2. Use the geographyV2 prebuilt entity, which does not allow synonyms and does not recognize two-letter abbreviations at all, or
3. Use Places.AbsoluteLocation, which does recognize two-letter abbreviations for states and does not confuse them with words, but also grabs all locations, including cities, countries and addresses, and does not differentiate between them. That leaves me no way of parsing which entity is the state in an utterance like "I live in Lake Stevens, Snohomish County, WA".

Solution: If I combine 1 with 3, I can query for entities that have both of those types. If LUIS marks the word "in" as a state (Indiana), I can then check to see if that word has also been flagged as an AbsoluteLocation. If it has not, then I can safely discard that entity. It's not ideal but is a workaround that solves the problem.
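
For illustration, the check described above might look roughly like the sketch below. It assumes the response shape from the question (entities carrying "type", "startIndex", "endIndex" and "resolution"), and it assumes the prebuilt location entity comes back with the type string "Places.AbsoluteLocation". The function name `confirmed_states` and the sample location entry are hypothetical, not taken from a real response. A state counts as confirmed when its span falls inside a location span, which is a guess at how overlapping matches are reported; an exact index comparison may be enough in practice.

```python
from typing import Any, Dict, List

STATE_TYPE = "state"                      # the custom list entity from the question
LOCATION_TYPE = "Places.AbsoluteLocation"  # assumed type string of the prebuilt entity


def confirmed_states(entities: List[Dict[str, Any]]) -> List[str]:
    """Return resolved state names whose span is also covered by a location entity."""
    location_spans = [
        (e["startIndex"], e["endIndex"])
        for e in entities
        if e.get("type") == LOCATION_TYPE
    ]
    states: List[str] = []
    for e in entities:
        if e.get("type") != STATE_TYPE:
            continue
        start, end = e["startIndex"], e["endIndex"]
        # Keep the state only if some location entity covers the same text.
        if any(ls <= start and end <= le for ls, le in location_spans):
            states.extend(e.get("resolution", {}).get("values", []))
    return states


# Entities for "I live in Kentucky". The two "state" entries come from the
# question; the Places.AbsoluteLocation entry is a hypothetical example.
sample_entities = [
    {"entity": "in", "type": "state", "startIndex": 7, "endIndex": 8,
     "resolution": {"values": ["indiana"]}},
    {"entity": "kentucky", "type": "state", "startIndex": 10, "endIndex": 17,
     "resolution": {"values": ["kentucky"]}},
    {"entity": "kentucky", "type": "Places.AbsoluteLocation",
     "startIndex": 10, "endIndex": 17},
]

print(confirmed_states(sample_entities))
```

With the sample data, "in" is dropped because no location entity covers indices 7 to 8, while "kentucky" is kept.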
