Training Watson Discovery with the (V1) Python SDK APIs does not work

Question

I want to use the Watson Discovery V1 APIs for relevancy training. I tried the following but have yet to get the desired result. The problem is described in detail below:

I have a set of documents, some of which contain the word 'cloud' or 'big data'. I want to search for the word 'hadoop' with the query() API and get those documents back, but the Discovery query returns nothing.

Now I want to provide the following training examples to Discovery to update the relevance scores so that I get those results back. (I used query expansion for the same task and it worked; now I am interested in relevancy training.)

I used the add_training_data() API to associate the query 'hadoop' with the relevant documents (specified by ID, e.g. the documents that contain 'cloud').
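For reference, the call described above can be sketched roughly as follows. This is a minimal sketch, not the exact code I ran: `discovery` is assumed to be an already authenticated `ibm_watson.DiscoveryV1` client, and the helper names `build_examples` and `add_hadoop_training` are made up for illustration. Depending on the SDK version, `examples` may need to be `TrainingExample` model objects instead of plain dicts.

```python
from typing import Any, Dict, List


def build_examples(document_ids: List[str], relevance: int = 1) -> List[Dict[str, Any]]:
    """Build one training example per document id, all with the same relevance label."""
    return [{"document_id": doc_id, "relevance": relevance} for doc_id in document_ids]


def add_hadoop_training(discovery: Any, environment_id: str, collection_id: str,
                        document_ids: List[str]) -> Any:
    """Associate the query 'hadoop' with the given documents.

    `discovery` is assumed to be an ibm_watson.DiscoveryV1 instance (assumption);
    any object exposing add_training_data() with the same arguments works here.
    """
    return discovery.add_training_data(
        environment_id,
        collection_id,
        natural_language_query="hadoop",
        examples=build_examples(document_ids),
    )
```

Passing the client in as a parameter keeps the sketch independent of how authentication is set up.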

Now the training data looks like the following:

{
  "natural_language_query": "hadoop",
  "filter": "",
  "examples": [
    {
      "document_id": "1ad6f551-e092-4ce9-b08c-eb4f4cbc9458",
      "cross_reference": "",
      "relevance": 1,
      "created": "2020-01-30T23:16:19.674Z",
      "updated": "2020-01-30T23:16:19.716Z"
    },
    {
      "document_id": "f1d11f51-31b2-414f-b359-d5336b019575",
      "cross_reference": "",
      "relevance": 1,
      "created": "2020-01-30T23:16:19.674Z",
      "updated": "2020-01-30T23:16:19.722Z"
    },
    {
      "document_id": "5bfcea6a-c925-4db5-a490-89a9d1de8d4c",
      "cross_reference": "",
      "relevance": 1,
      "created": "2020-01-30T23:16:19.674Z",
      "updated": "2020-01-30T23:16:19.729Z"
    },
    {
      "document_id": "bf07e701-6893-428c-ab16-c5446e821291",
      "cross_reference": "",
      "relevance": 1,
      "created": "2020-01-30T23:16:19.674Z",
      "updated": "2020-01-30T23:16:19.735Z"
    },
    {
      "document_id": "75082812-5c96-4d2e-b388-821a0434ad4c",
      "cross_reference": "",
      "relevance": 1,
      "created": "2020-01-30T23:16:19.674Z",
      "updated": "2020-01-30T23:16:19.742Z"
    }
  ],
  "query_id": "cc1d3677eeafe70929aeccfb462860439f61b051",
  "created": "2020-01-30T23:16:19.677Z",
  "updated": "2020-01-30T23:16:19.677Z"
}

where the document IDs correspond to documents in the collection, e.g. the ones that contain the word 'cloud'.
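All five examples in the training data above carry the same relevance label (1). The v1 training status reports a `sufficient_label_diversity` flag, so it may be worth checking locally which labels a training query uses. The sketch below only counts labels in the JSON shown above; it does not reproduce whatever criterion the service applies, and the function names are made up for illustration.

```python
from collections import Counter
from typing import Any, Dict


def relevance_label_counts(training_query: Dict[str, Any]) -> Counter:
    """Count how often each relevance label appears in one training query."""
    return Counter(example["relevance"] for example in training_query.get("examples", []))


def uses_single_label(training_query: Dict[str, Any]) -> bool:
    """True when every example in the training query has the same relevance label."""
    return len(relevance_label_counts(training_query)) == 1
```

For the training data above, `uses_single_label` returns True, since every example has relevance 1.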

With the training data created, I ran the earlier query again with the query text 'hadoop', assuming that Discovery would automatically train itself and return the relevant results (I could not find an API like 'train()', which I was expecting). But even after I provided the training examples, the Discovery query still returns nothing.
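One way to narrow this down might be to read the collection's `training_status` and to rerun the query as a natural language query. The sketch below assumes `discovery` is an `ibm_watson.DiscoveryV1` client whose `get_collection()` and `query()` return a response object with `get_result()`; the wrapper names `get_training_status` and `run_nl_query` are hypothetical. It only reads state and does not trigger training.

```python
from typing import Any, Dict, List


def get_training_status(discovery: Any, environment_id: str, collection_id: str) -> Dict[str, Any]:
    """Return the training_status object from the collection details, or {} if absent."""
    collection = discovery.get_collection(environment_id, collection_id).get_result()
    return collection.get("training_status", {})


def run_nl_query(discovery: Any, environment_id: str, collection_id: str,
                 text: str) -> List[Dict[str, Any]]:
    """Run `text` as a natural language query and return the list of results."""
    response = discovery.query(
        environment_id,
        collection_id,
        natural_language_query=text,
    )
    return response.get_result().get("results", [])
```

If the status shows the training data as not yet available or not successfully trained, an empty result for 'hadoop' would at least be consistent with training not having taken effect.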

I don't have any clue what's going wrong. Some help would be really appreciated.
