Samyak Jain
Samyak Jain

Reputation: 300

Building a closed domain QA system using LSTM

My objective is to build a closed domain question answering system from a set of documents containing knowledge about the domain. I have gone through a bunch of research papers which implement such a system using Recurrent Neural Networks (mostly, LSTM). All these papers have training data in the Question-answer pairs format, which is not the case with my problem. Also, its not feasible to generate QA pairs from my document corpus.

Is there any research paper or any method I can follow to achieve this? (Preferably using LSTM or any neural network, because they tend to give better results. If not, any other method should also do.)

Upvotes: 3

Views: 1859

Answers (4)

Anirban Saha
Anirban Saha

Reputation: 111

You can check out the Haystack library for closed domain question answering at scale.

Upvotes: 0

Vincenzo
Vincenzo

Reputation: 6358

As mentioned by @erup this cdQA python library is the solution I'm going for. I searched and read about anything on the subject, and tried to implement whatever found. Due to my lack of python knowledge I had poor results, also because what I found before is not ready for Tensorflow 2.0 so needs to be adapted, task where I terribly failed and the library's writers didn't bother to help.. I need to make it on TF 2.0 to be able to be served by Tensorflow serving, which needs a new written set of transformers, made by huggingface .https://github.com/huggingface/transformers

cdQA is just built upon that so it's just the perfect library out of the box. Not being entirely sure about what I'm doing in python, the cdQA is making me make giant leap forward,. Their suit of tools covers everything including adapting your own dataset to the SQuAd format. Dough explanation about implementing it is, at least for me , quite unuseful, THey have 3 colab Notebooks that really explain everything.. I just fell in love with it.

Also check this article on it

https://towardsdatascience.com/how-to-create-your-own-question-answering-system-easily-with-python-2ef8abc8eb5

Give it a try

PS: I just recalled, I found a typo in the line cdqa_pipeline.fit_retriever(X=df)

X=df throws an error so the correct is cdqa_pipeline.fit_retriever(df)

Upvotes: 0

erup
erup

Reputation: 193

You can check cdQA python library which allows you to build custom Closed Domain Question Answering Systems using transfer learning: https://github.com/cdqa-suite/cdQA

Upvotes: 0

Daniel
Daniel

Reputation: 6039

I'd first give try to lucene (say with elasticsearch). It is very easy to implement, especially in closed domain. First you collect documents that explain your domain (Like a sentence in a row) and index it with elasticsearch, and query it from your favorite language. Many papers (e.g. this) show that this is a very competitive approach for fast and easy QA.

About trying neural networks. First, don't be fooled by the hype. Second, try this demo for your examples. Their already trained model might already work well for you.

Upvotes: 1

Related Questions