Predict correct answer among ten answers for a given question

Data Science Asked by torBhakt on September 3, 2021

I have a case study to solve where I am given a dataset of questions and their answers; each question has ten candidate answers.

It's a classification problem where the correct answer has class_label = 1 and the other nine answers have class_label = 0.

Which deep learning model would fit best for this type of case study and how should I proceed?

3 Answers

It's not DL but I suggest you start with the following approach: for every question $Q$ and its set of candidate answers $(A_1,..,A_{10})$, represent each pair $(Q,A_i)$ as an instance with its label 0 or 1. You could start with a few simple features such as:

  • number of words in common
  • similarity score, e.g. cosine TFIDF
  • ... other indicators of how well question $Q$ and answer $A_i$ match

Train a model on this that outputs a score (e.g. logistic regression, decision tree, SVM, ...). When the model is applied to a new question and its candidate answers, it returns a score (typically between 0 and 1) for each of the 10 answers; finally select the answer which has the highest score.
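The pairwise approach above can be sketched with scikit-learn. The toy question/answer data, the two features (word-overlap count plus TF-IDF cosine), and the choice of logistic regression are all illustrative assumptions, not part of the original answer:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

def pair_features(question, answer, vectorizer):
    """Represent one (Q, A_i) pair: word-overlap count and TF-IDF cosine."""
    q_words = set(question.lower().split())
    a_words = set(answer.lower().split())
    tfidf = vectorizer.transform([question, answer])
    cos = cosine_similarity(tfidf[0:1], tfidf[1:2])[0, 0]
    return [len(q_words & a_words), cos]

# Toy training data: one question, three candidates, label 1 for the correct one
question = "what color is the sky"
candidates = ["the sky is blue", "grass is green", "cars are fast"]
labels = [1, 0, 0]

vec = TfidfVectorizer().fit([question] + candidates)
X = [pair_features(question, a, vec) for a in candidates]
clf = LogisticRegression().fit(X, labels)

# At prediction time: score each candidate answer, then keep the argmax
scores = clf.predict_proba(X)[:, 1]
best = int(np.argmax(scores))
print(candidates[best])
```

In a real setting each question contributes ten (Q, A_i) rows, and the vectorizer would be fitted on the whole training corpus rather than a single question.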

You can certainly improve on this idea, e.g. with sentence embeddings.

Note: a simple baseline system based on the same idea would be to select the answer which has the maximum number of words in common with the question.
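That baseline fits in a few lines; the example sentences here are made up:

```python
def pick_by_overlap(question, candidates):
    """Baseline: return the candidate sharing the most words with the question."""
    q = set(question.lower().split())
    return max(candidates, key=lambda a: len(q & set(a.lower().split())))

print(pick_by_overlap("what color is the sky",
                      ["grass is green", "the sky is blue", "cars are fast"]))
# -> "the sky is blue"
```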

Correct answer by Erwan on September 3, 2021

For me, an RNN-based model is best suited to this kind of requirement. Depending on your question length, you may have to choose between the different cell types available.

For long sequences I generally use an LSTM; otherwise a simple RNN cell can also serve the purpose.
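As a concrete starting point, a minimal LSTM sentence encoder in PyTorch might look like the sketch below; the vocabulary size and dimensions are arbitrary placeholders, and the random token ids stand in for a real tokenized question:

```python
import torch
import torch.nn as nn

class LSTMEncoder(nn.Module):
    """Encode a token-id sequence into one fixed-size vector
    (the last hidden state of the LSTM)."""
    def __init__(self, vocab_size=1000, embed_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):            # (batch, seq_len)
        _, (h_n, _) = self.lstm(self.embed(token_ids))
        return h_n[-1]                       # (batch, hidden_dim)

enc = LSTMEncoder()
vec = enc(torch.randint(0, 1000, (1, 12)))   # one "sentence" of 12 tokens
print(vec.shape)
```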

If you try, please do share your results.

Answered by vipin bansal on September 3, 2021

You would start with word embeddings in any case, e.g. pre-trained vectors like GloVe. Those are derived from DL algorithms, so whichever approach you now take, you already have an element of DL in your solution.

Let us now look at the possible approaches. One definite option is to use RNNs such as bi-directional GRU/LSTM, which will embed the sentence nicely into one final vector.

You could then bring 'Attention' into the scenario. 'Attention' helps your model focus on specific words when predicting the output (in this case, selecting one of the 10 answers).

You would also need to run an LSTM over each answer, to 'sentence-embed' each of the 10 answers.

So now you have a vector for the question, you have 10 answer vectors and you have attention logic.

How should the final layer actually 'choose' the answer? This is somewhat different from the typical SQuAD type of scenario. Here one does not have to extract the answer from a given context, but rather find out which answer is most applicable to the question. We could have a Dense layer that transforms the question embedding in such a way that the dot product of this transformation with each of the answer embeddings indicates how well that answer fits. A softmax over the 10 dot products could then be taken to determine the most probable answer.
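A rough PyTorch sketch of that final scoring layer, using random tensors as stand-ins for the encoder outputs (the dimension `d` and the untrained `nn.Linear` are assumptions for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 64                               # embedding size (assumed)

# Stand-ins for the encoder outputs described above:
q_vec = torch.randn(d)               # question embedding
a_vecs = torch.randn(10, d)          # one embedding per candidate answer

transform = nn.Linear(d, d)          # Dense layer applied to the question
scores = a_vecs @ transform(q_vec)   # dot product with each answer embedding
probs = torch.softmax(scores, dim=0)

best = int(torch.argmax(probs))      # index of the most probable answer
print(best, float(probs.sum()))
```

In training, the softmax output would be fed to a cross-entropy loss against the index of the correct answer, so the Dense layer learns a transformation that pulls the question towards the right answer's embedding.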

This scenario differs from what we normally envisage when thinking of a QnA system, so I am sure the solution can be further improved. You can pick up some of the ideas from https://www.hindawi.com/journals/wcmc/2018/2678976/.

One can even try unsupervised methods, e.g. computing the distance between the question embedding and each of the 10 answer embeddings and picking the nearest. While this may work for simple scenarios (where the answers are unrelated to each other), it may not work for complex ones.
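A minimal version of that unsupervised pick, using cosine similarity and a hypothetical 2-d embedding purely for illustration:

```python
import numpy as np

def nearest_answer(q_vec, answer_vecs):
    """Unsupervised pick: index of the answer whose embedding is
    closest (by cosine similarity) to the question's embedding."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return int(np.argmax([cos(q_vec, a) for a in answer_vecs]))

q = np.array([1.0, 0.0])
answers = np.array([[0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
print(nearest_answer(q, answers))  # -> 1
```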

Answered by Allohvk on September 3, 2021
