
Which machine learning problem is this?

Data Science Asked on March 1, 2021

I am not able to figure out what kind of machine learning problem this is:

Training set: consists of sentences in which object phrases are annotated with object labels.
Example:

"This is a black chair. It is next to a large bed."

The phrases 'This', 'black chair', and 'it' are annotated with the label 'chair', and the phrase 'large bed' is annotated with the label 'bed'. There are 18 available labels that can be assigned.

For an unannotated sentence, I want to predict the label for each phrase in the sentence.

For example:

There is a study table in the corner of the room, behind it is a small chair.

I would like the model to predict a label (from the 18 available labels) for each phrase that represents an object in the above sentence.

Expected output:

'study table', 'it' -----> label 'table'

'small chair' -----------> label 'chair'

Is this a classification, regression, or a different kind of problem?

3 Answers

Technically this is sequence labeling, the most common application being Named Entity Recognition.

However, it looks like in this case you're also trying to solve coreference resolution (for example, resolving 'it' to the chair it refers to), which is quite a difficult task in general. I think this usually involves a more complex model than simple sequence labeling, but I'm not an expert in this. You might want to search around this topic; there are certainly some relevant papers and tools about it.
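To make the sequence-labeling framing concrete, here is a minimal sketch of how the training example could be converted into per-token BIO tags; the tokenization and tag names are assumptions about the annotation scheme, not something fixed by the question.

# A minimal sketch of the example framed as sequence labeling with BIO tags.
# The token/tag alignment is an assumption about how the annotated spans
# map onto tokens; exact boundaries depend on the annotation tool.
tokens = ["This", "is", "a", "black", "chair", ".",
          "It", "is", "next", "to", "a", "large", "bed", "."]

# "This", "black chair", "It" -> chair; "large bed" -> bed; everything else -> O
tags = ["B-chair", "O", "O", "B-chair", "I-chair", "O",
        "B-chair", "O", "O", "O", "O", "B-bed", "I-bed", "O"]

# A sequence labeler (CRF, BiLSTM-CRF, or a fine-tuned transformer) is trained
# to predict one tag per token; contiguous B-/I- tags are merged back into
# labeled phrases at prediction time.
for token, tag in zip(tokens, tags):
    print(f"{token:10s} {tag}")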

Answered by Erwan on March 1, 2021

There are several different ways to frame the problem.

One way is multiclass classification. The goal would be to assign a single discrete label to every phrase. In order to get phrases, you'll have to build a parse tree first. You did not list all of the labels, but let's assume they are all nouns. Then you'll need a part-of-speech (POS) tagger to find all noun phrases. Each noun phrase can then be predicted to belong to a class.
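As a rough sketch of this framing, assuming spaCy (with the en_core_web_sm model installed) for noun-phrase extraction and scikit-learn for the classifier; the training phrases and labels below are made-up placeholders, not real data.

# Noun-phrase extraction + multiclass classification, as a toy baseline.
# Assumes: pip install spacy scikit-learn && python -m spacy download en_core_web_sm
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

nlp = spacy.load("en_core_web_sm")

# Toy (phrase, label) pairs that would come from the annotated sentences.
train_phrases = ["black chair", "small chair", "large bed", "study table"]
train_labels = ["chair", "chair", "bed", "table"]

# Character n-gram TF-IDF + logistic regression as a simple multiclass baseline.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_phrases, train_labels)

# At prediction time, extract noun phrases with the parser and classify each one.
doc = nlp("There is a study table in the corner of the room, "
          "behind it is a small chair.")
for chunk in doc.noun_chunks:
    print(chunk.text, "->", clf.predict([chunk.text])[0])

Note that this classifies every noun phrase, including ones like 'the corner' and pronouns like 'it'; resolving pronouns to the right object would still need coreference resolution, as the previous answer points out.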

Answered by Brian Spiering on March 1, 2021

@Erwan probably gave you a better idea, but for a simpler version of the problem, where each token has a separate class, this can be viewed as part-of-speech tagging, which is essentially a per-token multiclass problem. This assumes each token can be mapped to only one correct class. In that case the LSTM's output would have size, e.g., (1, seq_length, hidden_dim), which you reshape to (seq_length, hidden_dim), so that seq_length becomes the batch dimension. You feed this output into a linear layer to get a batch of seq_length outputs, and you can then apply a negative log-likelihood loss to this batch. For each token the target is one of the label classes (a one-hot vector over the labels, since this is a multiclass problem).
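A minimal PyTorch sketch of this setup; the shapes, vocabulary size, and random stand-in data are illustrative assumptions, and the extra 19th class (for tokens that belong to no object) is also an assumption.

# Per-token multiclass classification with an LSTM and NLLLoss.
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 64, 128
num_labels = 19          # assumption: 18 object labels plus an "other" tag
seq_length = 14

embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
linear = nn.Linear(hidden_dim, num_labels)
loss_fn = nn.NLLLoss()   # expects log-probabilities, hence log_softmax below

token_ids = torch.randint(0, vocab_size, (1, seq_length))   # one sentence
gold_tags = torch.randint(0, num_labels, (seq_length,))     # one class index per token

out, _ = lstm(embedding(token_ids))          # (1, seq_length, hidden_dim)
out = out.reshape(seq_length, hidden_dim)    # seq_length acts as the batch dimension
log_probs = torch.log_softmax(linear(out), dim=-1)   # (seq_length, num_labels)

loss = loss_fn(log_probs, gold_tags)
print(loss.item())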

Answered by Alex on March 1, 2021
