How to apply supervised machine learning when the target label depends on multiple input rows?

Question

The problem is a multi-label classification problem.

Now, I know how to train and classify using single row with several attributes. For example, if the dataset looks like the first table from the attached file. Here, each row is associated with a single label. Thus, I can train and test after separating the dataset into training and testing sets.
But the problem occurs when classification label / target label depends on multiple rows such as the second table from the attached file. Consecutive N rows makes one category.
Can you please guide me towards a solution?

Is it possible to fit this problem in any existing tool? For example, WEKA or Neural network using Keras.
Or do I have to change the algorithm in order to fit the problem! Is there any existing solution?
Or do I need to modify the rows in such a way that it transforms into one?

kabla002 · Answer

The second table is simply saying rows 1 - 4 are 4 different examples of class A, rows 5 - 8 are 4 separate example of class B and the rest are 4 examples of class C. Just modify the table so the target label column has 12 rows the first for having the value A, the next 4 having the value B and the final 4 having the value C.

Good luck!

Arnaud · Answer

What about doing a concatenation of your rows (i.e. Attr1 -> Attr12) , such that you now have 3*4 features (because 4 rows of 3 features) as an input to a multiclass classification model?

For instance, first sample would be described by :

X = [1.1, 1.4, 2.5, 2.3, 2.5, 2.7, 1.1, 1.6, 1.9, 1.5, 1.6, 1.7]
y = "A"

Otherwise, there is no issue in giving 2D or 3D inputs towards a classifier. Take for example convolutional neural networks that do operations on images!

Ben Reiniger · Answer

Since you seem to have the same number of rows per sample, perhaps the underlying process is such that it makes sense to treat the data as 2D or unpack into 12 features, as @Arnaud describes. (This seems to depend on the four rows being ordered according to some implicit rule?)

More generally though, this is called "multiple instance learning." Probably start with the wikipedia page, sections Assumptions and Algorithms.

More generally though, this is called "multiple instance learning."  Probably start with the wikipedia page, sections Assumptions and Algorithms.

How to apply supervised machine learning when the target label depends on multiple input rows?

3 Answers

Add your own answers!

Ask a Question