Determining a correct ML approach

Question

I have little idea how to choose an ML approach for the following problem. It is a binary classification problem with two classes, positive and negative. There are about 100k samples, and each sample is structured like this:
Period = 1min   Pattern = M1>S1>T1>B2>M2>S2>S3>T3>M3>B3
Period = 5min   Pattern = S1>M1>T1>B2>S2>M2>S3>T3>M3>B3
Period = 10min  Pattern = M1>T1>S1>M2>B2>S2>S3>T3>M3>B3
Period = 15min  Pattern = M1>T1>S1>B2>S3>M2>S2>T3>M3>B3
Period = 20min  Pattern = S1>M1>S3>T1>B2>M2>S2>T3>M3>B3
Period = 30min  Pattern = S1>S3>B2>M1>T1>S2>M2>T3>M3>B3
Period = 60min  Pattern = S1>B2>M1>T1>S2>M2>S3>T3>B3>M3
Period = 120min Pattern = S1>M1>T1>B2>S2>M2>T3>S3>M3>B3

This sample is classified as negative. A sample is composed of 8 periods. Within each period there is a pattern such as M1>S1>T1>B2>M2>S2>S3>T3>M3>B3. Each pattern has 10 elements, and their positions change across samples and periods. We need a solution that can tell which periods or orderings of elements are responsible for the classification.
Let's say we have positive examples p1, p2, p3 and negative examples n1, n2, n3 for the 1min period, like this:
p1: M1>S1>T1>B2>M2>S2>S3>T3>M3>B3
p2: M1>S1>T1>B2>S2>M2>S3>T3>M3>B3
p3: M1>S1>T1>B2>M2>S2>S3>T3>M3>B3

n1: M1>S1>T1>B2>S2>M2>S3>T3>B3>M3
n2: M1>S1>T1>B2>M2>S2>S3>T3>B3>M3
n3: M1>S1>T1>B2>M2>S2>S3>T3>B3>M3

It could be inferred that the first 4 elements (M1, S1, T1, B2) are irrelevant for classification, since they are the same across all samples. The 5th and 6th elements (M2, S2) are also irrelevant, since their order is not consistent within either class. However, the relative order of B3 and M3 is a solid signal: M3>B3 in all positive samples and B3>M3 in all negative samples.
Thanks.

David Masip · Answer

I think all you need is to build proper features.
For each period and element, I would build a categorical feature. That gives 8 × 10 = 80 categorical features. It looks like each feature takes only a few distinct values; say 3 or 4 per feature, so one-hot encoding would leave you with 240-320 features.
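As a minimal sketch of this step, assume each feature is the position of a given element within a given period's pattern (the answer does not spell this out, so that reading is an assumption). The helper name `sample_to_features` and the use of pandas are illustrative choices, and the data is copied from the question.

```python
import pandas as pd

def sample_to_features(sample):
    """Map {period: pattern} to {"<period>_<element>": position in pattern}."""
    features = {}
    for period, pattern in sample.items():
        for position, element in enumerate(pattern.split(">"), start=1):
            features[f"{period}_{element}"] = position
    return features

# The full negative sample from the question: 8 periods x 10 elements.
full_sample = {
    "1min": "M1>S1>T1>B2>M2>S2>S3>T3>M3>B3",
    "5min": "S1>M1>T1>B2>S2>M2>S3>T3>M3>B3",
    "10min": "M1>T1>S1>M2>B2>S2>S3>T3>M3>B3",
    "15min": "M1>T1>S1>B2>S3>M2>S2>T3>M3>B3",
    "20min": "S1>M1>S3>T1>B2>M2>S2>T3>M3>B3",
    "30min": "S1>S3>B2>M1>T1>S2>M2>T3>M3>B3",
    "60min": "S1>B2>M1>T1>S2>M2>S3>T3>B3>M3",
    "120min": "S1>M1>T1>B2>S2>M2>T3>S3>M3>B3",
}
full_features = sample_to_features(full_sample)  # one entry per (period, element)

# The six 1min examples from the question, used to illustrate one-hot encoding.
one_min_patterns = {
    "p1": "M1>S1>T1>B2>M2>S2>S3>T3>M3>B3",
    "p2": "M1>S1>T1>B2>S2>M2>S3>T3>M3>B3",
    "p3": "M1>S1>T1>B2>M2>S2>S3>T3>M3>B3",
    "n1": "M1>S1>T1>B2>S2>M2>S3>T3>B3>M3",
    "n2": "M1>S1>T1>B2>M2>S2>S3>T3>B3>M3",
    "n3": "M1>S1>T1>B2>M2>S2>S3>T3>B3>M3",
}
rows = {name: sample_to_features({"1min": p}) for name, p in one_min_patterns.items()}
positions = pd.DataFrame.from_dict(rows, orient="index")

# Treat every position as a category: one indicator column per observed value.
one_hot = pd.get_dummies(positions, columns=list(positions.columns))
print(one_hot.columns.tolist())
```

Only positions that actually occur in the data get a column, which is why the one-hot width stays far below 10 values per feature.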
Then you can apply some kind of feature selection, such as Lasso, and train your model on the selected features.
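A rough sketch of the selection step follows, using scikit-learn's `Lasso` fitted directly on 0/1 labels. Treating the labels as a regression target, the `alpha` value, the `1e-6` threshold, and the position-based feature encoding are all assumptions made for illustration; an L1-penalized classifier would be a natural alternative. The data is the six 1min examples from the question.

```python
import pandas as pd
from sklearn.linear_model import Lasso

# The six 1min examples from the question, with 1 = positive and 0 = negative.
patterns = {
    "p1": "M1>S1>T1>B2>M2>S2>S3>T3>M3>B3",
    "p2": "M1>S1>T1>B2>S2>M2>S3>T3>M3>B3",
    "p3": "M1>S1>T1>B2>M2>S2>S3>T3>M3>B3",
    "n1": "M1>S1>T1>B2>S2>M2>S3>T3>B3>M3",
    "n2": "M1>S1>T1>B2>M2>S2>S3>T3>B3>M3",
    "n3": "M1>S1>T1>B2>M2>S2>S3>T3>B3>M3",
}
labels = pd.Series({"p1": 1, "p2": 1, "p3": 1, "n1": 0, "n2": 0, "n3": 0})

# Assumed encoding: one categorical feature per element, valued by its position.
rows = {
    name: {f"1min_{el}": pos for pos, el in enumerate(p.split(">"), start=1)}
    for name, p in patterns.items()
}
positions = pd.DataFrame.from_dict(rows, orient="index")
one_hot = pd.get_dummies(positions, columns=list(positions.columns))

# The L1 penalty pushes coefficients of uninformative features to zero.
lasso = Lasso(alpha=0.05)  # alpha is an arbitrary illustrative value
lasso.fit(
    one_hot.to_numpy(dtype=float),
    labels.loc[one_hot.index].to_numpy(dtype=float),
)

# Keep features whose coefficient is meaningfully non-zero.
selected = [
    col for col, coef in zip(one_hot.columns, lasso.coef_) if abs(coef) > 1e-6
]
print(selected)
```

On these toy rows the surviving features should involve only M3 or B3, in line with the observation in the question. Since the one-hot columns for M3 and B3 carry the same information, which of them is kept is arbitrary. On the real 100k samples, `alpha` would need tuning, for example by cross-validation.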
