Do I need to use learning algorithms for this type of data?

Question

I'm studying the behavior of machine failures in a production scenario. For this, I generated random data to form my imbalanced training set, consisting of categorical data, which indicate whether or not there was a failure in each subperiod. The failures were generated according to a exponential distribution. I have 100 features (Period_1 to Period_100), each containing information for 112 subperiods. My intention is to predict the behavior of the failures for the next period. However, I have two questions:

Is a learning algorithm really necessary to predict the next period failures or could some simpler statistical method be used?
The 100 features represent the same type of information, but for different periods. Would I have a problem with that?

Any help will be appreciated.
A sample of the data:
    Period_1 Period_2 Period_3 Period_4 Period_5 Period_6 Period_7 Period_8
1     Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
2     Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
3     Normal   Normal  Failure   Normal   Normal   Normal   Normal   Normal
4     Normal  Failure   Normal   Normal   Normal   Normal   Normal  Failure
5     Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
6     Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
7     Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
8     Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
9     Normal   Normal   Normal   Normal   Normal   Normal   Normal   Normal
10    Normal  Failure   Normal   Normal   Normal   Normal   Normal   Normal

In this moment, I'm using several classification methods to predict the failures: gradient boosting algorithm, random forest, Classification and Regression Trees, neural networks, Bagged CART, SVM, C5.0, eXtreme Gradient Boosting, and k-Nearest Neighbors. For all this, I use strategy to deal with the imbalance.

Do I need to use learning algorithms for this type of data?

Add your own answers!

Ask a Question