Data Science Asked on September 29, 2021
I’m studying the behavior of machine failures in a production scenario. To do this, I generated random data to form an imbalanced training set of categorical values, each indicating whether or not a failure occurred in a given subperiod. The failures were generated according to an exponential distribution. I have 100 features (Period_1 to Period_100), each containing information for 112 subperiods. My intention is to predict the failure behavior for the next period. However, I have two questions:
Any help will be appreciated.
A sample of the data:
    Period_1  Period_2  Period_3  Period_4  Period_5  Period_6  Period_7  Period_8
1   Normal    Normal    Normal    Normal    Normal    Normal    Normal    Normal
2   Normal    Normal    Normal    Normal    Normal    Normal    Normal    Normal
3   Normal    Normal    Failure   Normal    Normal    Normal    Normal    Normal
4   Normal    Failure   Normal    Normal    Normal    Normal    Normal    Failure
5   Normal    Normal    Normal    Normal    Normal    Normal    Normal    Normal
6   Normal    Normal    Normal    Normal    Normal    Normal    Normal    Normal
7   Normal    Normal    Normal    Normal    Normal    Normal    Normal    Normal
8   Normal    Normal    Normal    Normal    Normal    Normal    Normal    Normal
9   Normal    Normal    Normal    Normal    Normal    Normal    Normal    Normal
10  Normal    Failure   Normal    Normal    Normal    Normal    Normal    Normal
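For readers who want to reproduce data shaped like the sample above, here is a minimal sketch in Python using only the standard library. It is not the code actually used to generate the data: the function name simulate_failures, the mean time between failures of 20 subperiods, and the rule of marking the subperiod in which each exponential arrival lands are all assumptions made for illustration.

```python
import random


def simulate_failures(n_periods=100, n_subperiods=112,
                      mean_time_between_failures=20.0, seed=0):
    """Return a dict mapping 'Period_k' to a list of 'Normal'/'Failure' labels.

    Assumption: failures arrive with exponentially distributed gaps
    (hypothetical mean of 20 subperiods); a subperiod is labelled 'Failure'
    if at least one arrival falls inside it.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_time_between_failures
    data = {}
    for k in range(1, n_periods + 1):
        labels = ["Normal"] * n_subperiods
        # Walk along the time axis, jumping by exponential inter-failure times.
        t = rng.expovariate(rate)
        while t < n_subperiods:
            labels[int(t)] = "Failure"
            t += rng.expovariate(rate)
        data[f"Period_{k}"] = labels
    return data


data = simulate_failures()
n_failures = sum(col.count("Failure") for col in data.values())
failure_rate = n_failures / (100 * 112)
print(f"{len(data)} periods x {len(data['Period_1'])} subperiods, "
      f"failure rate {failure_rate:.3f}")
```

With a long mean time between failures, most subperiods stay Normal, which is what makes the resulting training set imbalanced.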
At the moment, I’m using several classification methods to predict the failures: gradient boosting, random forest, Classification and Regression Trees (CART), neural networks, bagged CART, SVM, C5.0, eXtreme Gradient Boosting (XGBoost), and k-Nearest Neighbors. For all of these, I use a strategy to deal with the class imbalance.
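The question does not say which imbalance strategy is used. As one common possibility, random oversampling of the minority class can be sketched as follows; the helper random_oversample and the toy rows are hypothetical and only illustrate the idea.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size.

    Illustrative assumption only: the original post does not name its strategy.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Indices of the original rows belonging to this class.
        idx = [i for i, y in enumerate(labels) if y == cls]
        # Append random duplicates until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rows[rng.choice(idx)])
            out_labels.append(cls)
    return out_rows, out_labels


# Toy example: 8 Normal rows and 2 Failure rows.
rows = [[i] for i in range(10)]
labels = ["Normal"] * 8 + ["Failure"] * 2
bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original class proportions.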