Data Science: Asked on March 7, 2021
I have been practicing with the following dataset:
http://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength
for building a prediction model based on an MLP, but I have some doubts about whether my approach is correct. I wanted to tune the activation function over the following options: identity, logistic, tanh, and relu. So what I did is the following:
First I divided my dataset into training, validation, and test sets (80/20/20); as far as I know, hyperparameter tuning is done on the validation set. My pseudocode for the validation part looks like this:
for each activation in [identity, logistic, tanh, relu]:
    model = MLP(activation=activation, solver="adam", max_iter=1000)
    model.fit(Xtrain, ytrain)
    plot(model.loss_curve_)   # loss curve from the training fit
    model.fit(Xval, yval)     # refit on the validation set
    plot(model.loss_curve_)   # loss curve from the validation fit
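In actual scikit-learn terms, the runnable version looks roughly like this (the exact split proportions, the feature scaling, and the file name are approximations of my setup, not an exact transcript):

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

# Load the UCI concrete dataset; the last column is the compressive strength target
data = pd.read_excel("Concrete_Data.xls")
X, y = data.iloc[:, :-1].values, data.iloc[:, -1].values

# Hold out 20% for test, then 20% of the remainder for validation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0)

# Standardize features (MLPs converge much better on scaled inputs)
scaler = StandardScaler().fit(X_train)
X_train, X_val, X_test = (scaler.transform(s) for s in (X_train, X_val, X_test))

for activation in ["identity", "logistic", "tanh", "relu"]:
    model = MLPRegressor(activation=activation, solver="adam", max_iter=1000, random_state=0)
    model.fit(X_train, y_train)
    plt.plot(model.loss_curve_, label=activation + " (train)")
    model.fit(X_val, y_val)  # refit from scratch on the validation set, as in my pseudocode
    plt.plot(model.loss_curve_, label=activation + " (val)")
plt.xlabel("iteration"); plt.ylabel("loss"); plt.legend(); plt.show()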
With this loop I found that the "best" activation function was "relu". Here are two graphs as an example:
After that I took "adam" and "relu" as hyperparameters, and then I tried them with the training and test sets. Roughly, I did this:
model = MLP(activation="relu", solver="adam", max_iter=1000)
model.fit(Xtrain, ytrain)
plot(model.loss_curve_)   # loss curve from the training fit
model.fit(Xtest, ytest)   # refit on the test set
plot(model.loss_curve_)   # loss curve from the test fit
and the curve I got was the following:
What I wanted to know is whether my approach is correct. I ask this because it is not easy to find examples of loss curves using scikit-learn. I think this is because in many Internet tutorials hyperparameter tuning is done with GridSearch or cross-validation, and the ones that do use loss curves are implemented in Keras or TensorFlow.
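For comparison, the GridSearch-style tuning those tutorials do would look something like this (a rough sketch; the 5-fold CV and the scoring metric are just common choices, not something I ran):

from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

param_grid = {"activation": ["identity", "logistic", "tanh", "relu"]}
search = GridSearchCV(
    MLPRegressor(solver="adam", max_iter=1000, random_state=0),
    param_grid,
    cv=5,                              # 5-fold cross-validation within the training data
    scoring="neg_mean_squared_error",  # regression metric; closer to zero is better
)
search.fit(X_train, y_train)  # the test set stays untouched until the very end
print(search.best_params_)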
I wanted to force my model to produce a curve like this (training loss still decreasing while validation loss turns upward), which is an overfitted model, just for the sake of learning. So I was wondering: do all models overfit? Or what is happening in my tests? Maybe I did something wrong.
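One way I can think of to get separate train/validation loss curves out of scikit-learn, and to try to provoke that overfitting picture, is to train epoch by epoch with partial_fit and track both losses myself. A sketch (the oversized hidden layers are deliberate, and MSE is just a stand-in for the internal training loss):

from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

# A deliberately oversized network to encourage overfitting
model = MLPRegressor(hidden_layer_sizes=(512, 512), activation="relu",
                     solver="adam", random_state=0)

train_loss, val_loss = [], []
for epoch in range(500):
    model.partial_fit(X_train, y_train)  # one pass over the training data
    train_loss.append(mean_squared_error(y_train, model.predict(X_train)))
    val_loss.append(mean_squared_error(y_val, model.predict(X_val)))

plt.plot(train_loss, label="train MSE")
plt.plot(val_loss, label="validation MSE")  # typically flattens or rises while train keeps falling
plt.xlabel("epoch"); plt.ylabel("MSE"); plt.legend(); plt.show()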
Any help would be greatly appreciated.
Thanks