Choose CNN architecture first, then optimize parameters - validation vs test performance to pick architecture?

Question

I am running a few experiments on medical data and am about to apply transfer learning with pretrained networks to my problem.

First, I have to pick a network architecture.
Second, I would like to optimize its parameters and the parameters of the optimizer to get better performance.

I would like to pick the network architecture based on 10-fold cross-validation of several architectures. The data is split train:test in an 80:20 ratio, and the training part is then split into 10 folds. The test set should never change. Based on the cross-validation results, I would pick one model and optimize it further on the validation set.

Is it okay to also evaluate the best architecture on the test data, to see whether I am moving in the right direction before optimizing the parameters, or am I cheating?

The dataset is imbalanced with relatively high class variance, so I am not even sure whether the 2 folds held out for testing will truly represent the dataset.

Thanks

Victor Oliveira · Answer

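You shouldn't choose the best model based on its performance on the test set. You should run cross-validation, pick the best model from there, and only then assess its performance on the test set. Otherwise you would just be picking whichever model happens to score best on the test set, which means overfitting to it, and the test score would no longer be an honest estimate of performance on unseen data.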
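To make the order of operations concrete, here is a minimal sketch of that protocol. It assumes scikit-learn and uses synthetic data and two small classical models as hypothetical stand-ins for the real medical dataset and the candidate CNN architectures; the names `candidates`, `cv_scores` and `best_name` are purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic imbalanced data (assumption: stands in for the real dataset).
X, y = make_classification(
    n_samples=500, n_features=20, weights=[0.8, 0.2], random_state=0
)

# 80:20 split; the test part is set aside and not used for any decision.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical candidates standing in for the CNN architectures.
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(max_depth=3, random_state=0),
}

# 10-fold stratified cross-validation on the training part only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=cv, scoring="f1").mean()
    for name, model in candidates.items()
}

# Model selection uses the cross-validation scores, never the test set.
best_name = max(cv_scores, key=cv_scores.get)

# Only the selected model is refit on all training data and scored once on test.
best_model = candidates[best_name].fit(X_train, y_train)
test_score = f1_score(y_test, best_model.predict(X_test))
print(best_name, cv_scores, test_score)
```

Any further tuning of the selected model would again rely on cross-validation or a validation split of the training data, with the test set touched only once at the very end.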

Regarding imbalanced datasets, you should assess your model with metrics suited to that case, such as recall and precision, and not only accuracy.
In Python, you can use scikit-learn's StratifiedKFold, which splits your data so that the train and test parts of every fold keep the same proportion of the target classes.
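A small sketch of both points, assuming scikit-learn and a toy label vector with a 9:1 imbalance (the data and the always-negative "classifier" are made up for illustration):

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import StratifiedKFold

# Toy labels: 90 negatives and 10 positives (assumed imbalance).
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))  # features do not matter for the split itself

# Each validation fold keeps the overall positive rate.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_positive_rates = [y[val_idx].mean() for _, val_idx in skf.split(X, y)]
print(fold_positive_rates)

# A degenerate model that always predicts the majority class.
y_pred = np.zeros_like(y)
acc = accuracy_score(y, y_pred)
rec = recall_score(y, y_pred, zero_division=0)
prec = precision_score(y, y_pred, zero_division=0)
print(acc, rec, prec)
```

The always-negative model looks good on accuracy while finding none of the positive cases, which is why recall and precision belong next to accuracy on imbalanced data.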
