Data Science Asked on July 29, 2021
I have a train set and a test set but no development (dev) set. I'm training a model on the train set and searching for the hyperparameters that maximize accuracy on the test set (pretty much a standard machine learning scenario). Here is my confusion: we usually do the hyperparameter tuning on a development set (not the test set) to find the best hyperparameters, then train our model with those hyperparameters, and finally evaluate it on the test set. When there is no dev set, I have two questions: (1) since the hyperparameter search trains a fresh model from scratch for each combination and simply picks the best one, rather than the model learning from previous searches, is tuning directly on the test set really a problem? (2) If it is, should I split a dev set off from the training data?
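(For concreteness, a minimal sketch of carving a dev set out of the training data, assuming scikit-learn and hypothetical arrays X_train, y_train:)

```python
from sklearn.model_selection import train_test_split

# Hypothetical arrays X_train, y_train: hold out 20% of the training data
# as a dev set for tuning; the test set stays untouched until the very end.
X_tr, X_dev, y_tr, y_dev = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)
```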
This is very nearly a duplicate of Is a test set necessary after cross validation on training set?, but I think it's worth specifically addressing this part of your question:
the hyperparameter search trains a model from scratch for every combination of hyperparameters and picks the best combination; it's not as though the model is learning from previous hyperparameter searches
Indeed, the various models don't gain any direct information from each other, either from training or from the dev set. However, you are choosing the hyperparameters that perform best on the dev set, so the whole pipeline can become "overfit" to the dev set. That effect is usually modest, but you certainly cannot report the chosen hyperparameters' dev score as an unbiased estimate of future performance.
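A minimal sketch of that distinction, assuming scikit-learn and hypothetical arrays X_train, y_train, X_test, y_test (the estimator and candidate values are illustrative, not a prescription):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical data: X_train, y_train, X_test, y_test are assumed to exist.
X_tr, X_dev, y_tr, y_dev = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0
)

best_score, best_C = -1.0, None
for C in [0.01, 0.1, 1.0, 10.0]:          # candidate hyperparameter values
    model = LogisticRegression(C=C, max_iter=1000).fit(X_tr, y_tr)
    score = model.score(X_dev, y_dev)      # dev accuracy guides the choice...
    if score > best_score:
        best_score, best_C = score, C

# ...so best_score is optimistically biased toward the dev set. Retrain on
# the full training data and evaluate exactly once on the untouched test set.
final_model = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
print("test accuracy:", final_model.score(X_test, y_test))
```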
The answer to your second question is then "yes"; and if your training set is small, consider cross-validation such as k-fold rather than a single fixed split, so every training example gets used for both fitting and validation.
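A minimal sketch of k-fold tuning with scikit-learn's GridSearchCV, under the same hypothetical data and estimator assumptions as above:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# 5-fold CV: each hyperparameter setting is scored as the mean accuracy
# over five train/validation splits of the training data.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)   # hypothetical training arrays

# refit=True (the default) retrains on all of X_train with the best C,
# so the final, once-only evaluation still happens on the held-out test set.
print(search.best_params_, search.score(X_test, y_test))
```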
Answered by Ben Reiniger on July 29, 2021