Data Science Asked by OultimoCoder on December 18, 2020
I have a model that predicts multiple-choice answers to questions. I used an 80/20 train/test split of my questions and tuned the model on it.
The questions actually belong to games: each game consists of 10 questions. The data was randomly shuffled before being split, so the questions in each set are no longer grouped by game.
Can I now take the same questions the model was trained and tested on and re-test all of them, but in their original game order, to determine the percentage of games won? Or, when initially training my model, should the train and test data be split by games instead of by questions?
It is possible to re-use the train+test dataset.
You used the train dataset to fit your parameters and the test dataset to check whether your model was overfitting.
The train/test split serves more to check whether your model is likely to suffer from overfitting than to obtain the "real parameters".
Following that line of reasoning, once you have ruled out overfitting you can use your complete dataset to recalibrate your parameters.
Remember that the decision of whether an individual observation belongs to train or test is arbitrary.
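For the alternative raised in the question, splitting by games instead of by questions, a minimal sketch might look like the following. It uses only the Python standard library. The names split_by_game, fraction_of_games_won and win_threshold are hypothetical, and the rule that a game counts as won when at least win_threshold of its answers are correct is an assumption, since the question does not say how a game is won.

```python
import random
from collections import defaultdict


def split_by_game(game_ids, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so that no game is split across both sets."""
    games = sorted(set(game_ids))
    rng = random.Random(seed)
    rng.shuffle(games)
    # Hold out whole games, not individual questions.
    n_test = max(1, round(len(games) * test_fraction))
    test_games = set(games[:n_test])
    train_idx = [i for i, g in enumerate(game_ids) if g not in test_games]
    test_idx = [i for i, g in enumerate(game_ids) if g in test_games]
    return train_idx, test_idx


def fraction_of_games_won(game_ids, is_correct, win_threshold):
    """Share of games with at least win_threshold correct answers.

    The winning rule is an assumption made for this sketch.
    """
    correct_per_game = defaultdict(int)
    for g, ok in zip(game_ids, is_correct):
        correct_per_game[g] += int(ok)
    wins = sum(1 for c in correct_per_game.values() if c >= win_threshold)
    return wins / len(correct_per_game)


# Toy data: 5 games of 10 questions each (one game id per question).
game_ids = [g for g in range(5) for _ in range(10)]
train_idx, test_idx = split_by_game(game_ids, test_fraction=0.2, seed=0)
```

Here is_correct would hold one boolean per question, for example whether the model's prediction matched the true answer, and fraction_of_games_won would be called on the questions of whichever games you want to score.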
Answered by Juan Esteban de la Calle on December 18, 2020