Data Science Asked by huy on December 16, 2020
I have some products, along with their descriptions, and I wish to assign a USPSC code to each product. I have a really basic question: what exactly are my test file and training file? For example, should the training file contain product descriptions along with the manually entered codes assigned to each product, and the test file only product descriptions?
So the question is what the training and test sets should consist of when classifying products by their description.
To clarify, any machine learning task requires data. In supervised machine learning, the dataset is labelled (in your case, with USPSC codes). The dataset is then split into a training set, a validation set (if possible) and a test set.
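As a rough illustration of that split, here is a minimal sketch using only the Python standard library. The product descriptions, the placeholder codes (CODE_A, CODE_B), the function name split_dataset and the 70/10/20 proportions are all assumptions made up for the example, not part of the question.

```python
import random

# Hypothetical labelled dataset: (product description, code) pairs.
# The codes are made-up placeholders, not real classification codes.
labelled = [
    ("blue ballpoint pen, pack of 10", "CODE_A"),
    ("black gel pen", "CODE_A"),
    ("a4 printer paper, 500 sheets", "CODE_A"),
    ("yellow sticky notes", "CODE_A"),
    ("desk stapler", "CODE_A"),
    ("stainless steel hex bolt m8", "CODE_B"),
    ("zinc plated wood screw", "CODE_B"),
    ("steel hex nut m8", "CODE_B"),
    ("flat washer m8", "CODE_B"),
    ("masonry drill bit 6 mm", "CODE_B"),
]

def split_dataset(rows, val_frac=0.1, test_frac=0.2, seed=0):
    """Shuffle the rows, then cut them into train, validation and test lists."""
    rows = list(rows)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)   # fixed seed keeps the split reproducible
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

train, val, test = split_dataset(labelled)
print(len(train), len(val), len(test))
```

Every row keeps both its description and its code, whichever subset it lands in. With many classes and few examples per class, a stratified split (keeping class proportions similar in each subset) is often preferable to the plain shuffle shown here.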
So the answer to your question is that the training and test sets have the same format: both contain the input text that goes into the model, together with its label. The only real difference is how the labels are used. In the training set, the model sees the labels so that it can learn an underlying function that maps the text to a class. In the test set, the labels are withheld from the model and used only to evaluate its predictions.
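The difference in how the labels are used can be sketched as follows. This is a toy word-count classifier written only to show the flow, not a recommended model; the data, the labels OFFICE and HARDWARE, and the helper names fit and predict are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical data. Both sets have the same format: (description, label).
train = [
    ("blue ballpoint pen", "OFFICE"),
    ("black gel pen pack", "OFFICE"),
    ("a4 printer paper ream", "OFFICE"),
    ("stainless steel hex bolt", "HARDWARE"),
    ("zinc plated wood screw", "HARDWARE"),
    ("steel hex nut", "HARDWARE"),
]
test = [
    ("red ballpoint pen", "OFFICE"),
    ("steel wood screw", "HARDWARE"),
]

def fit(rows):
    """Training: count how often each word appears under each label."""
    counts = defaultdict(Counter)
    for text, label in rows:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Return the label whose training words overlap most with the text."""
    words = text.lower().split()
    return max(sorted(counts), key=lambda label: sum(counts[label][w] for w in words))

model = fit(train)  # the model sees both text and labels

# At test time the model receives only the text ...
predictions = [predict(model, text) for text, _ in test]

# ... and the held-back labels are used only to score the predictions.
accuracy = sum(p == label for p, (_, label) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

In practice the toy classifier would be replaced by a proper text classification model, but the flow stays the same: fit on labelled training rows, predict from test descriptions alone, then compare against the test labels.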
Correct answer by shepan6 on December 16, 2020