Data Science Asked by user103134 on December 9, 2020
I’m looking to train an ELECTRA model on unlabelled data from a specific field. Are there any objections to using the same data first for unsupervised pre-training and then again downstream for the supervised learning task?
Not at all. A recent ACL paper by AllenAI ("Don't Stop Pretraining: Adapt Language Models to Domains and Tasks", Gururangan et al., 2020) even recommends it: they continue pre-training on the unlabelled task data before fine-tuning and claim that this reduces the problems caused by domain mismatch. So, if you train the model on in-domain data from the very beginning, that is probably a good thing, provided you have enough data for it.
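To make the two-stage reuse concrete, here is a minimal sketch of how one labelled in-domain dataset can feed both stages. It only shows the data preparation, not the actual ELECTRA training; the example sentences, labels, and the helper names `pretraining_corpus` and `finetuning_pairs` are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# Hypothetical labelled in-domain examples: (text, label).
labelled: List[Tuple[str, int]] = [
    ("the patient responded well to treatment", 1),
    ("no improvement was observed after two weeks", 0),
    ("symptoms resolved within three days", 1),
]


def pretraining_corpus(examples: List[Tuple[str, int]]) -> List[str]:
    """Drop the labels: the unsupervised stage only ever sees raw text."""
    return [text for text, _ in examples]


def finetuning_pairs(examples: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Keep the (text, label) pairs unchanged for the supervised stage."""
    return list(examples)


# Stage 1: text used for (continued) pre-training.
corpus = pretraining_corpus(labelled)

# Stage 2: the same texts, now with labels, used for fine-tuning.
pairs = finetuning_pairs(labelled)
```

The point of the sketch is that the pre-training stage never touches the labels, so reusing the same texts in both stages does not leak supervised information into pre-training.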
Answered by Jindřich on December 9, 2020