Asked by Hing Wong on November 30, 2020
Is smoothing in NLP ngram done on test data or train data?
Smoothing exists to keep a language model from assigning zero probability to n-grams that are unseen in training but appear in the test corpus. So is smoothing applied to the test data only, the training data only, or both? I haven't been able to find an answer to this.
In short: both.
Smoothing consists of slightly modifying the estimated probability of an n-gram, so the calculation (for instance, add-one smoothing) must be done at the training stage, since that is when the model's probabilities are estimated.

But smoothing usually also changes what happens at the testing stage, in particular by assigning a small non-zero probability to unseen n-grams instead of 0.
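To make the two stages concrete, here is a minimal sketch of add-one (Laplace) smoothing for a bigram model. The corpus, function names, and variable names are illustrative, not from the question or answer:

```python
from collections import Counter

def train_bigram_laplace(tokens):
    """Estimate add-one-smoothed bigram probabilities at training time."""
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    V = len(set(tokens))  # vocabulary size

    def prob(prev, word):
        # Add-one smoothing: every bigram count is incremented by 1,
        # and the denominator grows by the vocabulary size V. At test
        # time, bigrams never seen in training therefore receive a
        # small non-zero probability instead of 0.
        return (bigram_counts[(prev, word)] + 1) / (unigram_counts[prev] + V)

    return prob

# Smoothing is baked in during training...
train_tokens = "the cat sat on the mat".split()
p = train_bigram_laplace(train_tokens)

# ...and its effect shows up when scoring test data:
print(p("the", "cat"))  # seen bigram: (1 + 1) / (2 + 5)
print(p("the", "dog"))  # unseen bigram: (0 + 1) / (2 + 5), not 0
```

Note that this sketch only handles unseen *bigrams* over known words; truly out-of-vocabulary words are typically handled by mapping rare words to an `<UNK>` token, which is again a decision made at the training stage.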
Answered by Erwan on November 30, 2020