Data Science Asked on January 20, 2021
I'm preprocessing a text dataset. It contains certain numeric values, like:
Is it recommended to discard these numerics before building a vectorizer (BoW/TF-IDF) for model development (classification/regression)?
Any quick help on this is much appreciated. Thank you.
It depends on the problem statement. For example, a year can be significant if you want to capture a trend and the year takes many unique values across the dataset, but if it is constant you can remove it.
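One way to check whether a number is "constant" in this sense is to count how many documents each numeric token appears in. The following is only a minimal sketch using the standard library; the regex definition of a numeric token, the helper names, and the sample documents are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed definition of "numeric": standalone integers or decimals, e.g. "2019" or "3.5".
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")

def numeric_token_stats(docs):
    """Return (document frequency of each numeric token, number of documents)."""
    doc_freq = Counter()
    for doc in docs:
        # set() so that a token is counted at most once per document
        doc_freq.update(set(NUMBER_RE.findall(doc)))
    return doc_freq, len(docs)

def constant_numeric_tokens(docs):
    """Numeric tokens present in every document, so they cannot tell documents apart."""
    doc_freq, n_docs = numeric_token_stats(docs)
    return sorted(tok for tok, df in doc_freq.items() if df == n_docs)

# Hypothetical sample data, not taken from the question.
docs = [
    "Report filed in 2021 covering sales from 2018",
    "Report filed in 2021 covering sales from 2019",
    "Report filed in 2021 covering sales from 2020",
]
print(constant_numeric_tokens(docs))
```

Here "2021" occurs in every document and is a candidate for removal, while the varying years might carry trend information worth keeping.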
To add to that, if you are doing sentiment analysis, numeric values usually don't carry much useful signal.
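If you do decide to discard numbers, or to collapse them into a single placeholder so the model still knows a number was present, it can be done as a preprocessing step before vectorizing. This is a rough sketch with the standard library only; `handle_numbers`, `bag_of_words`, the `numtoken` placeholder, and the sample review are hypothetical names and data chosen for illustration.

```python
import re
from collections import Counter

# Assumed definition of "numeric": standalone integers or decimals.
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")

def handle_numbers(text, mode="placeholder"):
    """Keep numbers, drop them, or replace each with one shared placeholder token."""
    if mode == "keep":
        return text
    if mode == "drop":
        replaced = NUMBER_RE.sub(" ", text)
    elif mode == "placeholder":
        replaced = NUMBER_RE.sub("numtoken", text)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Collapse the extra whitespace left behind by the substitution.
    return " ".join(replaced.split())

def bag_of_words(text):
    """Very small bag-of-words: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))

# Hypothetical sample text, not taken from the question.
review = "Waited 45 minutes and paid 30 dollars, terrible service"
print(handle_numbers(review, "drop"))
print(bag_of_words(handle_numbers(review, "placeholder")))
```

With scikit-learn, a function like this could be passed as the `preprocessor` argument of `CountVectorizer` or `TfidfVectorizer`. Note that a custom preprocessor replaces the built-in lowercasing step, so you may need to lowercase inside the function yourself.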
Answered by prashant0598 on January 20, 2021