Data Science Asked on June 16, 2021
I am currently writing a machine learning pipeline for my time series application. At the end of each month, I collect the new data, normalize it to [0, 1], retrain the ML model on the new observation only, and predict future values.
Should I instead read the entire dataset each time I get a new observation, normalize the entire dataset, rebuild the ML model, and then predict?
How I got stuck:
Thank you
It really depends.
Why? Updating everything in production (pre-processing, fitting, etc.) can get extremely expensive. If you have a complex architecture, it is not worth it.
Alternatives
Approximate the covariate shift: if you know the distribution of your future data, you can adjust your parameters (for example, the normalisation parameters) in advance.
Save your future data every time you make a prediction. It can be cheaper to quickly store the data in a DB and, depending on your system, run the updates weekly or monthly.
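To make the two alternatives a bit more concrete, here is a minimal sketch in plain Python. It is only an illustration under my own assumptions: `RunningMinMaxScaler` and `MonthlyBuffer` are hypothetical names, the in-memory list stands in for the DB, and "a new value falls outside the stored [min, max] range" is just one crude signal that the normalisation parameters (and probably the model) need an update.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RunningMinMaxScaler:
    """Min-max scaler whose parameters are updated incrementally (hypothetical helper)."""
    lo: float = float("inf")
    hi: float = float("-inf")

    def update(self, values):
        # Widen the stored range; the full history is never re-read.
        for v in values:
            self.lo = min(self.lo, v)
            self.hi = max(self.hi, v)

    def transform(self, values):
        if self.lo > self.hi:
            raise ValueError("scaler has not seen any data yet")
        span = self.hi - self.lo
        if span == 0:
            return [0.0 for _ in values]
        return [(v - self.lo) / span for v in values]

    def out_of_range(self, values):
        # Values the current parameters would map outside [0, 1].
        return [v for v in values if v < self.lo or v > self.hi]


@dataclass
class MonthlyBuffer:
    """Stands in for the DB table that collects new observations (assumption)."""
    scaler: RunningMinMaxScaler
    pending: List[float] = field(default_factory=list)

    def add(self, values) -> bool:
        # Store the observations; return True if a refit looks necessary.
        drifted = bool(self.scaler.out_of_range(values))
        self.pending.extend(values)
        return drifted

    def flush(self):
        # Scheduled (weekly/monthly) update: fold pending data into the scaler.
        self.scaler.update(self.pending)
        batch, self.pending = self.pending, []
        return batch


scaler = RunningMinMaxScaler()
scaler.update([10.0, 20.0, 30.0])        # parameters from the historical data
buffer = MonthlyBuffer(scaler)
needs_refit = buffer.add([25.0, 40.0])   # 40.0 exceeds the stored maximum
buffer.flush()
scaled = scaler.transform([10.0, 25.0, 40.0])
```

Keep in mind that once the stored min or max changes, inputs are scaled differently from what the model saw during training, so a parameter update like this would normally go together with a retrain.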
Answered by vienna_kaggling on June 16, 2021