Data Science: asked by HannesZ on April 10, 2021
I am working on predictive ML models using roughly 10-50 million records (currently testing with fewer) and around 10 explanatory variables per model.
When outlining hardware requirements for a good VM setup, it is often difficult for me to say which additional computational resources would improve training, and by how much. When I look at the task manager (on a 64-bit Windows machine) while my XGBoost models are training, the CPU is always at 100%, using all cores.
Looks like parallelization works fine, but the training still takes a lot of time.
It therefore seems logical to ask primarily for more CPU (that is, many more cores). However, memory also plays a role because, if I understand correctly, the algorithm stores/compresses the data differently when less memory is available.
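To make the trade-off concrete, here is a minimal sketch (assuming the Python xgboost package) of the parameters that control CPU parallelism and the memory-friendly histogram method; the data below is only a synthetic placeholder for the real records:

```python
import numpy as np
import xgboost as xgb

# Placeholder data standing in for the real records (~10 explanatory variables).
X = np.random.rand(1_000_000, 10)
y = np.random.rand(1_000_000)

dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "reg:squarederror",
    "nthread": 8,           # number of parallel threads; defaults to all available cores if omitted
    "tree_method": "hist",  # histogram-based split finding: uses far less memory than "exact"
    "max_bin": 256,         # fewer bins -> smaller histograms and lower RAM usage
}

booster = xgb.train(params, dtrain, num_boost_round=100)
```

The point of the sketch is that `nthread` governs how many cores are saturated (the behaviour visible in the task manager), while `tree_method` and `max_bin` are the main levers for how much memory the training data and histograms consume.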
Here is my question: should I accept the offer of a considerable amount of additional memory (which seems easier to get) and settle for only a modest improvement on the CPU side, or should I ignore that and just ask for more CPU?