Data Science Asked by steam_engine on May 2, 2021
Suppose I want to develop and train a large end-to-end deep learning model in TensorFlow (1.15, for legacy reasons). The objects are complex, and many types of features can be extracted from them: fixed-length vectors of numeric features, sequences, unordered sets, etc. The model will therefore include many submodules to handle the various feature types.
I have access to a server with several GPUs, so I want to distribute the model across them. What is the best way to do so? So far I am considering placing the submodules on separate GPUs, but this raises some questions.
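The placement idea described above is manual model parallelism: pinning each feature submodule to a different device with `tf.device`. A minimal TF 1.x sketch of that pattern; the encoder functions and tensor shapes are illustrative assumptions, not from the question:

```python
import tensorflow as tf  # TF 1.15

# Hypothetical placeholders for two feature types (shapes are illustrative).
seq_inputs = tf.placeholder(tf.float32, [None, 50, 16], name="sequences")
set_inputs = tf.placeholder(tf.float32, [None, 32, 8], name="sets")

def sequence_encoder(x):
    # Stand-in for a sequence submodule (e.g. an LSTM); mean over time for brevity.
    return tf.reduce_mean(tf.layers.dense(x, 64), axis=1)

def set_encoder(x):
    # Permutation-invariant pooling over set elements.
    return tf.reduce_max(tf.layers.dense(x, 64), axis=1)

# Pin each submodule's ops to a separate GPU.
with tf.device("/gpu:0"):
    seq_repr = sequence_encoder(seq_inputs)
with tf.device("/gpu:1"):
    set_repr = set_encoder(set_inputs)

# The merge head can live on either device; TensorFlow inserts the
# cross-GPU tensor copies automatically.
with tf.device("/gpu:0"):
    logits = tf.layers.dense(tf.concat([seq_repr, set_repr], axis=-1), 10)
```

One caveat with this layout: the GPUs run different submodules and often wait on each other, so utilization can be poor unless the submodules are comparably expensive. Data parallelism (a full replica per GPU, as Horovod provides) usually keeps the devices busier.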
I invite you to look at the Horovod project on GitHub. It is currently the most efficient way to run distributed training with TensorFlow. Tutorials and benchmark results are available in the repository.
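A minimal sketch of Horovod's data-parallel pattern with TF 1.x, following Horovod's documented TensorFlow 1 usage; the model and loss here are illustrative stand-ins for the multi-submodule graph from the question:

```python
import tensorflow as tf            # TF 1.15
import horovod.tensorflow as hvd   # pip install horovod

hvd.init()

# Pin each worker process to one local GPU.
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Illustrative model: replace with the real graph.
x = tf.placeholder(tf.float32, [None, 128])
y = tf.placeholder(tf.int64, [None])
logits = tf.layers.dense(x, 10)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

# Scale the learning rate by the number of workers, then wrap the
# optimizer so gradients are averaged across GPUs via allreduce.
opt = tf.train.AdamOptimizer(1e-3 * hvd.size())
opt = hvd.DistributedOptimizer(opt)
train_op = opt.minimize(loss)

# Broadcast initial variables from rank 0 so all workers start in sync;
# only rank 0 writes checkpoints.
hooks = [hvd.BroadcastGlobalVariablesHook(0)]
ckpt_dir = "/tmp/ckpts" if hvd.rank() == 0 else None
with tf.train.MonitoredTrainingSession(checkpoint_dir=ckpt_dir,
                                       hooks=hooks,
                                       config=config) as sess:
    pass  # sess.run(train_op, feed_dict=...) in the training loop

# Launch one process per GPU, e.g.:
#   horovodrun -np 4 python train.py
```

Each process holds a full replica of the model (all submodules) and trains on its own shard of the data, which sidesteps the load-balancing problems of splitting submodules across GPUs.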
Answered by Jonathan DEKHTIAR on May 2, 2021