
Multi GPU in Keras

Data Science Asked by Hector Blandin on March 24, 2021

How can we program the Keras library (or TensorFlow) to partition training across multiple GPUs? Say you are on an Amazon EC2 instance that has 8 GPUs and you would like to use all of them to train faster, but your code is written for a single CPU or GPU.

4 Answers

From the Keras FAQ, below is copy-pasted code to enable 'data parallelism', i.e. having each of your GPUs process a different subset of your data independently.

from keras.utils import multi_gpu_model

# Replicates `model` on 8 GPUs.
# This assumes that your machine has 8 available GPUs.
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer='rmsprop')

# This `fit` call will be distributed on 8 GPUs.
# Since the batch size is 256, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=20, batch_size=256)

Note that this appears to be valid only for the Tensorflow backend at the time of writing.
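
For context, here is a minimal end-to-end sketch. The toy Sequential model, the random stand-in data, and the my_model.h5 filename are illustrative assumptions, not part of the FAQ snippet; the documented caveat is to save weights via the template model you passed in, rather than the returned parallel model.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import multi_gpu_model

# Toy template model, assumed for illustration; any Keras model works.
model = Sequential([
    Dense(64, activation='relu', input_shape=(100,)),
    Dense(10, activation='softmax'),
])

# Replicate across 8 GPUs; each GPU processes a slice of every batch.
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# Random stand-in data, just to make the sketch self-contained.
x = np.random.random((1024, 100))
y = np.random.random((1024, 10))
parallel_model.fit(x, y, epochs=20, batch_size=256)

# Save via the template model so the weights load on any device count.
model.save('my_model.h5')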

Update (Feb 2018):

Keras now supports automatic GPU selection using multi_gpu_model, so you no longer have to hardcode the number of GPUs. Details are in this Pull Request. In other words, this enables code like the following:

try:
    model = multi_gpu_model(model)
except Exception:
    # Fall back to the single-device model when multiple GPUs are unavailable.
    pass

But to be more explicit, you can stick with something like:

parallel_model = multi_gpu_model(model, gpus=None)

Bonus:

To check whether you are really utilizing all of your (NVIDIA) GPUs, you can monitor usage in the terminal using:

watch -n0.5 nvidia-smi
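
If watch is not available on your system, nvidia-smi -l 1 is a built-in alternative that polls once per second.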

Correct answer by weiji14 on March 24, 2021

  1. For TensorFlow:

TensorFlow Using GPUs

Here is sample code showing how it is used: for each task, you specify the device (or list of devices) it should run on:

# Creates a graph.
c = []
for d in ['/gpu:2', '/gpu:3']:
  with tf.device(d):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
  total = tf.add_n(c)  # renamed from `sum` to avoid shadowing the built-in
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print(sess.run(total))

TensorFlow will use a GPU for computation by default when a supported GPU is present, even for ops that could also run on the CPU. So you can simply loop over all of your instance's GPU devices, e.g. for d in ['/gpu:0', '/gpu:1', ... '/gpu:7']:, and the work inside each tf.device(d) block will be placed on that GPU, as in the sketch below.
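
Here is a sketch of that loop (TF 1.x API, assuming a machine with 8 GPUs; note that devices are numbered from /gpu:0):

import tensorflow as tf

# Place one matmul on each of the 8 GPUs (/gpu:0 .. /gpu:7).
c = []
for i in range(8):
    with tf.device('/gpu:%d' % i):
        a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
        b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
        c.append(tf.matmul(a, b))

# Gather the partial results on the CPU.
with tf.device('/cpu:0'):
    total = tf.add_n(c)

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(total))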

Scaling Keras Model Training to Multiple GPUs

  2. Keras (MXNet backend)

For Keras with the MXNet backend, compile with a context built from args.num_gpus, where num_gpus is the number of GPUs you want to use:

import keras

def backend_agnostic_compile(model, loss, optimizer, metrics, args):
    if keras.backend._backend == 'mxnet':
        # MXNet backend: pass the list of GPU contexts to compile().
        gpu_list = ["gpu(%d)" % i for i in range(args.num_gpus)]
        model.compile(loss=loss,
                      optimizer=optimizer,
                      metrics=metrics,
                      context=gpu_list)
    else:
        if args.num_gpus > 1:
            print("Warning: num_gpus > 1 but not using MXNet backend")
        model.compile(loss=loss,
                      optimizer=optimizer,
                      metrics=metrics)
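
A hypothetical usage sketch (the argparse flag and the compile arguments are assumptions for illustration; model is any Keras model you have already built):

import argparse

# Hypothetical CLI flag that supplies args.num_gpus to the helper above.
parser = argparse.ArgumentParser()
parser.add_argument('--num_gpus', type=int, default=8)
args = parser.parse_args()

backend_agnostic_compile(model,
                         loss='categorical_crossentropy',
                         optimizer='rmsprop',
                         metrics=['accuracy'],
                         args=args)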
  3. horovod.tensorflow

On top of all that, Uber recently open-sourced Horovod, and I think it is great:

Horovod

import tensorflow as tf
import horovod.tensorflow as hvd

# Initialize Horovod
hvd.init()

# Pin GPU to be used to process local rank (one GPU per process)
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())

# Build model...
loss = ...
opt = tf.train.AdagradOptimizer(0.01)

# Add Horovod Distributed Optimizer
opt = hvd.DistributedOptimizer(opt)

# Add hook to broadcast variables from rank 0 to all other processes during
# initialization.
hooks = [hvd.BroadcastGlobalVariablesHook(0)]

# Make training operation
train_op = opt.minimize(loss)

# The MonitoredTrainingSession takes care of session initialization,
# restoring from a checkpoint, saving to a checkpoint, and closing when done
# or an error occurs.
with tf.train.MonitoredTrainingSession(checkpoint_dir="/tmp/train_logs",
                                       config=config,
                                       hooks=hooks) as mon_sess:
    while not mon_sess.should_stop():
        # Perform synchronous training.
        mon_sess.run(train_op)
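
Note that Horovod runs one copy of this script per GPU; depending on your Horovod version, this is launched with something like horovodrun -np 8 python train.py, or directly with mpirun (the exact flags depend on your setup).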

Answered by n1tk on March 24, 2021

Basically, you can follow the example below. All you need to do is specify the CPU and GPU device counts after importing Keras.

import keras
import tensorflow as tf

config = tf.ConfigProto(device_count={'GPU': 1, 'CPU': 56})
sess = tf.Session(config=config)
keras.backend.set_session(sess)

Then you fit the model as usual:

model.fit(x_train, y_train, epochs=epochs, validation_data=(x_test, y_test))

Finally, you can decrease these device counts if you do not want the session to run at the machine's upper limits.
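
If you also want to throttle GPU memory rather than just device counts, here is a sketch using the standard TF 1.x gpu_options (the specific numbers are arbitrary examples):

import keras
import tensorflow as tf

# Fewer CPU threads than before, and GPU memory allocated on demand.
config = tf.ConfigProto(device_count={'GPU': 1, 'CPU': 28})
config.gpu_options.allow_growth = True
# Alternatively, cap memory at a fixed fraction of the card (pick one):
# config.gpu_options.per_process_gpu_memory_fraction = 0.5
sess = tf.Session(config=config)
keras.backend.set_session(sess)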

Answered by johncasey on March 24, 2021

A simple example of how to access multiple GPUs with Horovod and Keras: Github code Keras MNIST Example with Horovod.

For further information, please see: Horovod with Keras
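
For reference, the core pattern from that example looks roughly like this (a condensed sketch; model, x_train, y_train, and the Adadelta optimizer stand in for whatever you are training):

import keras
import tensorflow as tf
import horovod.keras as hvd

# Initialize Horovod and pin this process to a single GPU.
hvd.init()
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())
keras.backend.set_session(tf.Session(config=config))

# Scale the learning rate by the number of workers, then wrap the optimizer.
opt = keras.optimizers.Adadelta(1.0 * hvd.size())
opt = hvd.DistributedOptimizer(opt)

model.compile(loss='categorical_crossentropy', optimizer=opt,
              metrics=['accuracy'])

# Broadcast initial weights from rank 0 so all workers start in sync.
callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
model.fit(x_train, y_train, batch_size=128, epochs=10,
          callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)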

Answered by Muhammad Shifa on March 24, 2021
