Stack Overflow Asked by Obsidian on February 10, 2021
Suppose a model is originally stored on the CPU and I want to move it to GPU 0. I can do either of the following:
device = torch.device('cuda:0')
model = model.to(device)
# or
model.to(device)
What is the difference between those two lines?
Citing the documentation on to:
When loading a model on a GPU that was trained and saved on GPU, simply convert the initialized model to a CUDA optimized model using model.to(torch.device('cuda')). Also, be sure to use the .to(torch.device('cuda')) function on all model inputs to prepare the data for the model. Note that calling my_tensor.to(device) returns a new copy of my_tensor on GPU. It does NOT overwrite my_tensor. Therefore, remember to manually overwrite tensors: my_tensor = my_tensor.to(torch.device('cuda')).
In most cases, when using to on a torch.nn.Module, it does not matter whether you save the return value, because the module is modified in place and to returns the module itself. As a micro-optimization, it is actually better not to save the return value. When used on a torch tensor, you must save the return value, since you are actually receiving a copy of the tensor.
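The difference described above can be checked without a GPU by converting the dtype instead of the device, since to handles both kinds of conversion. The following is a minimal sketch, not taken from the original answer: the names model, tensor, returned_model, converted, and unchanged are illustrative, and nn.Linear is only a stand-in for any module.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)   # parameters start as float32 (the default dtype)
tensor = torch.zeros(3)   # float32 tensor

# nn.Module.to converts the parameters in place and returns the module itself.
returned_model = model.to(torch.float64)
model_is_same_object = returned_model is model
model_dtype = model.weight.dtype

# Tensor.to returns a new tensor when a conversion is needed;
# the original tensor keeps its dtype.
converted = tensor.to(torch.float64)
tensor_is_same_object = converted is tensor
original_dtype = tensor.dtype

# When no conversion is needed, Tensor.to returns the tensor itself.
unchanged = tensor.to(torch.float32)
noop_is_same_object = unchanged is tensor

print(model_is_same_object, model_dtype)
print(tensor_is_same_object, original_dtype, converted.dtype)
print(noop_is_same_object)
```

On a machine with a GPU, replacing torch.float64 with torch.device('cuda:0') should show the same pattern: the module is moved in place, while the tensor call produces a separate tensor that has to be assigned to a name.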
Ref: PyTorch to()
Answered by Mano on February 10, 2021
There is no semantic difference. The nn.Module.to function moves the model to the device in place and returns the same model.
But be cautious.
For tensors (documentation):
# tensor a is on the CPU
device = torch.device('cuda:0')
b = a.to(device)
# a is still on the CPU!
# b is on the GPU!
# a and b are different tensors
For models (documentation):
# model a is on the CPU
device = torch.device('cuda:0')
b = a.to(device)
# a and b are both on the GPU
# a and b point to the same model
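As a rough check of the point about models, the two spellings from the question can be compared side by side. This is only a sketch: it assumes that falling back to the CPU is acceptable when no GPU is present, and the names reassigned, before, and in_place are hypothetical.

```python
import torch
import torch.nn as nn

# Assumption: fall back to the CPU so the sketch also runs without a GPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Spelling 1: keep the return value.
reassigned = nn.Linear(4, 2)
before = reassigned                 # remember the original object
reassigned = reassigned.to(device)

# Spelling 2: ignore the return value.
in_place = nn.Linear(4, 2)
in_place.to(device)

# Both spellings leave every parameter on the target device,
# and spelling 1 rebinds the name to the very same module object.
same_object = reassigned is before
reassigned_devices = {p.device for p in reassigned.parameters()}
in_place_devices = {p.device for p in in_place.parameters()}

print(same_object)
print(reassigned_devices, in_place_devices)
```

Either spelling works for a module. The assignment form has the advantage of matching the pattern that tensors require, which can make the code more uniform to read.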
Answered by youkaichao on February 10, 2021