Data Science Asked by Yassir on October 3, 2021
I'm trying to run a deep model on a GPU, and it seems that Keras runs the validation on the whole validation set in a single batch instead of validating in many batches, which causes an out-of-memory error:
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[160000,64,64,1] and type double on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:GatherV2]
I did not have this problem when running on the CPU; it only happens on the GPU. My fit code looks like this:
history = model.fit(patches_imgs_train, patches_masks_train, batch_size=8, epochs=10,
                    shuffle=True, verbose=1, validation_split=0.2)
When I remove the validation_split parameter from the fit call, the code works, but I need the validation.
There may be two causes of your problem. In this Keras issue you can find a discussion of a very similar problem; basically, you can try adjusting the batch_size parameter.
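A minimal sketch of that suggestion, reusing the arrays and fit call from the question (the smaller batch size is just an illustrative value, not one from the original answer):

# Sketch: lower batch_size until the allocation fits in GPU memory.
history = model.fit(
    patches_imgs_train, patches_masks_train,
    batch_size=4,          # illustrative value; tune it for your GPU
    epochs=10,
    shuffle=True,
    verbose=1,
    validation_split=0.2,
)

If you are on a recent TensorFlow release, model.fit also accepts a validation_batch_size argument that controls the batch size used for the validation pass.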
Answered by noe on October 3, 2021
So what is happening could be considered a bug in the Keras implementation: it looks like Keras tries to load the whole data set into memory in order to split it into training and validation sets, and the problem is not related to the batch size. After trying many workarounds, I found that the best approach is to split the data with sklearn's train_test_split instead of splitting it inside the fit method with the validation_split parameter.
from sklearn.model_selection import train_test_split

# Split the data up front instead of letting Keras do it with validation_split
x_train, x_v, y_train, y_v = train_test_split(x, y, test_size=0.2, train_size=0.8)

history = model.fit(x_train, y_train,
                    batch_size=16,
                    epochs=5,
                    shuffle=True,
                    verbose=2,
                    validation_data=(x_v, y_v))
Answered by Yassir on October 3, 2021