Data Science Asked by Stepan on December 19, 2020
I want to train a CNN model on a large dataset (10K images and masks).
Currently, I am reading the data in batches of BATCH_SIZE = 500 images and
augmenting them with:
from keras_unet.utils import get_augmented  # assumed source of get_augmented

def make_train_gen(x_train, y_train):
    # Wrapper name is a guess; the original def line was truncated.
    # Returns a generator yielding augmented (image, mask) batches
    # built from the in-memory arrays.
    return get_augmented(
        x_train, y_train, batch_size=BATCH_SIZE,
        data_gen_args=dict(
            rotation_range=5.,
            width_shift_range=0.05,
            height_shift_range=0.05,
            shear_range=40,
            zoom_range=0.2,
            horizontal_flip=True,
            vertical_flip=False,
            fill_mode='constant'
        ))
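For reference, the wrapper is just called on arrays that are already loaded, so it only ever augments what is in memory (a minimal usage sketch; make_train_gen is the wrapper above):

train_gen = make_train_gen(x_train, y_train)   # generator over the in-memory chunk
imgs, masks = next(train_gen)                  # one augmented batch of BATCH_SIZE image/mask pairs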
The main loop looks like this:
STEPS_PER_EPOCH = 2
INNER_EPOCHS = 2
EPOCHS = 10

model = init_model(model_filename)  # Define the model

for epoch in range(EPOCHS):          # Number of passes over the full dataset
    for batch_id in range(0, 20):    # 20 chunks of 500 images each
        # Read images and masks from disk from the batch_id folder
        x_train, y_train, x_val, y_val = read_a_batch_of_masks_and_images(batch_id)
        train_gen = make_train_gen(x_train, y_train)  # augmentation generator over this chunk
        history = model.fit(
            train_gen,
            steps_per_epoch=STEPS_PER_EPOCH,
            epochs=INNER_EPOCHS,
            validation_data=(x_val, y_val),
            callbacks=[callback_checkpoint]
        )

model.save_weights(new_model_filename)
How can I tell get_augmented to read the next batch of images from disk, rather
than consuming x_train and y_train from memory, so that training uses little
memory and still goes over all the images?
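For context, this is roughly the kind of disk-backed generator I have in mind (a minimal sketch using tf.keras.utils.Sequence; DiskBatchSequence, image_paths and mask_paths are made-up names, and the loading/normalisation is only illustrative):

import numpy as np
import tensorflow as tf

class DiskBatchSequence(tf.keras.utils.Sequence):
    # Reads one batch of image/mask files from disk per call, so the
    # full 10K-image dataset never has to be held in memory at once.
    def __init__(self, image_paths, mask_paths, batch_size):
        self.image_paths = image_paths   # list of image file paths
        self.mask_paths = mask_paths     # matching list of mask file paths
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(len(self.image_paths) / self.batch_size))

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        load = lambda p: tf.keras.preprocessing.image.img_to_array(
            tf.keras.preprocessing.image.load_img(p)) / 255.0
        images = np.stack([load(p) for p in self.image_paths[lo:hi]])
        masks = np.stack([load(p) for p in self.mask_paths[lo:hi]])
        return images, masks

# model.fit(DiskBatchSequence(image_paths, mask_paths, BATCH_SIZE), epochs=EPOCHS, ...)

Something like this could then be passed straight to model.fit in place of train_gen, but I am not sure how to combine it with get_augmented's augmentation.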