Data Science: Asked by Stepan on December 19, 2020
I want to train a CNN model on a large dataset (10K images and masks).
Currently, I read the data from disk in chunks of BATCH_SIZE = 500 images and augment each chunk with:
def make_train_gen(x_train, y_train):
    # returns a generator that yields augmented (image, mask) batches
    return get_augmented(
        x_train, y_train, batch_size=BATCH_SIZE,
        data_gen_args=dict(
            rotation_range=5.,
            width_shift_range=0.05,
            height_shift_range=0.05,
            shear_range=40,
            zoom_range=0.2,
            horizontal_flip=True,
            vertical_flip=False,
            fill_mode='constant'
        ))
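As far as I understand, get_augmented returns a generator that yields augmented (image, mask) batches indefinitely, so a quick sanity check on one in-memory chunk looks something like this (the shapes in the comment are only illustrative):

train_gen = make_train_gen(x_train, y_train)
imgs, masks = next(train_gen)   # one augmented batch
print(imgs.shape, masks.shape)  # e.g. (BATCH_SIZE, H, W, C) and (BATCH_SIZE, H, W, 1)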
The main training loop looks like this:
STEPS_PER_EPOCH = 2
INNER_EPOCHS = 2
EPOCHS = 10

model = init_model(model_filename)  # define the model

for epoch in range(EPOCHS):  # number of passes over the full dataset
    for batch_id in range(20):  # 20 chunks of 500 images each
        # read this chunk's images and masks from disk (from the batch_id folder)
        x_train, y_train, x_val, y_val = read_a_butch_of_masks_and_images(batch_id)
        # build a fresh augmentation generator for the chunk that was just loaded
        train_gen = make_train_gen(x_train, y_train)
        history = model.fit(
            train_gen,
            steps_per_epoch=STEPS_PER_EPOCH,
            epochs=INNER_EPOCHS,
            validation_data=(x_val, y_val),
            callbacks=[callback_checkpoint]
        )

model.save_weights(new_model_filename)
How can I make get_augmented (or the surrounding loop) read the next chunk of images from disk instead of consuming the in-memory x_train and y_train, so that training uses little memory and still goes over all the images?
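To make the goal concrete, this is roughly the streaming setup I have in mind. It is just a sketch: stream_augmented_chunks and steps_per_chunk are names I made up here, while make_train_gen and read_a_butch_of_masks_and_images are the helpers shown above.

def stream_augmented_chunks(n_chunks, steps_per_chunk):
    # Read one 500-image chunk from disk at a time, augment it, and yield
    # batches from it, so only one chunk is ever held in memory.
    while True:  # loop forever so model.fit can keep drawing batches across epochs
        for batch_id in range(n_chunks):
            x_chunk, y_chunk, _, _ = read_a_butch_of_masks_and_images(batch_id)
            chunk_gen = make_train_gen(x_chunk, y_chunk)
            for _ in range(steps_per_chunk):
                yield next(chunk_gen)

A single fit call could then replace the nested loops, e.g. model.fit(stream_augmented_chunks(20, STEPS_PER_EPOCH), steps_per_epoch=20 * STEPS_PER_EPOCH, epochs=EPOCHS, callbacks=[callback_checkpoint]), and only the current chunk plus its augmented batch would be in memory at any time. Is something along these lines the idiomatic approach, or should I be using tf.data or tf.keras.utils.Sequence instead?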