Data Science: Asked by coobit on January 14, 2021
I’m building an autoencoder for natural language words.
The question: why does the loss drop sharply at the start of each epoch, then bounce back (rising again, but slowly), and then decrease gradually until the next epoch begins?
The graph of the loss looks a bit like this:
(a mock-up I made in Paint; the image is not reproduced here)
Here is the model setup:
from tensorflow import keras
from tensorflow.keras import layers

# Input is a (35, 35) matrix of one-hot vectors (one vector for each char in a word).
inputs = layers.Input(shape=(35, 35), name='input')

# Encoder: mask all-zero (padding) timesteps, then compress to a 32-dim code.
encoded = layers.Masking(mask_value=0.)(inputs)
encoded = layers.Bidirectional(layers.GRU(256, return_sequences=True))(encoded)
encoded = layers.GRU(32)(encoded)

# Decoder: repeat the code for each of the 35 timesteps and reconstruct the sequence.
decoded = layers.RepeatVector(35)(encoded)
decoded = layers.GRU(32, return_sequences=True)(decoded)
decoded = layers.Bidirectional(layers.GRU(256, return_sequences=True))(decoded)
decoded = layers.TimeDistributed(layers.Dense(35, activation='softmax'))(decoded)

model = keras.Model(inputs, decoded, name="toy_model")
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['categorical_crossentropy'])
model.fit(x_train, y_train, shuffle=True, batch_size=32, epochs=10)
P.S. As far as I understand, the data is shuffled at each epoch (note shuffle=True in model.fit).
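One way to check what the loss is actually doing (not part of the original post, just a sketch assuming TensorFlow 2.x): the loss Keras reports during training is the running mean over the current epoch, so it resets when a new epoch starts, which can produce exactly this sawtooth shape. A small callback can record the per-batch values for plotting:

import tensorflow as tf

class BatchLossLogger(tf.keras.callbacks.Callback):
    """Collects the loss reported after every training batch."""
    def __init__(self):
        super().__init__()
        self.batch_losses = []

    def on_train_batch_end(self, batch, logs=None):
        # In TF 2.x, logs['loss'] is the mean loss over the epoch so far,
        # so it resets at each epoch boundary.
        self.batch_losses.append(logs['loss'])

logger = BatchLossLogger()
model.fit(x_train, y_train, shuffle=True, batch_size=32, epochs=10,
          callbacks=[logger])
# logger.batch_losses can then be plotted to inspect the per-batch curve.

If the per-batch curve is smooth while the displayed curve shows the drop, the pattern is an artifact of the running average rather than of the model or the data order.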