Data Science Asked by Philippe Fanaro on June 24, 2021
My dataset is composed of an idle system that, at some time instants, receives requests. I'm trying to predict these instants through a clock. Since the requests are sparsely distributed (I've forced them to last for a while so they don't get too sparse), I wanted to create a new loss function that penalizes the model if it just predicts zero for everything. My implementation attempt is simply a penalty applied on top of the standard log-loss:
from keras import backend as K

def sparse_penalty_logits(y_true, y_pred):
    # Weigh the cross-entropy more heavily whenever the target is nonzero
    penalty = 10
    if y_true != 0:
        loss = -penalty * K.sum(y_true * K.log(y_pred) + (1 - y_true) * K.log(1 - y_pred))
    else:
        loss = -K.sum(y_true * K.log(y_pred) + (1 - y_true) * K.log(1 - y_pred))
    return loss
Is it correct? (I have also tried it with tensorflow.) Every time I run it I either get a lot of NaN's as the loss or predictions that are not binary at all. I wonder if I'm also doing something wrong when setting up the model, because binary_crossentropy is not working properly either. My model is something like this (the targets are a single column of 0's and 1's):
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(train.shape[1],)))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss=sparse_penalty_logits)
If I run it, as I said, I get very strange results (boy, do I feel like I've messed up real bad…).
From the problems you mention, this looks like a case of exploding gradients. A typical sign of exploding gradients is the loss becoming NaN (or growing very large) during training, which matches what you are seeing.
More about the exploding gradient problem can be found in this article.
I would suggest using some gradient clipping technique in your code; this should remove the NaN values during model training.
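As a minimal sketch of what that could look like in your setup: the clipnorm and clipvalue arguments are standard Keras optimizer options, but the thresholds below (1.0 and 0.5) are only illustrative values, not tuned settings.

from keras.optimizers import Adam

# Rescale any gradient whose L2 norm exceeds 1.0, so a single oversized
# gradient cannot blow up the weights (and hence the loss) in one step.
clipped_adam = Adam(clipnorm=1.0)

# Alternatively, clip each gradient component element-wise to [-0.5, 0.5]:
# clipped_adam = Adam(clipvalue=0.5)

model.compile(optimizer=clipped_adam, loss=sparse_penalty_logits)

Passing a configured optimizer object instead of the 'adam' string is all that changes; the rest of the model definition stays the same.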
Answered by thanatoz on June 24, 2021