
My weight matrix converged to zeros

Data Science Asked on April 20, 2021

So I was training a fairly shallow convnet, because my deeper network based on VGG19 wasn't working: two conv layers and two dense layers, with the second dense layer as the output.

It converged quickly to all zeros in the second conv layer. The first conv layer wasn't all zeros, and the first dense layer appears to have learned the distribution of classes.
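This is roughly how I checked the learned weights (a minimal sketch; d is the trained model, as in the inspection code further down):

import numpy as np

# kernel and bias of the second Conv2D layer (layer index 2 in the model below)
w, b = d.layers[2].get_weights()
print(np.abs(w).max())   # effectively zero after training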

So it appears that my network's strategy is to ignore the inputs and just predict the class distribution. I tried class weights; the result was the same, except that a more uniform distribution gets predicted.

I changed the learning rate, changed the optimizer, and even tried gradient clipping. I augmented the data and introduced regularization, both on the data and on the network via dropout layers. No luck; same result. (The sketch below shows the kinds of changes I tried.)
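Roughly, the training-side changes looked like this (a minimal sketch; the learning rate, clip value, loss, and the data variables x_train / y_train / class_weights are illustrative placeholders, and discriminator_model is defined in the edit below):

from keras.optimizers import Adam

model = discriminator_model()

# lower learning rate plus gradient clipping via the optimizer
opt = Adam(lr=1e-4, clipvalue=1.0)
model.compile(loss='binary_crossentropy', optimizer=opt)

# class weights passed at fit time
model.fit(x_train, y_train, class_weight=class_weights)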

Why would my network exhibit this behavior and what can be done about it?

EDIT:

Here is a code snippet that defines the network that I am using:

from keras.models import Sequential
from keras.layers import Activation, Conv2D, MaxPooling2D, Flatten, Dense

def discriminator_model():
    model = Sequential()
    # first conv layer; the ReLU below is commented out,
    # so this layer currently has a linear activation
    model.add(Conv2D(32,
                     (10, 10),
                     strides=(2, 2),
                     input_shape=(256, 256, 3),
                     kernel_initializer='random_uniform',
                     bias_initializer='zeros'))
    #model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (5, 5), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(228, activation='sigmoid'))
    return model
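For reference, instantiating the model and calling summary() gives the following feature-map shapes (worked out from the layer parameters; this also defines the d used in the inspection code below):

d = discriminator_model()
d.summary()
# Conv2D  -> (None, 124, 124, 32)   # (256 - 10)/2 + 1 = 124
# MaxPool -> (None, 62, 62, 32)
# Conv2D  -> (None, 58, 58, 64)
# MaxPool -> (None, 29, 29, 64)     # the 29x29 maps visualized below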

After training, I tried to see what the layers had actually learned with the following code:

from keras import backend as K
import numpy as np
import matplotlib.pyplot as plt

inp = d.input                          # input placeholder
outputs = d.layers[3].output           # second pooling layer (after the 2nd conv)
functors = K.function([inp], [outputs])
acts = functors([x])[0]                # x: a batch of one image -> shape (1, 29, 29, 64)
print(acts.shape)
for i in range(acts.shape[-1]):
    fmap = acts[0, :, :, i]            # i-th feature map
    #p = (255./(np.max(fmap)-np.min(fmap)))*(fmap-np.min(fmap))
    p = 255.0 * fmap
    plt.imshow(p)
    plt.show()

Here is an example of what I am getting as output from the 2nd conv layer:

[image: output from 2nd conv layer]

And here is example output from the first conv layer for comparison, which I generated with similar code:

[image: output from the first conv layer]

I also tried changing loss functions, removing pooling layers, etc. I know it is weird; I've never come across anything quite like this before.
