Data Science · Asked by robbmorganf on April 22, 2021
I'm thinking about ways to initialize my neural networks for faster convergence, and I was wondering about initializing the weights with the singular vectors of the data, so that training starts with "useful" features already in place. (I couldn't find any paper proposing this.) Obviously that's pretty vague, so here is a concrete version:
Example: for a CNN whose first layer has $k$ kernels of size $n \times n$, and $p$ training images of size $m \times m$, assemble the $p(m-n+1)^2 \times n^2$ matrix $A$ whose rows are the flattened $n \times n$ windows of the images (or a random subset thereof), and initialize the convolutional filters to the first $k$ right singular vectors of $A$ (the vectors of length $n^2$, so they have the same shape as a flattened kernel).
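For concreteness, here is a minimal NumPy sketch of what I have in mind, assuming single-channel images and stride-1 windows; `svd_init_filters` and its parameters are just illustrative names, not from any library:

```python
import numpy as np

def svd_init_filters(images, n, k, max_windows=100_000, seed=0):
    """Initialize k conv filters of size n x n from the top right singular
    vectors of the patch matrix A described above.

    images: array of shape (p, m, m), single-channel for simplicity.
    """
    rng = np.random.default_rng(seed)
    # Assemble A: every n x n window of every image, flattened to a row.
    # A has p * (m - n + 1)**2 rows and n**2 columns.
    windows = np.lib.stride_tricks.sliding_window_view(
        images, (n, n), axis=(1, 2)
    )
    A = windows.reshape(-1, n * n)
    # Optionally subsample rows so the SVD stays cheap on large datasets.
    if A.shape[0] > max_windows:
        A = A[rng.choice(A.shape[0], max_windows, replace=False)]
    # Rows of Vt are the right singular vectors; they live in patch space
    # R^(n^2), so they reshape directly into n x n convolution kernels.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].reshape(k, n, n)

# Usage: 8 filters of size 5x5 from 100 random 28x28 "images".
filters = svd_init_filters(np.random.rand(100, 28, 28), n=5, k=8)
print(filters.shape)  # (8, 5, 5)
```

(This requires $k \le n^2$, since $A$ has only $n^2$ right singular vectors.)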
My question is: would this lead to stable training, or is it likely to cause exploding or vanishing gradients? And would it actually help the network identify features more quickly, or just lead to overfitting?