Data Science Asked on March 14, 2021
A single non-activated neuron is just a linear combination of its inputs.
Thresholding this neuron’s output as-is against 0 would create a hyperplane binary separator, whose parameters can be learned.
What would be the shape of this separator (decision boundary) if we also apply a ReLU to the output and only then threshold?
I keep thinking it has to be non-linear, otherwise NNs wouldn't work; but anything that was positive before the ReLU stays positive, and anything that was negative or zero is mapped to zero, so the ReLU seems to do nothing to the decision boundary, which makes no sense to me.
If someone could clear this up for me, I would be glad.
As a follow-up, I would like to understand multi-layer decision boundaries – say, on a 2-layer network. What would the boundary look like with 2 neurons per layer, on a 2-layer network, for, say, an XOR-like dataset?
What would be the shape of this separator (decision boundary) if we also apply a ReLU to the output and only then threshold?
For just a single neuron, indeed the decision boundary will just be a plane.
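A quick numerical check of this (a minimal sketch with hypothetical weights, not taken from the question) shows that thresholding the ReLU output at 0 classifies every point exactly the same way as thresholding the raw pre-activation, so the boundary is still the plane $w \cdot x + b = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.array([1.5, -2.0])   # hypothetical weights
b = 0.3                     # hypothetical bias

X = rng.normal(size=(10000, 2))              # random 2-D inputs
z = X @ w + b                                # pre-activation, linear in x
relu_then_threshold = np.maximum(z, 0) > 0   # threshold after the ReLU
threshold_directly = z > 0                   # threshold the raw linear output

print(np.array_equal(relu_then_threshold, threshold_directly))  # True
```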
I keep thinking it has to be non-linear, otherwise NNs wouldn't work...
Single-layer neural networks don't exhibit the same nonlinearity as multi-layer networks; the only nonlinearity available is the activation function itself. (With sigmoid activation, you're just doing logistic regression.)
As a follow-up, I would like to understand multi-layer decision boundaries...
With multiple layers, you gain the ability to combine boundary planes into polytopal regions. A ReLU-only neural network represents a piecewise-linear function with finitely many polytopal (i.e., flat-edged) pieces. That's clear for every neuron in the first layer. In each subsequent layer, you take a linear combination of such functions, which is again such a function; applying a ReLU to that can then cut each region in two, but the result is still such a function. (You may also be interested in this paper, and this expository blog post.)
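To make "finitely many polytopal pieces" concrete, here is a minimal sketch with hand-picked, hypothetical weights: it records which ReLUs are active at each input point, and every distinct on/off pattern corresponds to one region on which the network is a single affine map.

```python
import numpy as np

# Hypothetical, hand-picked weights: 2 inputs -> 3 ReLU neurons -> 1 linear output.
W1 = np.array([[ 1.0,  1.0],
               [ 1.0, -1.0],
               [-1.0,  1.0]])
b1 = np.array([0.0, 0.5, 0.5])
w2 = np.array([1.0, -2.0, -2.0])
b2 = 0.0

def forward(X):
    h = np.maximum(X @ W1.T + b1, 0)   # first-layer ReLUs
    return h @ w2 + b2                 # linear combination of piecewise-linear pieces

# Record which ReLUs are active at each point of a grid; every distinct
# on/off pattern is one flat-edged region on which the network is affine.
xs = np.linspace(-2, 2, 200)
grid = np.array([[x, y] for x in xs for y in xs])
patterns = (grid @ W1.T + b1) > 0
n_regions = len({tuple(p) for p in patterns})
print("distinct linear regions seen on this grid:", n_regions)
```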
You can see some of this in the tensorflow playground. In a version of that link, I've grabbed the weights and put the resulting function into Wolfram|Alpha. You can see how the boundary line of each first-layer neuron shows up as a kink, and how each of the resulting regions supports a linear function. Then slicing at a fixed height gives you the hexagonal region that works well enough to classify the circle dataset. Increasing the number of layers or the number of neurons per layer allows you to cut the space into more regions.
Then, even if the output neuron applies a sigmoid as in your tensorflow playground example, the sigmoid is monotonic, so the decision boundary is just a level set of the piecewise-linear function: itself piecewise linear, a union of flat (hyperplane) pieces.
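For the XOR follow-up specifically, here is a minimal sketch of a 2-layer network with 2 hidden ReLU neurons; the weights are picked by hand rather than trained, and thresholding the piecewise-linear output reproduces XOR.

```python
import numpy as np

# Hypothetical, hand-picked weights (not trained): 2 inputs -> 2 hidden ReLUs -> 1 output.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])
b2 = 0.0

def net(X):
    h = np.maximum(X @ W1.T + b1, 0)   # hidden ReLUs
    return h @ w2 + b2                 # piecewise-linear output

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
print(net(X))   # [0. 1. 1. 0.]  -> thresholding at 0.5 reproduces XOR
```

With these particular weights, the boundary at level 0.5 is the pair of parallel lines $x_1 + x_2 = 0.5$ and $x_1 + x_2 = 1.5$ – again a union of flat pieces, not a curve.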
Correct answer by Ben Reiniger on March 14, 2021
This is a very interesting question about the decision boundary of a ReLU-activated neuron.
ReLU is a non-linear function because its form differs depending on the sign of the input:
$\mathrm{ReLU}(x) = \begin{cases} 0, & x \leqslant 0 \\ x, & x > 0 \end{cases}$
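A direct NumPy translation of that definition (a minimal sketch):

```python
import numpy as np

def relu(x):
    # 0 for x <= 0, x for x > 0, exactly as in the piecewise definition above
    return np.where(x <= 0, 0.0, x)

print(relu(np.array([-2.0, 0.0, 3.0])))   # [0. 0. 3.]
```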
We can think of the linear layer and the ReLU layer as a sequence of transformations applied to the input space.
The linear layer is, as you described, a linear transformation over the input space and so the decision boundary should be linear. The ReLU is a non-linear transformation, and thus the decision boundary should be non-linear.
Answered by shepan6 on March 14, 2021