Data Science Asked on February 21, 2021
If we change the $yw\cdot x<0$ condition (for performing an update) to $yw\cdot x<1$ as in SVM (but without adding regularization to maximize the margin), is there any difference from the basic perceptron (the one with the aforementioned $yw\cdot x<0$ condition)?
Old question, but in case anyone is still interested in an answer...
In the perceptron algorithm, a point $x$ has a label $y$ equal to $1$ or $-1$. The predicted label for a point is the sign of $w\cdot x$. The goal is to separate the points with a hyperplane orthogonal to $w$ in such a way that the points with label $1$ are on one side and the points with label $-1$ are on the other side. Mathematically, the two "sides" of the hyperplane are characterised by the inequalities $w\cdot x>0$ and $w\cdot x<0$, respectively ($w\cdot x=0$ is the equation of the hyperplane itself).
Therefore a point $x$ is correctly labelled (it is on the right side of the hyperplane orthogonal to $w$) if $w\cdot x>0$ when $y=1$, or $w\cdot x<0$ when $y=-1$.
In both cases, a correctly labelled point corresponds to $yw\cdot x > 0$, while an incorrectly labelled point corresponds to $yw\cdot x \leq 0$.
The whole algorithm and the proof of its convergence are based on this simple observation.
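For concreteness, here is a minimal sketch of the standard perceptron loop built on that observation; the toy dataset, zero initialization, and `max_epochs` cap are illustrative assumptions, not part of the original answer.

```python
import numpy as np

def perceptron(X, y, max_epochs=100):
    """Standard perceptron: update only on misclassified points (y * w.x <= 0)."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:  # wrong side of the hyperplane (or on it)
                w += yi * xi             # rotate the hyperplane toward the point
                updated = True
        if not updated:                  # every point satisfies y * w.x > 0
            return w
    return w

# Toy linearly separable data
X = np.array([[2.0, 0.0], [-2.0, 1.0]])
y = np.array([1, -1])
w = perceptron(X, y)
print(w, (np.sign(X @ w) == y).all())  # e.g. [2. 0.] True
```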
If we change the condition for update to $yw\cdot x < 1$, the convergence argument above no longer applies: the algorithm now updates even on points that are already correctly classified, whenever their margin is below $1$. For example, consider the simple dataset $x_1 = (2,0)$, $y_1 = 1$ and $x_2 = (-2,1)$, $y_2 = -1$: a weight vector such as $w = (0.4, 0)$ puts both points on the correct side ($yw\cdot x = 0.8 > 0$ for each), yet both still trigger updates under the new condition.
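Below is a sketch of the modified loop, assuming the same update rule $w \leftarrow w + yx$ with only the trigger changed; the check at the bottom illustrates the point above with the hypothetical weight vector $w=(0.4, 0)$.

```python
import numpy as np

def margin_perceptron(X, y, max_epochs=100):
    """Perceptron variant: update whenever y * w.x < 1 (no regularization)."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        updated = False
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) < 1:  # fires even on correctly classified points with margin < 1
                w += yi * xi
                updated = True
        if not updated:
            return w
    return w

# With w = (0.4, 0), both points of the dataset above are correctly classified
# (y * w.x = 0.8 > 0), yet 0.8 < 1 still triggers an update -- unlike the basic perceptron.
X = np.array([[2.0, 0.0], [-2.0, 1.0]])
y = np.array([1, -1])
w = np.array([0.4, 0.0])
print(y * (X @ w))  # [0.8 0.8]: correct under y*w.x > 0, but below the margin threshold of 1
```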
TL;DR: update conditions other than $yw\cdot x \leq 0$ do not come with the perceptron's guarantee of finding a separating hyperplane.
Answered by ValleyCrisps on February 21, 2021