Cross Validated Asked by shiladitya basu on November 30, 2020
I was going through the CS229 notes on SVM and Kernel tricks and I came across the following line.
More generally the kernel $K(x,z)=(xTz+c)^d$ corresponds to a feature
mapping to an $n+dchoose d$ feature space, corresponding to all monomials
that are up to order d. Despite working in this $O(n^d)$ dimensional
space, computing $K(x,z)$ is of order $O(n)$.
Firstly, How exactly does it translate into $n+dchoose d$ feature space?
Consider I have $n = 3$ and $d = 2$, i.e, $x = [x1, x2, x3], z = [z1, z2, z3]$
so, a feature map for $K(x,z) = (xTz + c)^2$ would look something like this:
$$phi(x) = [x_1^2, x_1x_2, x_1x_3, x_2x_1, x_2^2, x_2x_3, x_3x_1, x_3x_2, x_3^2, sqrt{2c}x_1, sqrt{2c}x_2, sqrt{2c}x_3, c]$$
which makes a total of 13 features. But $3+2choose 2$ gives me 10. It does not make sense to me.
Secondly,
Despite working in this $O(n^d)$ dimensional space
Why does it say $n^d$ dimensional space whereas we had feature mapped to 13 dimensions?
Are we then only considering the monomials $x_{i1}x_{i2}…x_{ip}$ which make up order d = 2? (i.e, $x_1^2$ or $x_1x_2$ etc).
If that is the case, then what is this all about?
the kernel $K(x,z)=(xTz+c)^d$ corresponds to a feature mapping to an $n+dchoose d$ feature space
This seems confusing to me. Any kind of help would be appreciated. Thanks.
Edit: Here is the link to the pdf.
Let's look at your $phi$
$$phi(x) = [x_1^2, x_1x_2, x_1x_3, x_2x_1, x_2^2, x_2x_3, x_3x_1, x_3x_2, x_3^2, sqrt{2c}x_1, sqrt{2c}x_2, sqrt{2c}x_3, c]$$
You are counting some dimensions twice: $x_1x_2=x_2x_1$, $x_1x_3=x_3x_1$, $x_2x_3=x_3x_2$. Taking the duplicates out results in $d=10$:
$$phi(x) = [x_1^2, x_1x_2, x_1x_3, x_2x_3, x_2^2, x_3^2, sqrt{2c}x_1, sqrt{2c}x_2, sqrt{2c}x_3, c]$$
Correct answer by Firebug on November 30, 2020
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP