# How do we come up with the SVM Kernel giving $n+dchoose d$ feature space?

I was going through the CS229 notes on SVM and Kernel tricks and I came across the following line.

More generally the kernel $$K(x,z)=(xTz+c)^d$$ corresponds to a feature
mapping to an $$n+dchoose d$$ feature space, corresponding to all monomials
that are up to order d. Despite working in this $$O(n^d)$$ dimensional
space, computing $$K(x,z)$$ is of order $$O(n)$$.

Firstly, How exactly does it translate into $$n+dchoose d$$ feature space?
Consider I have $$n = 3$$ and $$d = 2$$, i.e, $$x = [x1, x2, x3], z = [z1, z2, z3]$$

so, a feature map for $$K(x,z) = (xTz + c)^2$$ would look something like this:
$$phi(x) = [x_1^2, x_1x_2, x_1x_3, x_2x_1, x_2^2, x_2x_3, x_3x_1, x_3x_2, x_3^2, sqrt{2c}x_1, sqrt{2c}x_2, sqrt{2c}x_3, c]$$

which makes a total of 13 features. But $$3+2choose 2$$ gives me 10. It does not make sense to me.

Secondly,

Despite working in this $$O(n^d)$$ dimensional space

Why does it say $$n^d$$ dimensional space whereas we had feature mapped to 13 dimensions?
Are we then only considering the monomials $$x_{i1}x_{i2}…x_{ip}$$ which make up order d = 2? (i.e, $$x_1^2$$ or $$x_1x_2$$ etc).

If that is the case, then what is this all about?

the kernel $$K(x,z)=(xTz+c)^d$$ corresponds to a feature mapping to an $$n+dchoose d$$ feature space

This seems confusing to me. Any kind of help would be appreciated. Thanks.

Edit: Here is the link to the pdf.

Let's look at your $$phi$$

$$phi(x) = [x_1^2, x_1x_2, x_1x_3, x_2x_1, x_2^2, x_2x_3, x_3x_1, x_3x_2, x_3^2, sqrt{2c}x_1, sqrt{2c}x_2, sqrt{2c}x_3, c]$$

You are counting some dimensions twice: $$x_1x_2=x_2x_1$$, $$x_1x_3=x_3x_1$$, $$x_2x_3=x_3x_2$$. Taking the duplicates out results in $$d=10$$:

$$phi(x) = [x_1^2, x_1x_2, x_1x_3, x_2x_3, x_2^2, x_3^2, sqrt{2c}x_1, sqrt{2c}x_2, sqrt{2c}x_3, c]$$

Correct answer by Firebug on November 30, 2020