# Different formulations of within-class scatter matrix

Cross Validated Asked on December 15, 2020

Suppose we have a dataset $$X = \{x_1, x_2, \ldots, x_n\}$$ where every data point lies in a $$d$$-dimensional feature space, and there are $$2$$ classes $$c_1$$ and $$c_2$$: $$n_1$$ points of $$X$$ belong to class $$c_1$$ and the remaining $$n_2$$ points belong to class $$c_2$$, so $$n_1 + n_2 = n$$. The same counts hold for the projections $$y_i = v^Tx_i$$ for some vector $$v$$: $$n_1$$ of the $$y_i$$ come from points $$x_i$$ labelled $$c_1$$ and the remaining $$n_2$$ come from points labelled $$c_2$$.
$$m_1$$ is the mean vector of class $$c_1$$ and $$m_2$$ is the mean vector of class $$c_2$$. $$S_1$$ and $$S_2$$ are the covariance matrices corresponding to classes $$c_1$$ and $$c_2$$.
Now, in the projected space, $$y_i = v^Tx_i$$ for all $$i = 1, 2, \ldots, n$$. In this space, $$\mu_1$$ is the mean of class $$c_1$$ and $$\mu_2$$ is the mean of class $$c_2$$ (both are scalars, since each $$y_i$$ is a scalar). $$s_1$$ and $$s_2$$ are the scalar analogues of $$S_1$$ and $$S_2$$ for classes $$c_1$$ and $$c_2$$.

I have to derive $$3$$ things:
$$1)$$ the within-class scatter is: $$(\mu_1 - \mu_2)^2 + \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$$
$$2)$$ the within-class scatter can also be written as: $$\frac{1}{n_1n_2}\sum_{y_i \in \text{class}\;c_1} \sum_{y_j \in \text{class}\;c_2} (y_i - y_j)^2$$
(Here, $$y_i \in \text{class}\;c_1$$ means $$y_i = v^Tx_i$$ where the class label of $$x_i$$ is $$c_1$$, and $$y_j \in \text{class}\;c_2$$ means $$y_j = v^Tx_j$$ where the class label of $$x_j$$ is $$c_2$$.)
$$3)$$ the total scatter is: $$\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$$
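
Before attempting the algebra, it can help to check numerically whether the expressions in $$1)$$ and $$2)$$ agree at all. The sketch below is only a sanity check, not a derivation. It assumes that $$s_k^2$$ denotes the projected class scatter, i.e. the sum of squared deviations $$\sum (y_i - \mu_k)^2$$, which the question does not state explicitly; the sample values and variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D projected samples for the two classes.
y1 = rng.normal(loc=0.0, scale=1.0, size=7)   # class c_1, n_1 = 7
y2 = rng.normal(loc=2.0, scale=1.5, size=5)   # class c_2, n_2 = 5
n1, n2 = y1.size, y2.size

mu1, mu2 = y1.mean(), y2.mean()

# Assumption: s_k^2 is the class scatter (sum of squared deviations),
# not the sample variance.
s1_sq = np.sum((y1 - mu1) ** 2)
s2_sq = np.sum((y2 - mu2) ** 2)

# Expression 1): squared mean difference plus the scaled scatters.
expr1 = (mu1 - mu2) ** 2 + s1_sq / n1 + s2_sq / n2

# Expression 2): average squared difference over all cross-class pairs,
# computed with broadcasting instead of an explicit double loop.
expr2 = np.sum((y1[:, None] - y2[None, :]) ** 2) / (n1 * n2)

print(expr1, expr2, np.isclose(expr1, expr2))
```

Under that reading of $$s_k^2$$ the two values coincide up to floating-point error, because the cross terms in the expanded double sum vanish when each class is centred on its own mean. With $$s_k^2$$ taken as a variance instead, the two expressions would generally differ.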

According to the Fisher Linear Discriminant,
A) within-class scatter: $$S_w = \sum_{x_i \in c_1}(x_i - m_1)(x_i - m_1)^T + \sum_{x_i \in c_2}(x_i - m_2)(x_i - m_2)^T$$
B) $$\mu_1 = v^Tm_1$$ and $$\mu_2 = v^Tm_2$$
C) $$(n_1 s_1)^2 = v^T(n_1S_1)v$$ and $$(n_2 s_2)^2 = v^T(n_2S_2)v$$, where $$n_1S_1+n_2S_2 = S_w$$
D) $$v = S_w^{-1} (m_1 - m_2)$$
E) $$S_1 = \sum_{x_i \in c_1} (x_i - m_1)(x_i - m_1)^T$$ and $$S_2 = \sum_{x_i \in c_2} (x_i - m_2)(x_i - m_2)^T$$
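
The relations A), B), D) and E) can be checked on synthetic data with a short NumPy sketch. It assumes the scatter convention of A) and E), under which $$S_w = S_1 + S_2$$; the normalisation written in C) is a different convention and is not used here. The data, dimensions and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3

# Hypothetical samples: rows are data points x_i in d dimensions.
X1 = rng.normal(size=(8, d))                               # class c_1
X2 = rng.normal(size=(6, d)) + np.array([2.0, 0.0, -1.0])  # class c_2

# Class mean vectors m_1, m_2.
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)

# E) class scatter matrices (sums of outer products of deviations).
S1 = (X1 - m1).T @ (X1 - m1)
S2 = (X2 - m2).T @ (X2 - m2)

# A) within-class scatter, assuming S_w = S_1 + S_2.
Sw = S1 + S2

# D) Fisher direction; solve the linear system rather than inverting S_w.
v = np.linalg.solve(Sw, m1 - m2)

# Projections y_i = v^T x_i and their class means.
y1, y2 = X1 @ v, X2 @ v
mu1, mu2 = y1.mean(), y2.mean()

# B) projected means equal v^T m_k.
mean_ok = bool(np.isclose(mu1, v @ m1) and np.isclose(mu2, v @ m2))

# Projected scatter of each class equals v^T S_k v.
scatter_ok = bool(
    np.isclose(np.sum((y1 - mu1) ** 2), v @ S1 @ v)
    and np.isclose(np.sum((y2 - mu2) ** 2), v @ S2 @ v)
)

print(mean_ok, scatter_ok)
```

The second check is the step that brings the $$x_i$$ back in: since $$y_i - \mu_k = v^T(x_i - m_k)$$, summing the squares over a class gives $$v^TS_kv$$.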
Now, for $$1)$$:
$$(\mu_1 - \mu_2)^2 + \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} = (v^Tm_1 - v^Tm_2)^2 + \frac{v^TS_1v}{n_1^2} + \frac{v^TS_2v}{n_2^2}$$
Now, how do I introduce the $$x_i$$ here to get $$S_w$$?
I have been manipulating these expressions but could not reach the result.
Can anyone please give a hint on how to obtain these derivations? Any help would be appreciated.