# Relationship between multivariate Bernoulli random vector and categorical random variable

Mathematics Asked on January 7, 2022

For simplicity, I'll focus on the bivariate case. Let $$(X_1,X_2)$$ be a random vector that follows a bivariate Bernoulli distribution, so each $$X_i$$ takes the value zero or one. The associated probability mass function (pmf) can be written as
$$p(x_1,x_2)=p_{11}^{x_1x_2}p_{10}^{x_1(1-x_2)}p_{01}^{(1-x_1)x_2}p_{00}^{(1-x_1)(1-x_2)}.$$

Now, consider a categorical random variable $$Y$$ that takes the four values $$\{11,10,01,00\}$$ with probabilities $$\{p_{11},p_{10},p_{01},p_{00}\}$$, respectively.

The associated pmf can be written as

$$p(y)=p_{11}^{[y=11]}p_{10}^{[y=10]}p_{01}^{[y=01]}p_{00}^{[y=00]},$$
where $$[y=z]=1$$ if $$y=z$$ and $$[y=z]=0$$ otherwise.

So, it looks like any bivariate Bernoulli random vector can be represented using a categorical random variable.
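As a quick numerical check of this claim, here is a minimal sketch that evaluates both pmfs on all four outcomes and confirms they agree. The cell probabilities in `p` are hypothetical values chosen only for illustration, and the function names are my own.

```python
from itertools import product

# Hypothetical cell probabilities (illustration only); they must sum to 1.
p = {"11": 0.4, "10": 0.1, "01": 0.2, "00": 0.3}

def bivariate_bernoulli_pmf(x1, x2, p):
    """Product form: p11^(x1 x2) p10^(x1(1-x2)) p01^((1-x1)x2) p00^((1-x1)(1-x2))."""
    return (p["11"] ** (x1 * x2)
            * p["10"] ** (x1 * (1 - x2))
            * p["01"] ** ((1 - x1) * x2)
            * p["00"] ** ((1 - x1) * (1 - x2)))

def categorical_pmf(y, p):
    """Indicator form: product over labels z of p_z^[y = z]."""
    value = 1.0
    for label, prob in p.items():
        value *= prob ** (1 if y == label else 0)
    return value

# The label of Y is just the concatenation of the two binary coordinates.
for x1, x2 in product((0, 1), repeat=2):
    y = f"{x1}{x2}"
    assert abs(bivariate_bernoulli_pmf(x1, x2, p) - categorical_pmf(y, p)) < 1e-12
```

The map $$(x_1,x_2)\mapsto y$$ used in the loop is a relabeling of outcomes, which is why the two pmfs coincide cell by cell.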

Conversely, the categorical distribution can itself be represented by a multivariate Bernoulli random vector $$Z$$, constructed as follows.

Let $$Z=(Z_1,Z_2,Z_3,Z_4).$$ Each $$Z_i$$ is a Bernoulli variable that takes the value zero or one. $$Z$$ differs from a general multivariate Bernoulli vector in that exactly one of the four variables takes the value one.

The pmf of this random vector can be written as

$$p(z_1,z_2,z_3,z_4)=p_{1000}^{z_1(1-z_2)(1-z_3)(1-z_4)}p_{0100}^{(1-z_1)z_2(1-z_3)(1-z_4)}p_{0010}^{(1-z_1)(1-z_2)z_3(1-z_4)}p_{0001}^{(1-z_1)(1-z_2)(1-z_3)z_4}.$$

Now we have a multivariate Bernoulli random vector that represents the categorical variable $$Y$$ above.
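The construction of $$Z$$ is the usual one-hot encoding of a categorical variable, and a small sketch may make it concrete. The ordering of the labels (which category maps to which $$Z_i$$) and the numeric probabilities are assumptions made for illustration. Note that the product formula only makes sense on the one-hot support: for a vector such as $$(1,1,0,0)$$ every exponent is zero and the product evaluates to 1, so the sketch rejects such inputs explicitly.

```python
# Category labels of Y, in an assumed order matching (Z_1, Z_2, Z_3, Z_4).
LABELS = ("11", "10", "01", "00")

def to_one_hot(y):
    """Encode a category label as a 0/1 vector containing a single one."""
    return tuple(int(y == label) for label in LABELS)

def from_one_hot(z):
    """Decode a one-hot vector; reject anything outside the support of Z."""
    if len(z) != len(LABELS) or sorted(z) != [0, 0, 0, 1]:
        raise ValueError("z must contain exactly one 1 and three 0s")
    return LABELS[z.index(1)]

def one_hot_pmf(z, probs):
    """Product-form pmf of Z, with probs = (p_1000, p_0100, p_0010, p_0001)."""
    from_one_hot(z)  # raises ValueError if z is not one-hot
    z1, z2, z3, z4 = z
    exponents = (
        z1 * (1 - z2) * (1 - z3) * (1 - z4),
        (1 - z1) * z2 * (1 - z3) * (1 - z4),
        (1 - z1) * (1 - z2) * z3 * (1 - z4),
        (1 - z1) * (1 - z2) * (1 - z3) * z4,
    )
    value = 1.0
    for prob, exponent in zip(probs, exponents):
        value *= prob ** exponent
    return value

# Hypothetical probabilities (illustration only); they must sum to 1.
probs = (0.4, 0.1, 0.2, 0.3)
total = sum(one_hot_pmf(to_one_hot(label), probs) for label in LABELS)
```

On the restricted support the four probabilities sum to one, so `total` should equal 1 up to floating-point error.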

My question is: what is the relationship between these two random objects (the categorical variable and the Bernoulli vector) and their associated distributions?

Focusing on the $$n=2$$ case

Let me introduce the following probability mass function:
\begin{align*} p(y_1, y_2) = \pi_1^{y_1}(1-\pi_1)^{1-y_1}\pi_2^{y_2}(1-\pi_2)^{1-y_2}\left(1 + \rho \frac{(y_1 - \pi_1)(y_2 - \pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \end{align*}
which is known as Bahadur's model. You can indeed verify that
\begin{align*} \sum_{(y_1, y_2) \in \{0, 1\}^2} p(y_1, y_2) &= 1 \\ \text{Corr}(Y_1, Y_2) &= \rho \end{align*}

There is a bijection between $$(p_{11}, p_{10}, p_{01}, p_{00})$$ and $$(\pi_1, \pi_2, \rho)$$ through the relations
\begin{align*} p_{11} &= \pi_1\pi_2\left(1 + \rho\frac{(1-\pi_1)(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\ p_{10} &= \pi_1(1-\pi_2)\left(1 - \rho\frac{(1-\pi_1)\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\ p_{01} &= (1-\pi_1)\pi_2\left(1 - \rho\frac{\pi_1(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\ p_{00} &= (1-\pi_1)(1-\pi_2)\left(1 + \rho\frac{\pi_1\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \end{align*}
so Bahadur's model is just a parametrization of the bivariate binary model.

Now let $$\rho = -1$$ and $$\pi_1 = 1 - \pi_2 = \pi$$. Then $$\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)} = \pi(1-\pi)$$, and the relations give
\begin{align*} p_{11} &= 0 \\ p_{10} &= \pi \\ p_{00} &= 0 \\ p_{01} &= 1-\pi \end{align*}
so exactly one of $$Y_1, Y_2$$ equals one. The two-category categorical model is therefore just the special case of Bahadur's model with correlation $$\rho = -1$$. This makes sense: a categorical random variable is basically a multivariate binary vector with strongly negative correlations among its entries, which force exactly one category to be selected. We use this to generalize the result.
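The claims in the $$n=2$$ case are easy to check numerically. The following is a minimal sketch, with hypothetical parameter values and helper names of my own, that evaluates Bahadur's pmf on the four cells, recovers the correlation from the cell probabilities, and then plugs in $$\rho=-1$$ with $$\pi_1 = 1-\pi_2 = \pi$$.

```python
from itertools import product
from math import sqrt

def bahadur_pmf(y1, y2, pi1, pi2, rho):
    """Bivariate Bahadur pmf: independent Bernoulli part times a correlation factor."""
    base = pi1 ** y1 * (1 - pi1) ** (1 - y1) * pi2 ** y2 * (1 - pi2) ** (1 - y2)
    scale = sqrt(pi1 * (1 - pi1) * pi2 * (1 - pi2))
    return base * (1 + rho * (y1 - pi1) * (y2 - pi2) / scale)

def cell_probabilities(pi1, pi2, rho):
    """Return {'11': p11, '10': p10, '01': p01, '00': p00}."""
    return {f"{y1}{y2}": bahadur_pmf(y1, y2, pi1, pi2, rho)
            for y1, y2 in product((0, 1), repeat=2)}

def correlation(cells):
    """Pearson correlation of (Y1, Y2) computed from the four cell probabilities."""
    m1 = cells["11"] + cells["10"]   # P(Y1 = 1)
    m2 = cells["11"] + cells["01"]   # P(Y2 = 1)
    cov = cells["11"] - m1 * m2      # E[Y1 Y2] - E[Y1] E[Y2]
    return cov / sqrt(m1 * (1 - m1) * m2 * (1 - m2))

# Hypothetical interior parameters (illustration only).
cells = cell_probabilities(0.3, 0.6, 0.2)
assert abs(sum(cells.values()) - 1.0) < 1e-12
assert abs(correlation(cells) - 0.2) < 1e-12

# Degenerate case: rho = -1 and pi1 = 1 - pi2 = pi, here with pi = 0.25.
pi = 0.25
one_hot_cells = cell_probabilities(pi, 1 - pi, -1.0)
```

In the degenerate case `one_hot_cells` should put mass $$\pi$$ on `"10"`, mass $$1-\pi$$ on `"01"`, and (up to rounding) zero on `"11"` and `"00"`.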

Generalizing the result

Bahadur's model can be extended to multivariate binary random vectors, with pmf $$p(y_1, \cdots, y_n)$$ given by the representation
\begin{align*} p(y_1, \cdots, y_n) = \left[\prod_{i=1}^n\pi_i^{y_i}(1-\pi_i)^{1-y_i}\right]\left(1 + \sum_{k=2}^{n}\rho_k\text{Sym}_k(\mathbf{r}_n)\right) \end{align*}
where
\begin{align*} r_i &= \frac{y_i - \pi_i}{\sqrt{\pi_i(1-\pi_i)}} \\ \mathbf{r}_n &= (r_1, \cdots, r_n) \\ \text{Sym}_k(\mathbf{r}_n) &= \sum_{i_1 < \cdots < i_k} r_{i_1}\cdots r_{i_k} \end{align*}

I'm not entirely sure which choice of the parameters leads to a genuine categorical random variable (I will think about this and post if I have a positive result), but this is a starting place.
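For experimenting with parameter choices, here is a rough sketch of the $$n$$-variate formula. It assumes that $$\text{Sym}_k$$ denotes the degree-$$k$$ elementary symmetric polynomial in the standardized residuals $$r_1, \cdots, r_n$$; the function names, the dictionary `rhos` mapping $$k$$ to $$\rho_k$$, and the numeric values are all hypothetical. Not every parameter choice yields nonnegative values, so the output should be checked before treating it as a pmf.

```python
from itertools import combinations, product
from math import prod, sqrt

def elementary_symmetric(values, k):
    """Sum of products over all k-element subsets of `values` (assumed Sym_k)."""
    return sum(prod(subset) for subset in combinations(values, k))

def bahadur_pmf_n(y, pis, rhos):
    """n-variate Bahadur-type pmf; rhos maps k -> rho_k (missing k means 0)."""
    base = prod(p ** yi * (1 - p) ** (1 - yi) for yi, p in zip(y, pis))
    r = [(yi - p) / sqrt(p * (1 - p)) for yi, p in zip(y, pis)]
    n = len(pis)
    correction = 1 + sum(rhos.get(k, 0.0) * elementary_symmetric(r, k)
                         for k in range(2, n + 1))
    return base * correction

# Hypothetical parameters (illustration only).
pis = (0.2, 0.5, 0.7)
rhos = {2: 0.05, 3: -0.02}
values = {y: bahadur_pmf_n(y, pis, rhos) for y in product((0, 1), repeat=3)}
total = sum(values.values())
```

The total is 1 for any `rhos`, because each product $$r_{i_1}\cdots r_{i_k}$$ has mean zero under the independent Bernoulli factor; searching for parameters that concentrate all mass on one-hot vectors is then a matter of inspecting `values`.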

Answered by Tom Chen on January 7, 2022