Mathematics Asked on January 7, 2022

For simplicity, I'll focus on the bivariate case. Let $(X_1,X_2)$ be a random vector that follows a bivariate Bernoulli distribution, so each $X_i$ takes the value zero or one. The associated pmf can be written as

$$p(x_1,x_2)=p_{11}^{x_1x_2}p_{10}^{x_1(1-x_2)}p_{01}^{(1-x_1)x_2}p_{00}^{(1-x_1)(1-x_2)}.$$

Now, consider a categorical random variable $Y$ that takes the four values $\{11,10,01,00\}$ with probabilities $\{p_{11},p_{10},p_{01},p_{00}\}$.

The associated pmf can be written as

$$p(y)=p_{11}^{[y=11]}p_{10}^{[y=10]}p_{01}^{[y=01]}p_{00}^{[y=00]},$$

where $[y=z]$ is the Iverson bracket: $[y=z]=1$ if $y=z$ and $[y=z]=0$ otherwise.

So, it looks like any bivariate Bernoulli random vector can be represented using a categorical random variable.
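To make the relabeling concrete, here is a minimal sketch that evaluates both pmfs and compares them cell by cell. The probabilities in `cell_probs` and the helper `to_label` are made up for illustration; they are not part of the question.

```python
from itertools import product

# Hypothetical cell probabilities (p11, p10, p01, p00); any nonnegative
# values that sum to 1 would do.
cell_probs = {"11": 0.4, "10": 0.1, "01": 0.2, "00": 0.3}


def bernoulli_pmf(x1, x2, p=cell_probs):
    """Bivariate Bernoulli pmf in the product-of-powers form."""
    return (p["11"] ** (x1 * x2)
            * p["10"] ** (x1 * (1 - x2))
            * p["01"] ** ((1 - x1) * x2)
            * p["00"] ** ((1 - x1) * (1 - x2)))


def categorical_pmf(y, p=cell_probs):
    """Categorical pmf written with Iverson brackets [y = z]."""
    result = 1.0
    for z, prob in p.items():
        result *= prob ** int(y == z)
    return result


def to_label(x1, x2):
    """Map the pair (x1, x2) to the category label, e.g. (1, 0) -> '10'."""
    return f"{x1}{x2}"


# Pair up the two pmfs on every outcome of (x1, x2).
agreement = {
    (x1, x2): (bernoulli_pmf(x1, x2), categorical_pmf(to_label(x1, x2)))
    for x1, x2 in product((0, 1), repeat=2)
}
for pair, (p_bernoulli, p_categorical) in agreement.items():
    print(pair, p_bernoulli, p_categorical)
```

The two columns agree on every outcome, which is all the "representation" amounts to here: a relabeling of the four cells.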

However, the representation also goes the other way: with the following multivariate Bernoulli random vector $Z$, a categorical distribution can be represented using a multivariate Bernoulli.

Let $Z=(Z_1,Z_2,Z_3,Z_4).$ Each $Z_i$ is a Bernoulli variable that takes the value zero or one. $Z$ differs from a general multivariate Bernoulli vector in that exactly one of the four variables takes the value one.

On this support (vectors with exactly one entry equal to one), the pmf of this random vector can be written as

$$p(z_1,z_2,z_3,z_4)=p_{1000}^{z_1(1-z_2)(1-z_3)(1-z_4)}p_{0100}^{(1-z_1)z_2(1-z_3)(1-z_4)}p_{0010}^{(1-z_1)(1-z_2)z_3(1-z_4)}p_{0001}^{(1-z_1)(1-z_2)(1-z_3)z_4}.$$

Now we have a multivariate Bernoulli random vector that represents the categorical variable above.
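As a rough sketch of this one-hot construction: the ordering of categories onto coordinates (11 to $z_1$, 10 to $z_2$, 01 to $z_3$, 00 to $z_4$) and the numeric probabilities are my assumptions, and returning 0 off the support is a choice made here, since the product formula is only meaningful when exactly one entry equals one.

```python
# Assumed ordering: category "11" -> z1, "10" -> z2, "01" -> z3, "00" -> z4.
CATEGORIES = ("11", "10", "01", "00")
# Hypothetical probabilities (p1000, p0100, p0010, p0001).
probs = (0.4, 0.1, 0.2, 0.3)


def one_hot(y):
    """Encode a category label as a 0/1 vector with a single 1."""
    return tuple(int(y == c) for c in CATEGORIES)


def one_hot_pmf(z, p=probs):
    """Product-form pmf of Z, restricted to one-hot vectors."""
    if any(v not in (0, 1) for v in z) or sum(z) != 1:
        return 0.0  # outside the support (a choice made in this sketch)
    result = 1.0
    for i, p_i in enumerate(p):
        # Exponent is z_i * prod_{j != i} (1 - z_j): 1 only for the active cell.
        exponent = 1
        for j, z_j in enumerate(z):
            exponent *= z_j if j == i else (1 - z_j)
        result *= p_i ** exponent
    return result


for label in CATEGORIES:
    print(label, one_hot(label), one_hot_pmf(one_hot(label)))
```

Note that evaluating the raw product formula on a vector such as $(1,1,0,0)$ would give 1, because every exponent vanishes, so the support restriction has to be imposed separately.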

**My question is: what is the relationship between these two random variables/vectors and their associated distributions?**

**Focusing on the $n=2$ case**

Let me introduce the following probability mass function:
\begin{align*}
p(y_1, y_2) = \pi_1^{y_1}(1-\pi_1)^{1-y_1}\pi_2^{y_2}(1-\pi_2)^{1-y_2}\left(1 + \rho \frac{(y_1 - \pi_1)(y_2 - \pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right)
\end{align*}
which is known as *Bahadur's model*. You can indeed verify that
\begin{align*}
\sum_{(y_1, y_2) \in \{0, 1\}^2} p(y_1, y_2) &= 1 \\
\text{Corr}(Y_1, Y_2) &= \rho
\end{align*}
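If you'd rather check these two identities numerically than by hand, here is a small sketch. The parameter values $(0.3, 0.6, 0.25)$ are arbitrary choices of mine, picked so that all four cell probabilities stay nonnegative.

```python
from itertools import product
from math import sqrt


def bahadur_pmf(y1, y2, pi1, pi2, rho):
    """Bivariate Bahadur pmf: independent part times a correlation correction."""
    independent = (pi1 ** y1 * (1 - pi1) ** (1 - y1)
                   * pi2 ** y2 * (1 - pi2) ** (1 - y2))
    scale = sqrt(pi1 * (1 - pi1) * pi2 * (1 - pi2))
    return independent * (1 + rho * (y1 - pi1) * (y2 - pi2) / scale)


def summarize(pi1, pi2, rho):
    """Return (total mass, E[Y1], E[Y2], Corr(Y1, Y2)) by direct enumeration."""
    cells = {
        (y1, y2): bahadur_pmf(y1, y2, pi1, pi2, rho)
        for y1, y2 in product((0, 1), repeat=2)
    }
    total = sum(cells.values())
    mean1 = sum(y1 * p for (y1, _), p in cells.items())
    mean2 = sum(y2 * p for (_, y2), p in cells.items())
    cov = sum((y1 - mean1) * (y2 - mean2) * p for (y1, y2), p in cells.items())
    # Variance of a 0/1 variable with mean m is m * (1 - m).
    corr = cov / sqrt(mean1 * (1 - mean1) * mean2 * (1 - mean2))
    return total, mean1, mean2, corr


# Arbitrary illustrative parameters (an assumption, not from the answer).
total, mean1, mean2, corr = summarize(0.3, 0.6, 0.25)
print(total, mean1, mean2, corr)
```

Up to floating-point error, the total mass is 1, the marginal means are $\pi_1$ and $\pi_2$, and the correlation is $\rho$.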
There is a bijection between $(p_{11}, p_{10}, p_{01}, p_{00})$ and $(\pi_1, \pi_2, \rho)$ through the relations
\begin{align*}
p_{11} &= \pi_1\pi_2\left(1 + \rho\frac{(1-\pi_1)(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\
p_{10} &= \pi_1(1-\pi_2)\left(1 - \rho\frac{(1-\pi_1)\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\
p_{01} &= (1-\pi_1)\pi_2\left(1 - \rho\frac{\pi_1(1-\pi_2)}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right) \\
p_{00} &= (1-\pi_1)(1-\pi_2)\left(1 + \rho\frac{\pi_1\pi_2}{\sqrt{\pi_1(1-\pi_1)\pi_2(1-\pi_2)}}\right)
\end{align*}
so Bahadur's model is just a parametrization of the bivariate binary model. Now let $\rho = -1$ and $\pi_1 = 1 - \pi_2 = \pi$. This gives
\begin{align*}
p_{11} &= 0 \\
p_{10} &= \pi \\
p_{01} &= 1-\pi \\
p_{00} &= 0
\end{align*}
So, the two-category categorical model is just a special case of Bahadur's model with correlation $\rho = -1$ and $\pi_1 = 1 - \pi_2$. This makes sense: a categorical random variable is essentially a multivariate binary vector whose entries are so strongly negatively correlated that exactly one category is selected. We use this to generalize the result.
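Here is a short sketch of the map from $(\pi_1, \pi_2, \rho)$ to the four cell probabilities, its inverse, and the $\rho = -1$ special case. The function names are mine, and the inverse assumes $0 < \pi_1, \pi_2 < 1$ so that the correlation is defined.

```python
from math import sqrt


def bahadur_to_cells(pi1, pi2, rho):
    """Map (pi1, pi2, rho) to (p11, p10, p01, p00)."""
    scale = sqrt(pi1 * (1 - pi1) * pi2 * (1 - pi2))
    p11 = pi1 * pi2 * (1 + rho * (1 - pi1) * (1 - pi2) / scale)
    p10 = pi1 * (1 - pi2) * (1 - rho * (1 - pi1) * pi2 / scale)
    p01 = (1 - pi1) * pi2 * (1 - rho * pi1 * (1 - pi2) / scale)
    p00 = (1 - pi1) * (1 - pi2) * (1 + rho * pi1 * pi2 / scale)
    return p11, p10, p01, p00


def cells_to_bahadur(p11, p10, p01, p00):
    """Inverse map; assumes both marginals lie strictly between 0 and 1."""
    pi1 = p11 + p10  # P(Y1 = 1)
    pi2 = p11 + p01  # P(Y2 = 1)
    rho = (p11 - pi1 * pi2) / sqrt(pi1 * (1 - pi1) * pi2 * (1 - pi2))
    return pi1, pi2, rho


# Special case: rho = -1 and pi1 = 1 - pi2 = pi (pi = 0.3 is an arbitrary choice).
pi = 0.3
special = bahadur_to_cells(pi, 1 - pi, -1.0)
print(special)  # approximately (0, pi, 1 - pi, 0), up to rounding

# Round trip on arbitrary illustrative parameters.
round_trip = cells_to_bahadur(*bahadur_to_cells(0.3, 0.6, 0.25))
print(round_trip)
```

Not every triple $(\pi_1, \pi_2, \rho)$ yields nonnegative cells, so in practice $\rho$ is constrained by the marginals; the special case above sits exactly on that boundary.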

**Generalizing the result**

Bahadur's model can be extended to multivariate binary random vectors $(y_1, \cdots, y_n)$ with the representation
\begin{align*}
p(y_1, \cdots, y_n) = \left[\prod_{i=1}^n\pi_i^{y_i}(1-\pi_i)^{1-y_i}\right]\left(1 + \sum_{k=2}^{n}\rho_k\text{Sym}_k(\mathbf{r}_n)\right)
\end{align*}
where
\begin{align*}
r_i &= \frac{y_i - \pi_i}{\sqrt{\pi_i(1-\pi_i)}} \\
\mathbf{r}_n &= (r_1, \cdots, r_n) \\
\text{Sym}_k(\mathbf{r}_n) &= \sum_{i_1<\cdots<i_k}r_{i_1}\cdots r_{i_k}
\end{align*}
I'm not entirely sure what choice of the parameters can lead to a genuine categorical random variable (will think about this and post if I have a positive result), but this is a starting place.
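For anyone who wants to experiment with parameter choices, a minimal sketch of this $n$-variate form follows. The names and the parameter values are assumptions of mine; the code only checks normalization and does not answer which parameters give a categorical variable.

```python
from itertools import combinations, product
from math import prod, sqrt


def sym(k, r):
    """Sym_k(r): sum of products over all index sets i_1 < ... < i_k."""
    return sum(prod(subset) for subset in combinations(r, k))


def bahadur_pmf_n(y, pis, rhos):
    """n-variate form above; rhos maps each order k (2..n) to rho_k."""
    independent = prod(p ** yi * (1 - p) ** (1 - yi) for yi, p in zip(y, pis))
    r = [(yi - p) / sqrt(p * (1 - p)) for yi, p in zip(y, pis)]
    correction = sum(rho_k * sym(k, r) for k, rho_k in rhos.items())
    return independent * (1 + correction)


# Arbitrary illustrative parameters for n = 3 (assumptions, not from the answer).
pis = (0.3, 0.5, 0.6)
rhos = {2: 0.1, 3: 0.05}

table = {y: bahadur_pmf_n(y, pis, rhos) for y in product((0, 1), repeat=3)}
total = sum(table.values())
print(total)
for y, p in table.items():
    print(y, p)
```

The total comes out to 1 for any choice of $\rho_k$, because each $r_i$ has mean zero under the independent part. Nonnegativity of every cell is not automatic and has to be checked for each parameter choice.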

Answered by Tom Chen on January 7, 2022
