How to get the maximum likelihood estimate of the categorical distribution parameters using Lagrange optimization?

Data Science Asked by Shashank Kumar on March 7, 2021

Let’s say our data is discrete-valued and belongs to one of $K$ classes.
The underlying probability distribution is assumed to be a categorical/multinoulli distribution given as $p(\mathbf{x}) = \prod_{k = 1}^{K}\mu_{k}^{x_{k}}$, where $\mathbf{x} = [x_{1} \; x_{2} \; \dots \; x_{K}]^{T}$ is a one-hot vector and $\boldsymbol{\mu} = [\mu_{1} \; \dots \; \mu_{K}]^{T}$ are the parameters.
Suppose $D = \{\mathbf{x}_{1}, \, \mathbf{x}_{2}, \, \dots, \, \mathbf{x}_{N}\}$ is our data.

The log likelihood is:
$\log p(D \mid \boldsymbol{\mu}) = \sum_{k = 1}^{K} m_{k} \log{\mu_{k}}$
where $m_{k} = \sum_{n = 1}^{N} x_{nk}$.
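To make the setup concrete, here is a minimal sketch (the toy one-hot dataset and function names are my own, not part of any library) of the counts $m_k$ and the log-likelihood above:

```python
import math

# Hypothetical toy dataset: N = 5 one-hot observations over K = 3 classes
# (class indices 0, 1, 1, 2, 1).
data = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
K = 3

# m_k = sum_n x_{nk}: how many observations fall in class k.
m = [sum(x[k] for x in data) for k in range(K)]

def log_likelihood(mu, m):
    """log p(D | mu) = sum_k m_k * log(mu_k)."""
    return sum(mk * math.log(muk) for mk, muk in zip(m, mu))

print(m)                                       # class counts
print(log_likelihood([1 / K] * K, m))          # at uniform parameters
```

Evaluating at the uniform parameter vector just gives $N \log(1/K)$, since every observation contributes the same term.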

To get the MLE solution, we have to solve the following optimization problem:
$\max_{\boldsymbol{\mu}} \sum_{k = 1}^{K} m_{k} \log{\mu_{k}} \quad \text{such that} \quad \mu_{k} \geq 0, \quad \sum_{k = 1}^{K} \mu_{k} = 1$

To solve this, we write the following Lagrangian (multipliers $u_k \geq 0$ for the inequality constraints, $v$ for the equality constraint):
$L(\boldsymbol{\mu}, \mathbf{u}, v) = \sum_{k = 1}^{K} m_{k} \log{\mu_{k}} + \sum_{k = 1}^{K} u_{k}\mu_{k} + v\left( \sum_{k = 1}^{K}\mu_{k} - 1\right)$

The primal problem formulation is then
$\hat{\boldsymbol{\mu}} = \arg\max_{\boldsymbol{\mu}} \, \inf_{u_{k} \geq 0, \, v} L(\boldsymbol{\mu}, \mathbf{u}, v)$

I have no idea how to proceed from here — how do I actually solve this primal problem?
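For what it’s worth, a quick numerical check (with hypothetical counts $m = [1, 3, 1]$, not real data) suggests the constrained maximum sits at $\mu_k = m_k / N$ — random points on the probability simplex never beat it — but I don’t see how to arrive there via the Lagrangian:

```python
import math
import random

random.seed(0)

m = [1, 3, 1]      # hypothetical class counts m_k
N = sum(m)         # number of observations

def log_likelihood(mu):
    return sum(mk * math.log(muk) for mk, muk in zip(m, mu))

# Candidate solution: empirical frequencies mu_k = m_k / N.
mu_hat = [mk / N for mk in m]
best = log_likelihood(mu_hat)

# Compare against random interior points of the simplex
# (normalized positive weights: feasible, all mu_k > 0).
for _ in range(10_000):
    w = [random.random() + 1e-9 for _ in m]
    s = sum(w)
    mu = [wi / s for wi in w]
    assert log_likelihood(mu) <= best + 1e-12  # never exceeds the candidate

print(mu_hat, best)
```

Since the objective is concave and the feasible set is convex, such a point would be the unique global maximizer if it is indeed a stationary point of the Lagrangian.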
