
How do the authors get this update formula for all $\beta$ in the $\beta$-divergence?

Data Science Asked by Akira on August 8, 2021

I’m reading the paper Algorithms for nonnegative matrix factorization with the β-divergence by Cédric Févotte and Jérôme Idier. The scikit-learn package uses their algorithm in the sklearn.decomposition.NMF module (a minimal usage sketch is below).
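For reference, this is roughly how the $\beta$-divergence is selected in scikit-learn; the toy data and the value `beta_loss=1.5` are placeholders I picked for illustration, not the configuration from the paper:

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy nonnegative data matrix V (n_samples x n_features); random placeholder values.
rng = np.random.default_rng(0)
V = rng.random((50, 30))

# beta_loss selects the beta-divergence (2 = Frobenius, 1 = Kullback-Leibler,
# 0 = Itakura-Saito, or any float) and must be paired with the multiplicative-update
# solver 'mu'; 'nndsvda' initialization avoids exact zeros, which 'mu' cannot escape.
model = NMF(n_components=5, solver="mu", beta_loss=1.5, init="nndsvda", max_iter=500)
W = model.fit_transform(V)  # V is approximated by W @ H
H = model.components_
```

In Section 4.1, they said: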

An MM algorithm can be derived by minimizing the auxiliary function $G(\mathbf{h} \mid \tilde{\mathbf{h}})$ w.r.t. $\mathbf{h}$. Given the convexity and the separability of the auxiliary function, the optimum is obtained by canceling the gradient given by Eq. (36). This is trivially done and leads to the following update:
$$
h_{k}^{\mathrm{MM}} = \tilde{h}_{k}\left(\frac{\sum_{f} w_{f k}\, v_{f}\, \tilde{v}_{f}^{\beta-2}}{\sum_{f} w_{f k}\, \tilde{v}_{f}^{\beta-1}}\right)^{\gamma(\beta)}.
$$

The gradient in Eq. (36) is

[screenshot of Eq. (36) from the paper: the gradient of the auxiliary function $G(\mathbf{h} \mid \tilde{\mathbf{h}})$ with respect to $h_k$, written in terms of the decomposition of the $\beta$-divergence]

This gradient depends on our choice of the decomposition of the $\beta$-divergence. I don’t get how the authors obtain such an explicit formula for $h_{k}^{\mathrm{MM}}$ for every value of $\beta$. Could you please elaborate on this issue?
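To make my confusion concrete: if I specialize to $1 \le \beta \le 2$, where $d_\beta(x \mid y)$ is convex in $y$ and (as far as I can tell) the concave part of the decomposition vanishes, my own computation of the gradient of the auxiliary function (not the paper's Eq. (36) verbatim) gives
$$
\nabla_{h_k} G(\mathbf{h} \mid \tilde{\mathbf{h}}) = \sum_{f} w_{fk}\left[\left(\tilde{v}_{f}\,\frac{h_k}{\tilde{h}_k}\right)^{\beta-1} - v_{f}\left(\tilde{v}_{f}\,\frac{h_k}{\tilde{h}_k}\right)^{\beta-2}\right],
$$
and canceling it yields
$$
\left(\frac{h_k}{\tilde{h}_k}\right)^{\beta-1}\sum_{f} w_{fk}\,\tilde{v}_{f}^{\beta-1} = \left(\frac{h_k}{\tilde{h}_k}\right)^{\beta-2}\sum_{f} w_{fk}\,v_{f}\,\tilde{v}_{f}^{\beta-2} \quad\Longrightarrow\quad h_k = \tilde{h}_k\,\frac{\sum_{f} w_{fk}\,v_{f}\,\tilde{v}_{f}^{\beta-2}}{\sum_{f} w_{fk}\,\tilde{v}_{f}^{\beta-1}},
$$
which matches $h_{k}^{\mathrm{MM}}$ with $\gamma(\beta) = 1$. What I don't see is how the same cancellation goes through for $\beta < 1$ and $\beta > 2$, where the convex-concave split changes the gradient and $\gamma(\beta) \neq 1$.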
