Physics Asked by Gauss57 on July 4, 2021
I’m trying to wrap my head around section 2.2.6 (on variational principles) in the following paper (on the inverse Ising problem): https://arxiv.org/abs/1702.01522
Here the authors explain how to use variational principles to approximate the following thermodynamic potentials:
$$ F(\boldsymbol{J}, \vec{h}) = -\ln{Z(\boldsymbol{J}, \vec{h})}, $$
$$ S(\boldsymbol{\chi}, \vec{m}) = \min_{\boldsymbol{J}, \vec{h}}\left\{ -\sum_{i} h_i m_i - \sum_{i < j} J_{ij} \chi_{ij} - F(\boldsymbol{J}, \vec{h}) \right\}, $$
$$ G(\boldsymbol{J}, \vec{m}) = \max_{\vec{h}}\left\{ \sum_{i} h_i m_i + F(\boldsymbol{J}, \vec{h}) \right\}. $$
I understand that $F(\boldsymbol{J}, \vec{h})$ can be obtained by noting that for a distribution $q$,
$$ D_{KL}(q \| p) = \langle H \rangle_q + \langle \ln q \rangle_q + \ln Z(\boldsymbol{J}, \vec{h}), $$
so that, using the notation $U[q] \equiv \langle H \rangle_q$ and $S[q] \equiv -\langle \ln q \rangle_q$, we have:
$$ F(\boldsymbol{J}, \vec{h}) = U[q] - S[q] - D_{KL}(q \| p) \quad \longrightarrow \quad F(\boldsymbol{J}, \vec{h}) = \min_q \{ U[q] - S[q] \}, $$
where the second expression follows because $D_{KL}(q \| p) \geq 0$, with equality if and only if $q = p$.
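As a numerical sanity check (my own, not from the paper), the Gibbs variational principle $F = \min_q \{ U[q] - S[q] \}$ can be verified by brute force on a two-spin Ising model: the functional equals $F = -\ln Z$ exactly at the Boltzmann distribution $p$, and exceeds it for any other $q$. The couplings and fields below are arbitrary illustrative values.

```python
import itertools
import math

# Hypothetical two-spin Ising model (arbitrary J, h; beta = 1)
J12, h = 0.7, [0.3, -0.2]

states = list(itertools.product([-1, 1], repeat=2))

def energy(s):
    """H(s) = -sum_i h_i s_i - J12 s_1 s_2."""
    return -h[0]*s[0] - h[1]*s[1] - J12*s[0]*s[1]

Z = sum(math.exp(-energy(s)) for s in states)
F = -math.log(Z)                                  # exact free energy

p = [math.exp(-energy(s)) / Z for s in states]    # Boltzmann distribution

def functional(q):
    """Variational free energy U[q] - S[q] = <H>_q + <ln q>_q."""
    U = sum(qi * energy(s) for qi, s in zip(q, states))
    S = -sum(qi * math.log(qi) for qi in q if qi > 0)
    return U - S

# The minimum is attained at q = p; any other q (e.g. uniform) is larger.
assert abs(functional(p) - F) < 1e-12
assert functional([0.25, 0.25, 0.25, 0.25]) > F
```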
However, the authors then go on to say that it is easy to see (using Lagrange multipliers) that the remaining two potentials can be expressed as:
$$ G(\boldsymbol{J}, \vec{m}) = \max_{\vec{h}}\left\{ \sum_{i} h_i m_i + \min_q \{ U[q] - S[q] \} \right\} = \min_{q \in \mathcal{G}}\left\{ -\sum_{i < j} J_{ij} \langle \sigma_i \sigma_j \rangle_q - S[q] \right\}, $$
and
$$ S(\boldsymbol{\chi}, \vec{m}) = \min_{\boldsymbol{J}, \vec{h}}\left\{ -\sum_{i} h_i m_i - \sum_{i < j} J_{ij} \chi_{ij} - \min_q \{ U[q] - S[q] \} \right\} = \max_{q \in \mathcal{S}} \{ S[q] \}, $$
where $\mathcal{G} \equiv \{ q : \langle \sigma_i \rangle_q = m_i \}$ and $\mathcal{S} \equiv \{ q : \langle \sigma_i \rangle_q = m_i \text{ and } \langle \sigma_i \sigma_j \rangle_q = \chi_{ij} \}$.
My question is: how do I, via Lagrange multipliers, go from these double-extremum expressions to the single-extremum expressions? Thanks in advance for any help I may receive.
I think I understand how to get the correct expressions now; however, I did not use Lagrange multipliers:
We want to show that
$$ S(\boldsymbol{\chi}, \vec{m}) \equiv \min_{\boldsymbol{J}, \vec{h}}\left\{ -\sum_{i} h_i m_i - \sum_{i < j} J_{ij} \chi_{ij} - \min_q \{ U[q] - S[q] \} \right\} $$ can be expressed as $$ S(\boldsymbol{\chi}, \vec{m}) = \max_{q \in \mathcal{S}} \{ S[q] \}. $$
Note first that $-\min_q \{ U[q] - S[q] \} = \max_q \{ -U[q] + S[q] \}$. Let
$$ S(\boldsymbol{\chi}, \vec{m}; \boldsymbol{J}, \vec{h}; q) = -\sum_i h_i m_i - \sum_{i < j} J_{ij} \chi_{ij} - U[q] + S[q]. $$
Then, since
$$ U[q] \equiv \langle H \rangle_q = \left\langle -\sum_i h_i \sigma_i - \sum_{i < j} J_{ij} \sigma_i \sigma_j \right\rangle_q = -\sum_i h_i \langle \sigma_i \rangle_q - \sum_{i < j} J_{ij} \langle \sigma_i \sigma_j \rangle_q, $$
if we restrict the distributions $q$ to distributions
$$ q \in \mathcal{S} \equiv \{ q : \langle \sigma_i \rangle_q = m_i \text{ and } \langle \sigma_i \sigma_j \rangle_q = \chi_{ij} \}, $$
we obtain
\begin{align*}
S(\boldsymbol{\chi}, \vec{m}; \boldsymbol{J}, \vec{h}; q) &= -\sum_i h_i m_i - \sum_{i < j} J_{ij} \chi_{ij} - \Big( -\sum_i h_i \langle \sigma_i \rangle_q - \sum_{i < j} J_{ij} \langle \sigma_i \sigma_j \rangle_q \Big) + S[q] \\
&= \Big( -\sum_i h_i m_i - \sum_{i < j} J_{ij} \chi_{ij} + \sum_i h_i m_i + \sum_{i < j} J_{ij} \chi_{ij} \Big) + S[q] \\
&= S[q]. \quad (q \in \mathcal{S})
\end{align*}
Thus, $S(\boldsymbol{\chi}, \vec{m}; \boldsymbol{J}, \vec{h}; q) = S(\boldsymbol{\chi}, \vec{m}; q)$, i.e., there is no dependency on $\boldsymbol{J}$ and $\vec{h}$. To get $S(\boldsymbol{\chi}, \vec{m})$, then, we simply re-introduce the maximization over $q$ (with our condition that $q \in \mathcal{S}$):
$$ S(\boldsymbol{\chi}, \vec{m}) = \max_{q \in \mathcal{S}} \{ S[q] \}. $$
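This identity can also be checked numerically (a sketch of my own, not from the paper). For two spins, the constraints $\langle \sigma_1 \rangle_q = m_1$, $\langle \sigma_2 \rangle_q = m_2$, $\langle \sigma_1 \sigma_2 \rangle_q = \chi$ fix $q$ uniquely, so $\max_{q \in \mathcal{S}} S[q]$ is just the entropy of that $q$; the double-extremum side $\min_{\boldsymbol{J}, \vec{h}} \{ -\sum_i h_i m_i - J\chi + \ln Z \}$ can be found by gradient descent, since $\partial \ln Z / \partial h_i = \langle \sigma_i \rangle_p$ and $\partial \ln Z / \partial J = \langle \sigma_1 \sigma_2 \rangle_p$. The target moments below are arbitrary illustrative values.

```python
import itertools
import math

# Target moments (arbitrary, chosen so all probabilities are positive)
m1, m2, chi = 0.3, -0.1, 0.2

states = list(itertools.product([-1, 1], repeat=2))

# Unique two-spin q matching the moments:
# q(s) = (1 + m1 s1 + m2 s2 + chi s1 s2) / 4
q = [(1 + m1*s[0] + m2*s[1] + chi*s[0]*s[1]) / 4 for s in states]
S_target = -sum(qi * math.log(qi) for qi in q)   # max_{q in S} S[q]

# Minimize -h.m - J*chi + ln Z(J,h) by gradient descent; the gradient
# is the moment mismatch of the current Boltzmann distribution p.
h1 = h2 = J = 0.0
for _ in range(20000):
    w = [math.exp(h1*s[0] + h2*s[1] + J*s[0]*s[1]) for s in states]
    Z = sum(w)
    p = [wi / Z for wi in w]
    g1 = sum(pi * s[0] for pi, s in zip(p, states)) - m1
    g2 = sum(pi * s[1] for pi, s in zip(p, states)) - m2
    gJ = sum(pi * s[0] * s[1] for pi, s in zip(p, states)) - chi
    h1 -= 0.5 * g1
    h2 -= 0.5 * g2
    J -= 0.5 * gJ

# Evaluate the objective at the final parameters
w = [math.exp(h1*s[0] + h2*s[1] + J*s[0]*s[1]) for s in states]
value = -h1*m1 - h2*m2 - J*chi + math.log(sum(w))

# Both sides of the identity agree
assert abs(value - S_target) < 1e-8
```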
A similar argument yields the correct expression for $G(boldsymbol{J}, vec{m})$.
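For $G$, the simplest check I could construct (again my own toy reduction, not from the paper) is a single spin with no couplings, where the identity reduces to $G(m) = \max_h \{ h m + F(h) \} = \min_{q : \langle \sigma \rangle_q = m} \{ -S[q] \}$ with $F(h) = -\ln(2\cosh h)$; the constrained $q$ is uniquely $q(\pm 1) = (1 \pm m)/2$, so the right-hand side is minus the binary entropy.

```python
import math

m = 0.4   # arbitrary target magnetization, |m| < 1

# Left side: maximize h*m + F(h), F(h) = -ln(2 cosh h), on a fine grid.
def objective(h):
    return h*m - math.log(2 * math.cosh(h))

left = max(objective(-5 + 1e-4 * k) for k in range(100001))

# Right side: the unique q with <sigma>_q = m has q(+1) = (1+m)/2,
# so minimizing -S[q] under the constraint just evaluates -S there.
p_up = (1 + m) / 2
S_q = -(p_up * math.log(p_up) + (1 - p_up) * math.log(1 - p_up))
right = -S_q

assert abs(left - right) < 1e-6
```

(The maximizer is $h = \operatorname{atanh}(m)$, which is exactly the Legendre-transform relation $m = \tanh h$ for a free spin.)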
Correct answer by Gauss57 on July 4, 2021