Mathematics Asked on November 16, 2021
Consider the three matrices $\mathbf{C}$, $\mathbf{A}$, and $\mathbf{T}$. The matrix $\mathbf{C}$ has $\mathit{m} \times \mathit{k}$ entries, $\mathbf{A}$ is a $\mathit{k} \times \mathit{n}$ matrix, and $\mathbf{T}$ is an $\mathit{m} \times \mathit{n}$ matrix.
I’d like to evaluate the following matrix derivative:
$$\frac{\partial}{\partial\mathbf{C}}\bigl( (\mathbf{C}\mathbf{A}) \circ \mathbf{T} \bigr)$$
Here $\circ$ denotes the Hadamard product. The dimensions of this expression are consistent since $\mathbf{CA}$ is an $\mathit{m} \times \mathit{n}$ matrix. Both $\mathbf{A}$ and $\mathbf{T}$ are constant matrices with respect to $\mathbf{C}$.
I’m wondering how I can evaluate and then express this result. I know that since the expression I am taking the derivative of is an $\mathit{m} \times \mathit{n}$ matrix, and $\mathbf{C}$ is an $\mathit{m} \times \mathit{k}$ matrix, the result of this derivative expression will have $\mathit{m} \times \mathit{n} \times \mathit{m} \times \mathit{k}$ entries.
I’d appreciate any answer, including one in index notation.
Thank you for your time.
It's probably simpler to vectorize the matrix equation, and then to eliminate the
Hadamard product in favor of multiplication by a diagonal matrix, i.e.
$$\eqalign{
F &= T\circ CA \\
{\rm vec}(F) &= {\rm vec}(T)\circ {\rm vec}(CA) \\
&= {\rm Diag}\big({\rm vec}(T)\big)\,(A^T\otimes I)\,{\rm vec}(C) \\
\frac{\partial f}{\partial c}
=
\frac{\partial\,{\rm vec}(F)}{\partial\,{\rm vec}(C)}
&= {\rm Diag}\big({\rm vec}(T)\big)\,(A^T\otimes I) \\
}$$
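As a sanity check, the vectorized formula can be tested numerically. The sketch below is not part of the original answer: it assumes NumPy, small arbitrary sizes, random test matrices, and the usual column-stacking convention for ${\rm vec}$ (so $I$ is the $m\times m$ identity). Because the map is linear in $C$, multiplying the Jacobian by ${\rm vec}(C)$ should reproduce ${\rm vec}(F)$.

```python
import numpy as np

# Illustrative sizes and random data (assumptions, not from the question).
rng = np.random.default_rng(0)
m, k, n = 3, 4, 2
C = rng.standard_normal((m, k))
A = rng.standard_normal((k, n))
T = rng.standard_normal((m, n))


def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")


# Jacobian from the formula: Diag(vec(T)) (A^T kron I_m), shape (m*n, m*k).
J = np.diag(vec(T)) @ np.kron(A.T, np.eye(m))

# The map C -> T o (CA) is linear, so J @ vec(C) must equal vec(T o (CA)).
jacobian_matches = bool(np.allclose(J @ vec(C), vec(T * (C @ A))))
print(J.shape, jacobian_matches)
```

Note that `order="F"` matters here: NumPy's default row-major flattening corresponds to a different ${\rm vec}$ convention and would need $I\otimes A^T$ instead.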
If you really want a fourth order tensor, there is a straightforward one-to-one mapping between the matrix and its vectorized form, e.g.
$$\eqalign{
F &\in {\mathbb R}^{m\times n} \quad\iff\quad
f \in {\mathbb R}^{mn\times 1} \\
F_{ij} &= f_{\alpha} \\
\alpha &= i+(j-1)\,m \\
i &= 1+(\alpha-1)\,{\rm mod}\,m \\
j &= 1+(\alpha-1)\,{\rm div}\,m \\
}$$
So you can convert the gradient matrix into a tensor
$$\eqalign{
\frac{\partial f_\alpha}{\partial c_\beta}
= \frac{\partial F_{ij}}{\partial C_{k\ell}} \\
}$$
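In code, this index mapping is just a column-major reshape. The following is a minimal sketch under the same assumptions as before (NumPy, arbitrary sizes, random data); the names `J`, `G`, and the index letters `p`, `l` are chosen for illustration, and indices are 0-based, so the map reads $\alpha = i + j\,m$ and $\beta = p + \ell\,m$.

```python
import numpy as np

# Illustrative sizes and random data (assumptions, not from the answer).
rng = np.random.default_rng(1)
m, k, n = 3, 4, 2
A = rng.standard_normal((k, n))
T = rng.standard_normal((m, n))

# Gradient matrix from the vectorized formula, shape (m*n, m*k).
J = np.diag(T.reshape(-1, order="F")) @ np.kron(A.T, np.eye(m))

# Column-major reshape: G[i, j, p, l] = dF_ij / dC_pl (0-based indices).
G = J.reshape((m, n, m, k), order="F")

# Check every entry against the explicit 0-based index map
# alpha = i + j*m (row of J) and beta = p + l*m (column of J).
index_map_ok = all(
    G[i, j, p, l] == J[i + j * m, p + l * m]
    for i in range(m)
    for j in range(n)
    for p in range(m)
    for l in range(k)
)
print(G.shape, index_map_ok)
```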
Answered by greg on November 16, 2021
Because $C \mapsto (CA) \circ T$ is a linear map, it is very easy to compute the derivative in differential form. In particular, we have
$$ D_C(C_0)(dC) = (dC\,A) \circ T. $$
Now, let $E_{ij}$ denote the matrix with a $1$ in the $i,j$ entry and zeros elsewhere. Let $e_i$ denote the column vector with a $1$ in the $i$ entry and zeros elsewhere. We have
$$ \frac{\partial f}{\partial C_{ij}}|_{C = C_0} = D_C(C_0)(E_{ij}) = (E_{ij}A) \circ T = (e_ie_j^\top A) \circ T = \bigl(e_i(A^\top e_j)^\top\bigr) \circ T = E_{ii} T \operatorname{diag}(A^\top e_j). $$
In index notation, one might write
$$ \frac{\partial f_{pq}}{\partial C_{ij}} = e_p^\top E_{ii} T \operatorname{diag}(A^\top e_j)e_q = \delta_{ip}\, e_i^\top T A_{jq}e_q = \delta_{ip} A_{jq}T_{iq}, $$
with no summation implied.
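The closed form $\delta_{ip} A_{jq} T_{iq}$ can be checked against the differential applied to each basis matrix $E_{ij}$. This is only an illustrative sketch, assuming NumPy and random test data; the array name `D` and its index layout `D[p, q, i, j]` are choices made here, not notation from the answer.

```python
import numpy as np

# Illustrative sizes and random data (assumptions, not from the answer).
rng = np.random.default_rng(2)
m, k, n = 3, 4, 2
A = rng.standard_normal((k, n))
T = rng.standard_normal((m, n))

# Closed form: D[p, q, i, j] = delta_ip * A_jq * T_iq (no index is summed).
D = np.einsum("ip,jq,iq->pqij", np.eye(m), A, T)

# Reference: the differential (dC A) o T evaluated at dC = E_ij.
closed_form_ok = True
for i in range(m):
    for j in range(k):
        E = np.zeros((m, k))
        E[i, j] = 1.0
        same = np.allclose(D[:, :, i, j], (E @ A) * T)
        closed_form_ok = closed_form_ok and bool(same)
print(D.shape, closed_form_ok)
```

Since the map is linear, the derivative does not depend on the base point $C_0$, which is why no value of $C$ appears in the check.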
Answered by Ben Grossmann on November 16, 2021