Computational Science Asked by Vittore Scolari on May 28, 2021
I need to optimize a code where the most performance critical part is doing a ‘change of basis’, in other words it is an unitary similarity transformation on a big real positive definite symmetric matrix real matrix. This consists in the following operation: $U^T A U$, with $A$ real positive definite symmetric matrix and $U$ real unitary.
At the moment, I am achieving this using BLAS DGEMM two times. But I am not very satisfied since this is ignoring that the left hand of $A$ is equal to the transpose of the right hand of $A$. Also it is ignoring all nice $A$ properties.
Looking at all LAPACK routines that do unitary similarity transformations, not a single one seems to actually use DGEMM, am I missing a simple optimization opportunity?
For the most part, LAPACK represents unitary matrices like $mathbf U$ as products of householder reflectors, and provides specialized routines to work with that representation (for instance, use [dormqr] to multiply by $mathbf U$, or [dorgqr] to explicitly tabulate $mathbf U$ into a dense matrix).
If you already have $mathbf U$ in householder format, it would be more idiomatic to use [dormqr] to update A, instead of [dgemm]. In particular, two calls to [dormqr] can update $mathbf B = mathbf U^T mathbf A mathbf U$ in place (overwriting $mathbf A$ with $mathbf B$), while [dgemm] will require a temporary matrix. This point might be somewhat academic, though, as [dormqr] requires additional workspace while [dgemm] does not. Although there is a workspace-free routine [dorm2r] that performs a similar function as [dormqr], it's not recommended because it's a BLAS2 algorithm and won't be as fast.
I suppose if I already was just given $mathbf U$ explicitly tabulated, I'd probably just stick with [dgemm]. But if you have control over how $mathbf U$ is generated/computed (for instance, some hand-rolled gram-schmidt procedure applied to some other basis set), I'd strongly consider refactoring everything to use the LAPACK tooling instead (that is use [dgeqrf] to orthogonalize your basis into $mathbf U$ and then use [dormqr] to apply it, there might not be any need to ever form it explicitly).
Out of curiousity, what do you do with $mathbf B$ once you have it? There might be more refactoring/optimization opportunities if you can share more about the larger process.
Answered by rchilton1980 on May 28, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP