Calculating the log-determinant of a large sparse matrix

Question

I need to calculate $log(det (mathbf M_i))$ where the $mathbf M_i$'s  are large sparse matrices, which are real, symmetric and positive semi-definite. I hope to have between $10$ and $100$ of those. Physically, each $mathbf M_i$ is the stiffness matrix of a large system, and if it helps, I know how to write it as $mathbf M_i=mathbf B_i^Tmathbf B_i$ (where $mathbf B_i$ is not a square matrix).

I am not a computational scientist, so I would prefer using written packages rather than implementing my own algorithms. I currently work in MATLAB and/or Mathematica, but I am also fluent in C++, and am willing to work in any other enviroment if this could help.

What I've tried so far is calculating the eigenvalues (using MATLAB's eig) and summing up their logs, but I suspect that there are other, more efficient ways to do so. For example, I stumbled upon this paper, but I am not sure whether its algorithm is suitable in my case, and I could not find an implemented code.

Another complication is that it might happen that $mathbf M_i$ is not invertible, i.e. $detmathbf M_i=0$, and in this case I'm interested in the product of all the non-zero eigenvalues (in these cases I know exactly how many zero eigenvalues are there).

Any help would be much appreciated.

Nick Alger · Accepted Answer

Like Brian Borchers says in his comment, the determinant can be determined from the diagonal entries of the Cholesky factor.

Theory:

Consider the Cholesky factorization 
$$R^TR = S^TMS,$$ 
where

$R$ is an upper triangular matrix, and
$S$ is a permutation matrix.

Then 
begin{align}
text{det}(M) &= text{det}(SR^TRS^T) 
&= text{det}(S) text{det}(R^T) text{det}(R) text{det}(S^T) 
&= text{det}(R)^2.
end{align}

Since $R$ is triangular, its eigenvalues are the diagonal entries, hence
begin{align}
text{log}~text{det}(M) &= 2 text{log}~text{det}(R) 
&= 2 sum_i text{log}(R_{ii}).
end{align}

Code:

Matlab code to do this is, for example,

[R,~,S] = chol(M); 
log_det_chol = sum(2*log(diag(R)));

In python, the log determinant can be computed from the sparse cholesky factorization using the package scikits.sparse.cholmod. Paraphrasing from the linked page, the code to do this is:

from scikits.sparse.cholmod import cholesky
R = cholesky(M)
log_det_chol = R.logdet()

If you only want to consider the nonzero eigenvalues, you can just sum over the nonzero entries of the diagonal of $R$.

Test case:

I tested this against Matlab's build in det() function on the 2D finite difference Laplace operator with Dirichlet B.C.'s, and got exact agreement for matrices up to around 400-by-400, after which point using matlab's det() yielded Inf, whereas the cholesky method had no problem going up to million-by-million scale on my laptop. Here is the code, for reference:

n = 20; % M is n^2-by-n^2. Matlab's det() returns Inf for n=25+
M = gallery('poisson',n); 
[R,~,S] = chol(M); 
log_det_chol = sum(2*log(diag(R))), 
log_det = log(det(M))

Memory considerations:

Since this method is based on the Cholesky factorization, it inherits the speed and stability from that.

For large problems, the bottleneck using this method is probably going to be memory. In particular, using a Cholesky factorization with good pivoting (Matlab's is good), you can expect to need $O(s^2)$ memory, where $s$ is the number of degrees of freedom in the smallest "separator front" that partitions the degrees of freedom into distinct pieces.

For example, in a 2D problem with an n-by-n grid, the 1D separator front partitioning the domain into two pieces has $s = O(n)$ degrees of freedom. Thus amount of memory required is $O(n^2)$, the same order of magnitude as storing the right hand side.

A 3D n-by-n-by-n problem has a 2D separator front with $s = O(n^2)$ degrees of freedom, necessitating $O(n^4)$ memory.

greggo · Answer

Algorithmically, a fast, stable route to the determinant is to first convert to symmetric tri-diagonal form using Householder reflections (which is the first step that eig would use on a symmetric matrix; this sub-operation may be visible in matlab).

The determinant of a symmetric tri-diagonal matrix can be found by working along the diagonal in a fairly straightforward way. It requires multiplies and adds at each step, though, so if the final value (or intermediate values) are too large or small to be represented without being in log form, you would need to guard the process against over- or underflow.

For the case with zero eigenvalues - I'm not sure offhand what will happen here, if you wind up with 'gaps' in the diagonal (zero diagonal element with 4 adjacent off-diagonal elements also zero') then you can process the submatrices on each side separately. But it might not work out so cleanly. For that case you probably want to just use the 'eig without eigenvectors' op.

Matthew Lewis · Answer

I was fooling around with sparse representation of K_n (tridiagonal -1 2 -1)
which theoretically has a determinant of n+1. I ran across this old post.
Both SuperLU and CHOLMOD don't seem to have the precision to get the correct determinant once you get above n=1e6.
from scipy.sparse import diags
from scipy.sparse.linalg import splu
from sksparse.cholmod import cholesky
from math import exp

n=int(5e6)
K = diags([-1.],-1,shape=(n,n)) + diags([2.],shape=(n,n)) + diags([-1.],1,shape=(n,n))
lu = splu(K.tocsc())
diagL = lu.L.diagonal()
diagU = lu.U.diagonal()
det=diagL.prod()*diagU.prod()
print(det)

factor = cholesky(K.tocsc())
ld = factor.logdet()
print(exp(ld))

4999993.625461911
4999993.625461119

Calculating the log-determinant of a large sparse matrix

3 Answers

Add your own answers!

Ask a Question