MATH/APPM 4650
Class notes, April 25
Section 6.6: You're so very special. I wish I was special.


Gaussian elimination is the best algorithm for general matrices, about which we know nothing.
For special types of matrices, we can do better.

When you're trying to solve a linear algebra problem, see if the matrix you're dealing with has any special structure.

Most matrices that arise naturally in solving other problems numerically do have special structure.

The structures we will discuss today are:




Strictly diagonally dominant matrices

A typical diagonally dominant matrix is
/ 5 -1 2 \
| -1 7 5 |
\ 2 -2 6 /
Definition: In each row, the sum of absolute values of all nondiagonal elements is strictly less than the absolute value of the diagonal element.
Here we have 1+2<5, 1+5<7, and 2+2<6.

For example, the matrix used for finding cubic spline coefficients is always strictly diagonally dominant.

Theorem: (Proof on board, or see book.)

Consequence: A diagonally dominant matrix can always be written A = LU, where L has 1s on the diagonal.




Symmetric matrices

Definition: A matrix is symmetric if aij = aji for all i and j.

From linear algebra: A symmetric matrix always has real eigenvalues, and an orthonormal basis of eigenvectors.

Therefore we can in principle always write a symmetric matrix in the form
A = P D PT
where P is an orthogonal matrix (i.e., PPT = identity) and D is diagonal.

However, to do this, we need to find the eigenvalues and eigenvectors. This requires finding the characteristic polynomial (determinants!), solving it (Newton's method), then solving n different matrix equations (with n different matrices!) to find the eigenvectors. Long, hard, and painful.

However, there's a much easier factorization which is just as useful.
A = L D LT
where D is diagonal and L is lower-triangular with 1s on the diagonal.

Notice: the entries of D are probably not the eigenvalues of A.

It's easy to see how this works in the 2x2 case. (On board.)

In general, the algorithm looks like this.

% Given a symmetric matrix A (we only need the lower triangular entries), find the factorization A = LDL^T.
for i from 1 to n do
   D[i] = a[i,i]
   for j from 1 to i-1 do
      v[j] = L[i,j] * D[j]
      D[i] = D[i] - L[i,j] * v[j]
   end do
   for j from i+1 to n do
      L[j,i] = a[j,i]
      for k from 1 to i-1 do
         L[j,i] = L[j,i] - L[j,k] * v[k]
      end do
      L[j,i] = L[j,i] / D[i]
   end do
end do





Positive-definite matrices

Here is a typical positive definite matrix:
/ 5 -6 -1 \
| -6 26 3 |
\ -1 3 6 /
Definition: A matrix is positive-definite if it is symmetric and xTAx > 0 for every vector x.

One way to see that the matrix above is positive-definite is to compute its eigenvalues.
The eigenvalues of this matrix are about 28.04, 3.39, and 5.57.
(We know they'd always be real, since the matrix is symmetric, but they may not always be positive.)

In an orthonormal eigenvector basis e1, e2, e3, we can write a general vector as x = p e1 + q e2 + r e3.
If the matrix has eigenvalues λ, μ, and ν, then
xTAx = λ p2 + μ q2 + ν r2.

This proves that
a symmetric matrix is positive-definite if and only if all its eigenvalues are strictly positive.

Observe that a positive-definite matrix does not necessarily have all positive entries.
Also observe that a matrix which does have all positive entries may not be positive definite.
Consider for example
/ 1 1 \
\ 1 1 /
Eigenvalues are 0 and 2, so not positive-definite.

Usually it's a pain to compute eigenvalues (they involve determinants!).
So we want some idea of when a matrix is positive-definite.

We're not going to get it in this class.

But at least here are some simple properties of positive-definite matrices. (A matrix might satisfy all these properties but still not be positive-definite!)

Theorem: Suppose A is positive-definite. Then also,

If a matrix A is symmetric and positive definite, then when we factor A = LDLT, the entries of D will all be positive.

Therefore, we can write D = R2 where S is the diagonal matrix of square roots. Since S = ST, we get
A = (LS)(STLT) = (LS)(LS)T
Now if L is lower-triangular and S is diagonal, then LS is also lower triangular. So we get a factorization
A = L'(L')T
where L' is a lower triangular matrices (although probably not with 1s on the diagonal).

The algorithm to get this factorization directly just involves a slight modification of the factorization algorithm for symmetric matrices.
Taking a couple of shortcuts, it becomes Cholesky's algorithm.

% Given a positive-definite (symmetric) matrix A (we only need the lower diagonals), find L so that A = LL^T.
L[1,1] = sqrt(a[1,1])
for j from 2 to n do
   L[j,1] = a[j,1] / L[1,1]
end do
for i from 2 to n-1 do
   L[i,i] = a[i,i]
   for k from 1 to i-1 do
      L[i,i] = L[i,i] - L[i,k]^2
   end do
   L[i,i] = sqrt(L[i,i])
   for j from i+1 to n do
      L[j,i] = a[j,i]
      for k from 1 to i-1 do
         L[j,i] = L[j,i] - L[j,k]*L[i,k]
      end do
      L[j,i] = L[j,i] / L[i,i]
   end do
end do
L[n,n] = a[n,n]
for k from 1 to n-1 do
   L[n,n] = L[n,n] - L[n,k]^2
end do
L[n,n] = sqrt(L[n,n])


Counting up the number of computations, we get roughly n3/6 for both LDLT factorization and Cholesky's factorization.
Compare to Gaussian elimination, which is roughly n3/3 computations.
Using symmetry cuts the number of computations roughly in half!




Band/tridiagonal matrices

We can greatly reduce the number of computations if we happen to know that many entries in the matrix are zero.

The most helpful way to have zeros is when they are all far from the diagonal.

Diagonal matrices are obviously the simplest. The next simplest is tridiagonal: nonzero on the diagonal and one diagonal above and below, zeros elsewhere.

A typical example is the following.
/ 2 -1 0 0 \
| -1 2 -1 0 |
| 0 -1 2 -1 |
\ 0 0 -1 2 /


In general, we make the matrix upper-triangular using the algorithm
ak+1,k+1 = ak+1,k+1 - ak+1,kak,k+1/akk,
bk+1 = bk+1 - ak+1,kbk/akk
and then solve the remaining thing using the standard method:
xn = bn/ann,     xk = (bk - ak,k+1xk+1)/akk


Observe: one loop for the first bit, one loop for the second bit. So the whole algorithm is O(n)! Super-fast! It's called Crout's algorithm.

for k from 1 to n-1 do
   m = a[k+1,k] / a[k,k]
   a[k+1,k+1] = a[k+1,k+1] - m * a[k,k+1]
   b[k+1] = b[k+1] - m * b[k]
end do
x[n] = b[n] / a[n,n]
for j from 1 to n-1 do
   x[n-j] = (b[n-j] - a[n-j,n-j+1]*x[n-j+1]) / a[n-j,n-j]
end do


Number of multiplications: 5n-4. Number of additions: 3n-3.

More generally, a band matrix may have some fixed number of off-diagonals nonzero.
The bandwidth of a matrix is the number of diagonals that aren't zero.
A tridiagonal matrix has bandwidth 3.

Notice that
/ 1 0 0 1 0 0 \
| 0 1 0 0 1 0 |
| 0 0 1 0 0 1 |
| 1 0 0 1 0 0 |
| 0 1 0 0 1 0 |
\ 0 0 1 0 0 1 /
does not have bandwidth two. It has bandwidth four!
Intermediate diagonals count, even if they are zeros.