MATH/APPM 4650
Class notes, April 21
Section 6.2: Pivoting methods


Recall the basic algorithm for solving a matrix equation (3x3 for illustrative purposes):
/ a11 a12 a13 \ / x1 \ / b1 \
| a21 a22 a23 | | x2 | = | b2 |
\ a31 a32 a33 / \ x3 / \ b3 /


Given a[i,j] and b[i], solve for x[i].

% First make the matrix upper-triangular.
for k from 2 to n do
   for i from k to n do
      m = a[i,k-1] / a[k-1,k-1]
      for j from k to n do
         a[i,j] = a[i,j] - m*a[k-1,j]
      end do
      b[i] = b[i] - m*b[k-1]
   end do
end do

% Now solve for the x[i].
x[n] = b[n] / a[n,n]
for i from 1 to n-1 do
   x[n-i] = b[n-i]
   for j from n-i+1 to n do
      x[n-i] = x[n-i] - a[n-i,j]*x[j]
   end do
   x[n-i] = x[n-i] / a[n-i,n-i]
end do


Of course, in the second set of loops, one can do a backward "for" loop if one's program allows it (which all of our software does).

We counted the number of multiplications and divisions when solving for the matrix:
Similarly we can count the number of additions and subtractions in solving for the matrix, which is
Σk=2n (n-k+1)(n-k+2) = (n3 - n)/3.

We also want to count the number of multiplications and divisions in solving for x:
Σi=1n-1 i + n = (n2+n)/2.

And the number of additions and subtractions in solving for x:
Σi=1n-1 i = (n2-n)/2.

Observe:
The number of computations in solving for x is smaller than the number in solving for the matrix:
O(n2) to solve for x, vs. O(n3) to solve for the matrix.

This is easy to understand, since there are three loops in the matrix computation and only two loops in the x computation.

The important lesson is that even if we have O(n2) extra computations to do (to make a better algorithm), this doesn't substantially affect the overall length of the algorithm.



Partial pivoting
Now we have to worry about how the algorithm breaks down.
There are two things that can go wrong: one theoretical, and one practical.

Theoretical: if at any step we end up with an a[i,i] which is zero, and we try dividing by it, the program will break. This may happen even if the matrix has solutions.
Practical: if at any step we end up with an a[i,i] which is small, and we try dividing by it, roundoff error will be greatly magnified.

We solve both problems in the same way: by pivoting.

Consider the following example from the book:
/ 0.003000 59.14 \ / x1 \ / 59.17 \
\      5.291 -6.130 / \ x2 / = \ 46.78 /


Solving this exactly, it's clear that x1 = 10 and x2 = 1 is the only solution.

Solving it numerically using the algorithm above and four-digit rounding, we get
  1. m = 1764
  2. a[2,2] = -6.130 - (1764)*(59.14) = -6.130 - 104300 = -104300
  3. b[2] = 46.78 - (1764)*(59.17) = 46.78 - 104400 = -104400
  4. x[2] = (-104400)/(-104300) = 1.001
  5. x[1] = (59.17 - 59.14*1.001)/(0.003000) = (59.17 - 59.20)/(0.003000) = -0.03000/0.003000 = -10.00
x[2] is pretty close to the correct value (1.001 vs. 1.000), while x[1] is way the hell off (-10.00 vs. 10.00).

Observe that the main problem is dividing by 0.003000 (twice).
This doesn't cause much trouble with m (it's 1764 vs. 1763.667), but it causes a lot of trouble in the very last step, because it greatly magnifies all small roundoff errors.

Lesson: We want to rearrange the rows at every k-step so that we're dividing by the largest possible diagonal element.

So we will add the following steps into the basic algorithm.
  1. At every k, check the values a[k,k], a[k+1,k], ..., a[n,k] and see which one is the largest in absolute value. Let's say a[p,k] is the largest.
  2. Then we'll switch the entire row a[k,k], a[k,k+1], ..., a[k,n] | b[k] with the row a[p,k], a[p,k+1], ..., a[p,n] | b[p].
Algorithmically, this will look like the following (this whole thing gets stuck in immediately after for k from 2 to n do:

% R is the maximum size of a component in the current column
R = abs(a[k,k])
% p is the row where this maximum occurs
p = k
for m from k+1 to n do
   S = abs(a[m,k])
   if (S > R) then
      p = m
      R = S
   end if
end do
% Now we've found the maximum which occurs in row p, and we have to switch the rows (unless k=p).
if (k≠p) then
   for j from k to n do
      oldster = a[k,j]
      a[k,j] = a[p,j]
      a[p,j] = oldster
   end do
   oldster = b[k]
   b[k] = b[p]
   b[p] = oldster
end if


This looks like a lot of operations, but it's actually not, relatively speaking.
Notice that in the maximum-finder, we have something which is at most about 4n operations.
In the row-switcher, we have something which is at most about 3n operations.
We do each of these for every k, so the maximum-finder runs at most about 4n2 operations, while the row-switcher runs at most 3n2 operations.
So we have only added something of order O(n2) to the algorithm, which for large n, has hardly any effect (since the whole thing is still O(n3).


The final bit is to check on the rare situation where the maximum size R of some column is zero (which means there is no solution).
Keep in mind that this may occur after we have already done some computations, which means we may not actually see "zero."
Instead we might see R = 0.0001 or something, which means there was probably no solution, but we can't quite tell because of roundoff error.
So we should specify some tolerance (which will depend on the number of significant digits we're given in the data, the number of significant digits we're using to compute everything, and the number of significant digits we want for the solution at the end).

Hence the true algorithm, which is called "Gaussian elimination with partial pivoting," is


% Given all a[i,j], all b[i], and a tolerance tol.
% First make the matrix upper-triangular.
for k from 2 to n do
   % First we do the partial pivoting.
   % R is the maximum size of a component in the current column
   R = abs(a[k,k])
   % p is the row where this maximum occurs
   p = k
   for m from k+1 to n do
      S = abs(a[m,k])
      if (S > R) then
         p = m
         R = S
      end if
   end do

   % Check that R is not too small; if it is, the matrix is probably singular.
   if (R < tol) then
      print "You're crazy. I'm not solving this. This is crap."
      print "Singularity found in column ", k
      BREAK OUT OF ALL LOOPS
   end if

   % Now we've found the maximum which occurs in row p, and we have to switch the rows (unless k=p).
   if (k≠p) then
      for j from k to n do
         oldster = a[k,j]
         a[k,j] = a[p,j]
         a[p,j] = oldster
      end do
      oldster = b[k]
      b[k] = b[p]
      b[p] = oldster
   end if

   % Partial pivoting is done. Now we actually reduce the matrix.
   for i from k to n do
      m = a[i,k-1] / a[k-1,k-1]
      for j from k to n do
         a[i,j] = a[i,j] - m*a[k-1,j]
      end do
      b[i] = b[i] - m*b[k-1]
   end do
end do

% Observe that we never checked the final value of a[n,n] to see if it's too small. Let's do it now.
if (abs(a[n,n]) < tol) then
   print "You're crazy. I'm not solving this. This is crap."
   print "Singularity found in last column."
   BREAK OUT OF ALL LOOPS
end if

% Now solve for the x[i].
x[n] = b[n] / a[n,n]
for i from 1 to n-1 do
   x[n-i] = b[n-i]
   for j from n-i+1 to n do
      x[n-i] = x[n-i] - a[n-i,j]*x[j]
   end do
   x[n-i] = x[n-i] / a[n-i,n-i]
end do




Scaled partial pivoting
What if someone is sneaky and gives your program the following system (also from the book)?
/ 30.00 591400 \ / x1 \ / 591700 \
\ 5.291 -6.130 / \ x2 / = \ 46.78 /
This is exactly the same system as before, but the first row has been multiplied by 10,000.
When you solve it, you'll have exactly the same problem.
However, the partial pivoting algorithm won't catch this.

If you're worried about this, you can use scaled partial pivoting.

The algorithm is the same, but it checks the relative sizes of elements in a row (in other words, it rescales all the equations so the largest coefficient is one, then applies partial pivoting).

% R is the maximum SCALED size of a component in the current column
% First find the maximum absolute value of elements in each row (call the max of the i row scale[i])
for i from k to n do
   scale[i] = abs(a[i,k])
   for j from k+1 to n do
      Q = abs(a[i,j])
      if (Q > scale[i]) then
         scale[i] = Q
      end if
   end do
end do
R = abs(a[k,k])/scale[k]
% p is the row where the scaled maximum occurs
p = k
for m from k+1 to n do
   S = abs(a[m,k])/scale[m]
   if (S > R) then
      p = m
      R = S
   end if
end do


Notice that this requires O(n3) steps. So you're actually making the Gaussian elimination algorithm even worse (contributing an extra n3 factor to it).
That's not good at all.


You can instead just determine all the scale factors at the beginning and never worry about how they change (then it's just an O(n2) operation).
This is what people usually do. Scale factors usually won't change by a huge amount even after some eliminations, so this does well enough.

So if you're worried about scaling but you don't want to sacrifice efficiency, you can use the following. Put it before you start the k-loop, so you'll only be adding O(n2) computations.
% R is the maximum SCALED size of a component in the current column
% First find the maximum absolute value of elements in each row (call the max of the i row scale[i])
for i from 1 to n do
   scale[i] = abs(a[i,1])
   for j from 2 to n do
      Q = abs(a[i,j])
      if (Q > scale[i]) then
         scale[i] = Q
      end if
   end do
end do


Now that you have these scale factors, you can put this next bit after you start the k-loop.
R = abs(a[k,k])/scale[k]
% p is the row where the scaled maximum occurs
p = k
for m from k+1 to n do
   S = abs(a[m,k])/scale[m]
   if (S > R) then
      p = m
      R = S
   end if
end do




But if you want to get really hardcore about your scaling...

You can do complete pivoting.

This means that you find the largest absolute value of all the entries in the k-to-n by k-to-n submatrix, and switch both the rows and the columns.
This ensures that you're always dividing by the largest possible diagonal element.
But it's really slow (again, it adds an extra factor proportional to n3 to the computation time).

You can almost always do well with partial pivoting, unless you have some reason to believe your matrix is really bad.