Iterative methods

AMath 585, Winter Quarter 2020 at the University of Washington. Developed by R.J. LeVeque and distributed under the BSD license. You are free to modify and use as you please, with attribution.

These notebooks are all available on Github.


This notebook illustrates some classical iterative methods based on matrix splitting: Jacobi, Gauss-Seidel, and SOR.

In [1]:
%matplotlib inline
In [2]:
from pylab import *

To better illustrate the sequence of approximate solutions obtained as we iterate, we will use a widget that lets us easily step through a set of plots. Other available widgets are described and illustrated in the ipywidgets documentation.

The function animate_figs defined below invokes an interactive widget to loop over a set of matplotlib figures that have been generated in some loop and stored as a list called figs:

In [3]:
from ipywidgets import interact
import ipywidgets as widgets

def animate_figs(figs):
    show_frame = lambda frameno: display(figs[frameno])
    interact(show_frame, frameno=widgets.IntSlider(min=0,max=len(figs)-1, value=0))

Define a simple BVP in 1D

For illustration we will solve $u''(x) = f(x)$ in $0\leq x \leq 1$ with Dirichlet boundary conditions.

We choose a cubic function $u(x)$ as the solution so that the truncation error is zero, and hence the exact solution of the linear system should be the exact solution of the ODE at the grid points.

In [4]:
f_fcn = lambda x: 6*x + 2
utrue_fcn = lambda x: x**3 + x**2 - x + 1
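
As a quick sanity check (not part of the original notebook), we can confirm numerically that the centered second difference of this cubic reproduces $f(x)$ exactly, up to rounding, so the truncation error really is zero:

# sanity check: for a cubic u, the centered second difference
#   (u(x-h) - 2u(x) + u(x+h)) / h**2
# equals u''(x) = f(x) exactly (zero truncation error), up to rounding.
xc = linspace(0, 1, 21)
hc = xc[1] - xc[0]
D2u = (utrue_fcn(xc - hc) - 2*utrue_fcn(xc) + utrue_fcn(xc + hc)) / hc**2
print('max |D2u - f| = %.2e' % abs(D2u - f_fcn(xc)).max())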

This solve_bvp function sets up and solves the linear system by an iterative method that is defined by the argument update_u, which should be a function that takes a single step of some iterative method.

This allows us to easily test different methods by passing in a different update_u function, without repeating the rest of the setup.

Note that this is set up to solve the system in which the tridiagonal matrix has been augmented by a row of the identity matrix at the top and bottom, with right-hand side values given by the Dirichlet boundary conditions, i.e., the first and last equations in the system are U[0] = alpha and U[m+1] = beta. This is convenient for plotting purposes and also if you want to try modifying the problem to use Neumann BCs at one of the boundaries, for example.
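
For reference, here is a dense sketch of that augmented $(m+2)\times(m+2)$ system (not part of the original notebook; the iterative methods below never form the matrix explicitly). The helper name build_augmented_system is ours, introduced only for illustration:

from numpy.linalg import solve

def build_augmented_system(f_fcn, utrue_fcn, m):
    # Augmented system A*U = rhs: identity rows at the boundaries,
    # centered second differences in the interior rows.
    h = 1./(m+1)
    x = linspace(0, 1, m+2)
    A = zeros((m+2, m+2))
    A[0,0] = 1.                    # first equation:  U[0] = alpha
    A[m+1,m+1] = 1.                # last equation:   U[m+1] = beta
    for i in range(1, m+1):
        A[i,i-1] = 1./h**2
        A[i,i]   = -2./h**2
        A[i,i+1] = 1./h**2
    rhs = f_fcn(x)
    rhs[0] = utrue_fcn(0.)         # alpha
    rhs[m+1] = utrue_fcn(1.)       # beta
    return A, rhs, x

# a direct solve of this small system agrees with the true ODE solution here,
# since the truncation error is zero for the cubic:
A, rhs, x = build_augmented_system(f_fcn, utrue_fcn, m=19)
U_direct = solve(A, rhs)
print('max |U_direct - utrue| = %.2e' % abs(U_direct - utrue_fcn(x)).max())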

In [5]:
def solve_bvp(f_fcn, utrue_fcn, m, update_u, maxiter, kplot):

    alpha = utrue_fcn(0)
    beta = utrue_fcn(1)

    h = 1./(m+1)
    x = linspace(0,1,m+2)

    utrue = utrue_fcn(x)
    F = f_fcn(x)

    U0 = linspace(alpha, beta, m+2) # initial guess
    U = U0.copy() # current iterate

    max_error = abs(U-utrue).max()
    errors = [max_error]

    figs = []  # for the list of figures we generate
    
    for k in range(1,maxiter+1):
                
        U = update_u(U, F, h)  # take one iteration
        
        max_error = abs(U-utrue).max()
        errors.append(max_error)

        if mod(k,kplot)==0 or k==maxiter:
            # every kplot iterations create a plot:
            fig = figure(figsize=(12,5))
            plot(x,U0,'r-o', label='initial guess')
            plot(x,utrue,'k-o', label='true solution')
            plot(x,U,'bo-', label= 'iteration k = %i' % k)
            legend()
            grid(True)
            xlim(0,1)
            ylim(0, 3)
            title('After %i iterations, max error = %.2e' \
                  % (k, max_error))
            figs.append(fig)
            close(fig)
            
    errors = array(errors)  # convert list to numpy array
    return errors, figs

Jacobi iteration

This function does one update step for Jacobi. Note that because we have defined the system to include the boundary values, Jacobi does not change the first or last element of U.

In [6]:
def update_u_Jacobi(U, F, h):
    """
    Input: Current iterate U^[k-1]
    Output: Next iterate U^[k]
    """
    m = len(U) - 2
    Uprev = U.copy()  # save current iterate so it's not overwritten
    
    for i in range(1,m+1):
        U[i] = 0.5*(Uprev[i-1] + Uprev[i+1] - h**2 * F[i])
        
    return U
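
Since the Jacobi update uses only values from the previous iterate, the loop over i can also be replaced by a single slice operation. Here is an equivalent vectorized sketch (not part of the original notebook); passing it as update_u should give the same iterates up to rounding:

def update_u_Jacobi_vectorized(U, F, h):
    """
    Vectorized Jacobi sweep: every interior value is computed from the
    previous iterate, so one slice expression updates them all at once.
    """
    Unew = U.copy()
    Unew[1:-1] = 0.5*(U[:-2] + U[2:] - h**2 * F[1:-1])
    return Unew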
In [7]:
m = 19
h = 1./(m+1)

Test Jacobi

If you adjust this to plot every iteration for, say, 20 iterations, you will see that convergence is very slow. Here we plot every 20 iterations out of 400:

In [8]:
errors_Jacobi, figs = solve_bvp(f_fcn, utrue_fcn, m=m, update_u=update_u_Jacobi, 
                                maxiter=400, kplot=20)
In [9]:
animate_figs(figs)

Gauss-Seidel iteration

For Gauss-Seidel we update U[i] using the already-updated U[i-1].

In [10]:
def update_u_GS(U, F, h):
    """
    Input: Current iterate U^[k-1]
    Output: Next iterate U^[k]
    """
    m = len(U) - 2
    for i in range(1,m+1):
        U[i] = 0.5*(U[i-1] + U[i+1] - h**2 * F[i])
    return U
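
These component-wise sweeps are exactly the matrix-splitting iterations mentioned at the top: writing $A = M - N$, each step solves $MU^{[k]} = NU^{[k-1]} + b$, with $M$ the diagonal of $A$ for Jacobi and the lower triangle of $A$ for Gauss-Seidel. Here is a rough dense sketch of that view (not part of the original notebook; A and rhs are assumed to be the augmented matrix and right-hand side, e.g. from the build_augmented_system sketch above, and in practice one never forms $M$ explicitly; the sweep itself performs the triangular solve):

import numpy as np
from numpy.linalg import solve

def splitting_iteration(A, rhs, U, method='GS'):
    # one iteration of the splitting A = M - N:  solve M*Unew = N*U + rhs
    if method == 'Jacobi':
        M = np.diag(np.diag(A))    # diagonal part of A
    else:
        M = np.tril(A)             # lower triangle of A, including the diagonal
    N = M - A
    return solve(M, N.dot(U) + rhs)

Applied repeatedly with method='GS', this should reproduce the same iterates as the update_u_GS sweep above, since the sweep is just forward substitution with $M$.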
In [11]:
errors_GS, figs = solve_bvp(f_fcn, utrue_fcn, m=m, update_u=update_u_GS, 
                            maxiter=400, kplot=20)
In [12]:
animate_figs(figs)

We observe that Gauss-Seidel converges faster, as expected. If we plot the error vs. k, we can see that both are decaying exponentially, though slowly.

In [13]:
plot(errors_Jacobi, 'r', label='Jacobi')
plot(errors_GS, 'b', label='Gauss-Seidel')
legend();

Since we expect the error to decay like $E(k) \approx C \rho^k$ for some convergence rate $\rho$, the rates are clearer if we make a semilogy plot. Since $\log(E(k)) \approx \log(C) + k\log(\rho)$, the error should appear as a straight line in such a plot, with slope $\log(\rho) < 0$ since $\rho < 1$.

In [14]:
semilogy(errors_Jacobi, 'r', label='Jacobi')
semilogy(errors_GS, 'b', label='Gauss-Seidel')
legend();

Estimating the convergence rate

Since we expect $\log(E(k)) \approx \log(C) + k\log(\rho)$ and the convergence looks so linear, we could fit a straight line through a couple of points on these plots and easily estimate $\log(\rho)$.
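
For example, a quick two-point estimate (a sketch, not part of the original notebook) just uses the ratio of errors at two iterations in the linear regime:

# two-point estimate:  rho ~ (E(k2)/E(k1))**(1/(k2-k1))
k1, k2 = 100, 200
rho_2pt = (errors_Jacobi[k2] / errors_Jacobi[k1])**(1./(k2 - k1))
print('Two-point estimate of rho for Jacobi: %.8f' % rho_2pt)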

A more general approach, if the convergence is a bit ragged, would be to do a least squares fit of a linear function to more points (say $n$) from this convergence plot, setting up a linear system using $\log(E_i) = c_1 + c_2 k_i$ for $i=1,~2,~\ldots,~n$,

$$ \left[\begin{array}{cc} 1 & k_1\\ 1 & k_2\\ \vdots & \vdots \\ 1 & k_n \end{array}\right] \left[\begin{array}{c} c_1\\ c_2 \end{array}\right] = \left[\begin{array}{c} \log(E_1)\\ \log(E_2)\\ \vdots\\ \log(E_n) \end{array}\right] $$

and then solving this in the least squares sense for $[c_1,~c_2]$. Then $\rho \approx \exp(c_2)$.

Here we explicitly set up and solve this least squares problem using numpy.linalg.lstsq.

In [15]:
def convergence_rate(k_vals, errors):
    n = len(k_vals)
    assert len(errors) == n, 'k_vals and errors must have the same length'
    print('Estimating rate based on %i values' % n)
    B = vstack((ones(n),k_vals)).T
    logE = log(errors)
    
    # solve least square problem:
    c, residuals, rank, s = lstsq(B,logE,rcond=None)
    
    logC = c[0]
    logrho = c[1]
    C = exp(logC)
    rho = exp(logrho)
    print('Convergence approximately like  E(k) = %.3f * rho**k   with rho = %.8f' % (C,rho))
    return rho  # return the estimate so it can be stored, e.g. as rho_Jacobi below
In [16]:
k_vals = arange(50, 200, dtype=int)
rho_Jacobi = convergence_rate(k_vals, errors_Jacobi[k_vals])

# compare with the value expected from (4.16) in the text:
rho_theory = 1 - 0.5*pi**2*h**2
print('Predicted value of rho = %.8f' % rho_theory)
Estimating rate based on 150 values
Convergence approximately like  E(k) = 0.645 * rho**k   with rho = 0.98768885
Predicted value of rho = 0.98766299
In [17]:
k_vals = arange(50, 200, dtype=int)
rho_GS = convergence_rate(k_vals, errors_GS[k_vals])

# compare with the value expected from (4.21) in the text:
rho_theory = 1 - pi**2*h**2
print('Predicted value of rho = %.8f' % rho_theory)
Estimating rate based on 150 values
Convergence approximately like  E(k) = 0.659 * rho**k   with rho = 0.97552826
Predicted value of rho = 0.97532599

SOR method

The successive overrelaxation (SOR) method computes the Gauss-Seidel update at each point but then moves farther in that direction. The optimal $\omega$ for this one-dimensional Poisson problem is given in Section 4.2.2.

In [18]:
omega_opt = 2 / (1 + sin(pi*h))
print('Optimal omega = %.6f' % omega_opt)
Optimal omega = 1.729454

Here's the update method for SOR. You might want to try changing omega to different values to see how this affects convergence. It should converge as long as $0 < \omega < 2$ but even small changes in $\omega$ will give much poorer results than the optimal. (And recall that $\omega = 1$ corresponds to Gauss-Seidel, so setting $\omega < 1$ would be under-relaxed and converge even slower than G-S.)

In [19]:
omega = omega_opt
#omega = 1.9

def update_u_SOR(U, F, h):
    """
    Input: Current iterate U^[k-1]
    Output: Next iterate U^[k]
    """
    m = len(U) - 2
    for i in range(1,m+1):
        U[i] = 0.5*(U[i-1] + U[i+1] - h**2 * F[i])*omega \
               + (1-omega)*U[i]
    return U
In [20]:
errors_SOR, figs = solve_bvp(f_fcn, utrue_fcn, m=m, update_u=update_u_SOR, 
                             maxiter=400, kplot=20)

animate_figs(figs)
In [21]:
semilogy(errors_Jacobi, 'r', label='Jacobi')
semilogy(errors_GS, 'b', label='Gauss-Seidel')
semilogy(errors_SOR, 'g', label='SOR')
legend();
In [22]:
k_vals = arange(50, 100, dtype=int)
rho_SOR = convergence_rate(k_vals, errors_SOR[k_vals])

rho_theory = omega - 1  # only correct if omega_opt <= omega < 2
print('Predicted value of rho = %.8f' % rho_theory)
Estimating rate based on 50 values
Convergence approximately like  E(k) = 11.667 * rho**k   with rho = 0.73977834
Predicted value of rho = 0.72945382
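
As suggested above, it is worth rerunning SOR with other values of $\omega$ to see how sensitive the rate is. Here is a rough sketch (not part of the original notebook); the helper make_SOR_update and the crude ratio-based rate estimate are ours, for illustration only:

def make_SOR_update(om):
    # return an SOR update function with relaxation parameter om baked in
    def update(U, F, h):
        m = len(U) - 2
        for i in range(1, m+1):
            U[i] = om*0.5*(U[i-1] + U[i+1] - h**2 * F[i]) + (1-om)*U[i]
        return U
    return update

for om in [1.5, omega_opt, 1.9]:
    errs, _ = solve_bvp(f_fcn, utrue_fcn, m=m, update_u=make_SOR_update(om),
                        maxiter=100, kplot=100)
    # crude rate estimate from ratios of successive errors late in the run:
    rho_est = (errs[60:100] / errs[59:99]).mean()
    print('omega = %.6f:  estimated rho = %.6f' % (om, rho_est))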

Try a different $f$ function

Remember that the iterative method is converging (we hope) to the solution of the linear system of equations, and not, in general, to the true solution of the differential equation. In the example above we chose a problem for which the truncation error is zero, so the two are the same, but this is not true in general.

Here's a problem for which the numerical approximation to the ODE is not exact, and so if we compute the error as we did above by comparing to the true solution of the ODE it will not go to zero as we take more iterations. It will stop decreasing once the global error of the finite difference method is greater than the error in our approximation to the solution of the linear system:

In [23]:
f_fcn = lambda x: -12*sin(2*x)
utrue_fcn = lambda x: 3*sin(2*x)
In [24]:
errors_SOR, figs = solve_bvp(f_fcn, utrue_fcn, m=m, update_u=update_u_SOR, 
                             maxiter=100, kplot=5)

animate_figs(figs)
In [25]:
semilogy(errors_SOR, 'g', label='SOR');
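
We can check where this plateau should be by solving the same linear system directly (reusing the build_augmented_system sketch from earlier, so again this is only an illustrative check, not part of the original notebook): the maximum error of the exact solution of the linear system is the global error of the finite difference method on this grid, and the iterative error above levels off at approximately that value.

from numpy.linalg import solve
A, rhs, x = build_augmented_system(f_fcn, utrue_fcn, m=m)
U_direct = solve(A, rhs)
print('global error of the FD method on this grid: %.2e' % abs(U_direct - utrue_fcn(x)).max())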

In this case, to get a better approximation to the ODE solution we would of course have to use a finer grid. Note that for SOR the optimal omega then has to be adjusted as well...

In [26]:
m = 39
h = 1./(m+1)
omega_opt = 2 / (1 + sin(pi*h))
print('Optimal omega = %.6f' % omega_opt)
Optimal omega = 1.854498
In [27]:
omega = omega_opt

def update_u_SOR(U, F, h):
    """
    Input: Current iterate U^[k-1]
    Output: Next iterate U^[k]
    """
    m = len(U) - 2
    for i in range(1,m+1):
        U[i] = 0.5*(U[i-1] + U[i+1] - h**2 * F[i])*omega \
               + (1-omega)*U[i]
    return U
In [28]:
errors_SOR, figs = solve_bvp(f_fcn, utrue_fcn, m=39, update_u=update_u_SOR, 
                             maxiter=200, kplot=10)

animate_figs(figs)
In [29]:
semilogy(errors_SOR, 'g', label='SOR');