Solving linear equation systems

When dealing with non-linear networks the number of equation systems to be solved depends on the required precision of the solution and on the number of iterations needed until the solution becomes stable. This emphasizes how important the choice of the solving procedure is for different problems.

The solution of the equation system

$\displaystyle \left[A\right] \cdot \left[x\right] = \left[z\right]$ (15.58)

can formally be written as

$\displaystyle \left[x\right] = \left[A\right]^{-1} \cdot \left[z\right]$ (15.59)

Matrix inversion

The elements $ \beta_{\mu\nu}$ of the inverse of the matrix $ A$ are

$\displaystyle \beta_{\mu\nu} = \frac{A_{\mu\nu}}{det A}$ (15.60)

where $ A_{\mu\nu}$ is the cofactor of the matrix element $ a_{\mu\nu}$. The cofactor is the sub-determinant (i.e. the minor) of the element $ a_{\mu\nu}$ multiplied by $ (-1)^{\mu + \nu}$. The determinant of a square matrix can be computed recursively by either of the following equations.

$\displaystyle det A = \sum_{\mu = 1}^{n} a_{\mu\nu}\cdot A_{\mu\nu} \quad$ using the $ \nu$-th column (15.61)
$\displaystyle det A = \sum_{\nu = 1}^{n} a_{\mu\nu}\cdot A_{\mu\nu} \quad$ using the $ \mu$-th row (15.62)

This method is called the Laplace expansion. In order to save computing time the row or column with the most zeros is used for the expansion expressed in the above equations. The sub-determinant of $ (n-1)$-th order belonging to the element $ a_{\mu\nu}$ of a matrix of $ n$-th order is the determinant computed by deleting the $ \mu$-th row and the $ \nu$-th column. The following example demonstrates calculating the determinant of a 4th order matrix by expansion along the elements of the 3rd row.

$\displaystyle \begin{vmatrix}a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34}\\ a_{41} & a_{42} & a_{43} & a_{44}\\ \end{vmatrix}$ $\displaystyle = a_{31} \begin{vmatrix}a_{12} & a_{13} & a_{14}\\ a_{22} & a_{23} & a_{24}\\ a_{42} & a_{43} & a_{44}\\ \end{vmatrix} - a_{32} \begin{vmatrix}a_{11} & a_{13} & a_{14}\\ a_{21} & a_{23} & a_{24}\\ a_{41} & a_{43} & a_{44}\\ \end{vmatrix}$ (15.63)
  $\displaystyle + a_{33} \begin{vmatrix}a_{11} & a_{12} & a_{14}\\ a_{21} & a_{22} & a_{24}\\ a_{41} & a_{42} & a_{44}\\ \end{vmatrix} - a_{34} \begin{vmatrix}a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{41} & a_{42} & a_{43}\\ \end{vmatrix}$    

This recursive process for computing the inverse of a matrix is the easiest to implement but also the slowest algorithm. It requires approximately $ n!$ operations.
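As an illustration, the Laplace expansion can be sketched in pure Python. This is a deliberately naive example (function names are chosen for this sketch only) that expands along the first row rather than searching for the row or column with the most zeros:

```python
def minor(A, i, j):
    """Return matrix A with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by recursive Laplace expansion along the first row.

    The cofactor sign (-1)^(1+v) alternates along the expansion row.
    Runtime grows like O(n!), so this is only practical for tiny matrices.
    """
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))
```

For example, `det([[1, 2], [3, 4]])` evaluates the 2x2 determinant $1\cdot 4 - 2\cdot 3 = -2$.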

Gaussian elimination

The Gaussian algorithm for solving a linear equation system consists of two parts: forward elimination and backward substitution. During forward elimination the matrix A is transformed into an upper triangular equivalent matrix. Elementary transformations lead to an equation system having the same solutions for the unknowns as the original system.

$\displaystyle A = \begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{bmatrix} \;\;\;\; \rightarrow \;\;\;\; \begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1n}\\ 0 & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ 0 & \ldots & 0 & a_{nn} \end{bmatrix}$ (15.64)

The modifications applied to the matrix A in order to achieve this transformation are limited to the following set of elementary operations: multiplying a row by a non-zero constant, exchanging two rows, and adding a multiple of one row to another row.

Step 1: Forward elimination

The transformation of the matrix A is done in $ \mathrm{n - 1}$ elimination steps. The new matrix elements of the k-th step with $ \mathrm{k = 1, \ldots, n - 1}$ are computed with the following recursive formulas.

$\displaystyle a_{ij}$ $\displaystyle = 0$ $\displaystyle i = k+1, \ldots, n$ and $\displaystyle j = k$ (15.65)
$\displaystyle a_{ij}$ $\displaystyle = a_{ij} - a_{kj} \cdot a_{ik} / a_{kk}$ $\displaystyle i = k+1, \ldots, n$ and $\displaystyle j = k+1, \ldots, n$ (15.66)
$\displaystyle z_{i}$ $\displaystyle = z_{i} - z_{k} \cdot a_{ik} / a_{kk}$ $\displaystyle i = k+1, \ldots, n$   (15.67)

The triangulated matrix can be used to calculate the determinant very easily. The determinant of a triangulated matrix is the product of its diagonal elements. If the determinant $ det A$ is non-zero the equation system has a unique solution. Otherwise the matrix A is singular.

$\displaystyle det A = a_{11}\cdot a_{22}\cdot \ldots \cdot a_{nn} = \prod_{i=1}^{n} a_{ii}$ (15.68)

When using row and/or column pivoting the resulting determinant may differ in its sign and must be multiplied by $ (-1)^m$ where $ m$ is the number of row and column exchanges.

Finding an appropriate pivot element

The Gaussian elimination fails if the pivot element $ a_{kk}$ turns out to be zero (division by zero). That is why row and/or column pivoting must be used before each elimination step. If a diagonal element $ a_{kk} = 0$, the pivot row $ k$ is exchanged with a row $ m > k$ whose coefficient in column $ k$ has the largest absolute value. The new pivot row is $ m$ and the new pivot element is $ a_{mk}$. If no such pivot row can be found the matrix is singular.

Total pivoting looks for the element with the largest absolute value within the matrix and exchanges rows and columns. When exchanging columns in equation systems the unknowns get reordered as well. For the numerical solution of equation systems with Gaussian elimination column pivoting is a sensible choice, and total pivoting is recommended.

In order to improve numerical stability pivoting should also be applied if $ a_{kk} \ne 0$, because division by small diagonal elements propagates numerical (rounding) errors. This happens especially with poorly conditioned equation systems (in the two-dimensional case: two lines with nearly the same slope).

Step 2: Backward substitution

The solutions in the vector x are obtained by backward substituting into the triangulated matrix. The elements of the solution vector x are computed by the following recursive equations.

$\displaystyle x_{n}$ $\displaystyle = \frac{z_{n}}{a_{nn}}$ (15.69)
$\displaystyle x_{i}$ $\displaystyle = \frac{z_{i}}{a_{ii}} - \sum_{k=i+1}^{n} x_{k}\cdot \frac{a_{ik}}{a_{ii}}$ $\displaystyle i = n - 1,\ldots,1$ (15.70)

The forward elimination in the Gaussian algorithm requires approximately $ n^3/3$, the backward substitution $ n^2/2$ operations.
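The two steps can be sketched in Python as follows. This is an illustrative implementation (names chosen for this sketch only) combining forward elimination per eqs. (15.65)-(15.67) with row pivoting and backward substitution per eqs. (15.69)-(15.70):

```python
def gauss_solve(A, z):
    """Solve A x = z by Gaussian elimination with partial (row) pivoting.

    A (list of row lists) and z (list) are modified in place.
    """
    n = len(A)
    for k in range(n - 1):
        # pivoting: bring the largest |a_ik| (i >= k) onto the diagonal
        m = max(range(k, n), key=lambda i: abs(A[i][k]))
        if A[m][k] == 0:
            raise ValueError("matrix is singular")
        A[k], A[m] = A[m], A[k]
        z[k], z[m] = z[m], z[k]
        # forward elimination step k, eqs. (15.65)-(15.67)
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
            z[i] -= f * z[k]
    # backward substitution, eqs. (15.69)-(15.70)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (z[i] - s) / A[i][i]
    return x
```

After the elimination loop, `A` holds the triangulated matrix, so the determinant can be read off as the product of the diagonal elements (up to the sign changes caused by the row exchanges).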

Gauss-Jordan method

The Gauss-Jordan method is a modification of the Gaussian elimination. In each k-th elimination step the elements of the k-th column become zero except the diagonal element, which becomes 1. When the right hand side vector z is included in each step it contains the solution vector x afterwards.

The following recursive formulas must be applied to get the new matrix elements for the k-th elimination step. The k-th row must be computed first.

$\displaystyle a_{kj}$ $\displaystyle = a_{kj} / a_{kk}$ $\displaystyle j = 1\ldots n$ (15.71)
$\displaystyle z_{k}$ $\displaystyle = z_{k} / a_{kk}$   (15.72)

Then the other rows can be calculated with the following formulas.

$\displaystyle a_{ij}$ $\displaystyle = a_{ij} - a_{ik}\cdot a_{kj}$ $\displaystyle j = 1,\ldots,n \textrm{ and } i = 1,\ldots,n \textrm{ with } i \ne k$ (15.73)
$\displaystyle z_{i}$ $\displaystyle = z_{i} - a_{ik}\cdot z_{k}$ $\displaystyle i = 1,\ldots,n \textrm{ with } i \ne k$ (15.74)

Column pivoting may be necessary in order to avoid division by zero. The solution vector x is not harmed by row substitutions. When the Gauss-Jordan algorithm has been finished the original matrix has been transformed into the identity matrix. If each operation during this process is applied to an identity matrix the resulting matrix is the inverse matrix of the original matrix. This means that the Gauss-Jordan method can be used to compute the inverse of a matrix.

Though this elimination method is easy to implement the number of required operations is larger than for the Gaussian elimination. The Gauss-Jordan method requires approximately $ n^3/2 + n^2/2$ operations.
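The elimination scheme of eqs. (15.71)-(15.74) can be sketched as follows; again an illustrative implementation with row pivoting, not a production solver:

```python
def gauss_jordan(A, z):
    """Solve A x = z by Gauss-Jordan elimination with row pivoting.

    A is reduced to the identity matrix; z ends up holding the solution x.
    """
    n = len(A)
    for k in range(n):
        # pivot to avoid division by a (near) zero diagonal element
        m = max(range(k, n), key=lambda i: abs(A[i][k]))
        if A[m][k] == 0:
            raise ValueError("matrix is singular")
        A[k], A[m] = A[m], A[k]
        z[k], z[m] = z[m], z[k]
        # normalize the pivot row first, eqs. (15.71)-(15.72)
        p = A[k][k]
        A[k] = [a / p for a in A[k]]
        z[k] = z[k] / p
        # eliminate column k in every other row, eqs. (15.73)-(15.74)
        for i in range(n):
            if i != k:
                f = A[i][k]
                A[i] = [a - f * b for a, b in zip(A[i], A[k])]
                z[i] -= f * z[k]
    return z
```

Applying the same operations to an identity matrix instead of z would yield the inverse of A, as described above.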

LU decomposition

LU decomposition (decomposition into a lower and upper triangular matrix) is recommended when dealing with equation systems where the matrix A does not alter but the right hand side (the vector z) does. Both the Gaussian elimination and the Gauss-Jordan method involve both the right hand side and the matrix in their algorithm. Consecutive solutions of an equation system with an altering right hand side can be computed faster with LU decomposition.

The LU decomposition splits a matrix A into a product of a lower triangular matrix L with an upper triangular matrix U.

$\displaystyle A = L U \;$ with $\displaystyle \; L = \begin{bmatrix}l_{11} & 0 & \ldots & 0\\ l_{21} & l_{22} & & \vdots\\ \vdots & & \ddots & 0\\ l_{n1} & \ldots & \ldots & l_{nn} \end{bmatrix} \;$ and $\displaystyle \; U = \begin{bmatrix}u_{11} & u_{12} & \ldots & u_{1n}\\ 0 & u_{22} & & \vdots\\ \vdots & \ddots & \ddots & \vdots\\ 0 & \ldots & 0 & u_{nn} \end{bmatrix}$ (15.75)

The algorithm for solving the linear equation system $ Ax = z$ involves three steps: the decomposition of the matrix A into a lower triangular matrix L and an upper triangular matrix U, the solution of $ L y = z$ by forward substitution, and the solution of $ U x = y$ by backward substitution.

The decomposition of the matrix A into a lower and upper triangular matrix is not unique. The most important decompositions, based on Gaussian elimination, are the Doolittle, the Crout and the Cholesky decomposition.

If pivoting is necessary during these algorithms they do not decompose the matrix $ A$ itself but the product $ PA$ with a permutation matrix $ P$ (a row permutation of the matrix $ A$). When exchanging rows and columns the order of the entries of the vector $ z$ changes as well and must be saved during this process for the forward substitution in the algorithm's second step.

Step 1: LU decomposition

Using the decomposition according to Crout the coefficients of the L and U matrices can be stored in place of the original matrix A. The upper triangular matrix U has the form

$\displaystyle U = \begin{bmatrix}1 & u_{12} & \ldots & u_{1n}\\ 0 & 1 & & \vdots\\ \vdots & \ddots & \ddots & u_{n-1,n}\\ 0 & \ldots & 0 & 1 \end{bmatrix}$ (15.76)

The diagonal elements $ u_{jj}$ are ones and thus the determinant $ det U$ is one as well. The elements of the new coefficient matrix $ LU$ for the k-th elimination step with $ k = 1, \ldots,n$ are computed as follows:

$\displaystyle u_{jk}$ $\displaystyle = \frac{1}{l_{jj}}\left(a_{jk} - \sum_{r=1}^{j-1} l_{jr} u_{rk}\right)$ $\displaystyle j$ $\displaystyle = 1,\ldots,k-1$ (15.77)
$\displaystyle l_{jk}$ $\displaystyle = a_{jk} - \sum_{r=1}^{k-1} l_{jr} u_{rk}$ $\displaystyle j$ $\displaystyle = k,\ldots,n$ (15.78)

Pivoting may be necessary as you are going to divide by the diagonal element $ l_{jj}$.

Step 2: Forward substitution

The solutions in the arbitrary vector $ y$ are obtained by forward substituting into the triangulated $ L$ matrix. At this stage you need to remember the order of unknowns in the vector $ z$ as changed by pivoting. The elements of the solution vector $ y$ are computed by the following recursive equation.

$\displaystyle y_{i}$ $\displaystyle = \frac{z_{i}}{l_{ii}} - \sum_{k=1}^{i-1} y_{k}\cdot \frac{l_{ik}}{l_{ii}}$ $\displaystyle i = 1,\ldots,n$ (15.79)

Step 3: Backward substitution

The solutions in the vector $ x$ are obtained by backward substituting into the triangulated $ U$ matrix. The elements of the solution vector $ x$ are computed by the following recursive equation.

$\displaystyle x_{i}$ $\displaystyle = y_{i} - \sum_{k=i+1}^{n} x_{k}\cdot u_{ik}$ $\displaystyle i = n,\ldots,1$ (15.80)

The division by the diagonal elements of the matrix U is not necessary because of Crout's definition in eq. (15.76) with $ u_{ii} = 1$.

The LU decomposition requires approximately $ n^3/3 + n^2 - n/3$ operations for solving a linear equation system. For $ M$ consecutive solutions the method requires $ n^3/3 + Mn^2 - n/3$ operations.
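The three steps can be sketched in Python. This illustrative version (names invented for the sketch) follows eqs. (15.77)-(15.80) but omits pivoting for brevity, so it assumes non-zero diagonal elements $ l_{jj}$ throughout:

```python
def crout_lu(A):
    """In-place Crout decomposition, eqs. (15.77)-(15.78), no pivoting.

    Afterwards A holds L on and below the diagonal and U (whose unit
    diagonal is not stored) above it.
    """
    n = len(A)
    for k in range(n):
        # column k of L, eq. (15.78)
        for j in range(k, n):
            A[j][k] -= sum(A[j][r] * A[r][k] for r in range(k))
        # row k of U, eq. (15.77)
        for j in range(k + 1, n):
            A[k][j] = (A[k][j]
                       - sum(A[k][r] * A[r][j] for r in range(k))) / A[k][k]
    return A

def lu_solve(LU, z):
    """Forward substitution L y = z, eq. (15.79),
    then backward substitution U x = y, eq. (15.80)."""
    n = len(LU)
    y = [0.0] * n
    for i in range(n):
        y[i] = (z[i] - sum(LU[i][k] * y[k] for k in range(i))) / LU[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = y[i] - sum(LU[i][k] * x[k] for k in range(i + 1, n))
    return x
```

The point of the method shows in usage: `crout_lu` runs once, after which `lu_solve` can be called repeatedly for different right hand side vectors without redoing the expensive decomposition.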

QR decomposition

Singular matrices that nevertheless have a solution correspond to over- or under-determined equation systems. These types of matrices can be handled by three different types of decompositions: Householder, Jacobi (Givens rotation) and singular value decomposition. The Householder decomposition factors a matrix $ A$ into the product of an orthonormal matrix $ Q$ and an upper triangular matrix $ R$, such that:

$\displaystyle A = Q\cdot R$ (15.81)

The Householder decomposition is based on the fact that for any two different vectors $ v$ and $ w$ with $ \left\lVert v\right\rVert = \left\lVert w\right\rVert$, i.e. different vectors of equal length, a reflection matrix $ H$ exists such that:

$\displaystyle H \cdot v = w$ (15.82)

To obtain the matrix $ H$, the vector $ u$ is defined by:

$\displaystyle u = \dfrac{v - w}{\left\lVert v - w\right\rVert}$ (15.83)

The matrix $ H$ defined by

$\displaystyle H = I - 2 \cdot u \cdot u^T$ (15.84)

is then the required reflection matrix.
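A small numeric check of eqs. (15.82)-(15.84), using illustrative Python (real-valued case; the function name is invented for this sketch). The reflection matrix is formed explicitly only for demonstration purposes:

```python
import math

def householder_reflector(v, w):
    """Reflection H = I - 2 u u^T mapping v onto w.

    Requires ||v|| = ||w||, per eq. (15.83); H is built explicitly
    here purely for illustration.
    """
    n = len(v)
    d = [vi - wi for vi, wi in zip(v, w)]
    norm = math.sqrt(sum(di * di for di in d))
    u = [di / norm for di in d]           # eq. (15.83)
    # H = I - 2 u u^T, eq. (15.84)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j]
             for j in range(n)]
            for i in range(n)]
```

For instance $v = (3, 4)^T$ and $w = (5, 0)^T$ have equal length 5, so the resulting H maps v exactly onto w.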

The equation system

$\displaystyle A\cdot x = z \;\;\;\; \textrm{ is transformed into } \;\;\;\; Q R\cdot x = z$ (15.85)

With $ Q^T\cdot Q = I$ this yields

$\displaystyle Q^T Q R\cdot x = Q^T z \;\;\;\; \rightarrow \;\;\;\; R\cdot x = Q^T z$ (15.86)

Since $ R$ is triangular the equation system is solved by a simple matrix-vector multiplication on the right hand side and backward substitution.

Step 1: QR decomposition

Starting with $ A_1 = A$, let $ v_1$ be the first column of $ A_1$, and $ w_1^T = \left(\pm\lVert v_1\rVert , 0 , \ldots, 0\right)$, i.e. a column vector whose first component is the norm of $ v_1$ with the remaining components equal to 0. The Householder transformation $ H_1 = I - 2 \cdot u_1 \cdot u_1^T$ with $ u_1 = \left(v_1 - w_1\right) / \lVert v_1 - w_1 \rVert$ will turn the first column of $ A_1$ into $ w_1$, i.e. $ H_1 \cdot A_1 = A_2$ has $ w_1$ as its first column. At each stage $ k$, $ v_k$ is the kth column of $ A_k$ on and below the diagonal with all other components equal to 0, and the kth component of $ w_k$ equals the norm of $ v_k$ with all other components equal to 0. Letting $ H_k \cdot A_k = A_{k+1}$, the components of the kth column of $ A_{k+1}$ below the diagonal are each 0. These calculations are listed below for each stage for the matrix A.

$\displaystyle v_1 = \begin{bmatrix}a_{11}\\ a_{21}\\ \vdots\\ a_{n1} \end{bmatrix} \;\;\;\; w_1 = \begin{bmatrix}\pm\lVert v_1 \rVert\\ 0\\ \vdots\\ 0 \end{bmatrix} \;\;\;\; H_1\cdot A_1 = A_2 = \begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1n}\\ 0 & a_{22} & \ldots & a_{2n}\\ \vdots & \vdots & & \vdots\\ 0 & a_{n2} & \ldots & a_{nn} \end{bmatrix}$ (15.87)

With this first step the upper left diagonal element of the $ R$ matrix, $ a_{11} = \pm\lVert v_1 \rVert$, has been generated. The elements below are zeroed out. Since $ H_1$ can be generated from $ u_1$ stored in place of the first column of $ A_1$ the multiplication $ H_1 \cdot A_1$ can be performed without actually generating $ H_1$.

$\displaystyle v_2 = \begin{bmatrix}0\\ a_{22}\\ \vdots\\ a_{n2} \end{bmatrix} \;\;\;\; w_2 = \begin{bmatrix}0\\ \pm\lVert v_2 \rVert\\ \vdots\\ 0 \end{bmatrix} \;\;\;\; H_2\cdot A_2 = A_3 = \begin{bmatrix}a_{11} & a_{12} & \ldots & a_{1n}\\ 0 & a_{22} & \ldots & a_{2n}\\ \vdots & 0 & \ddots & \vdots\\ 0 & 0 & & a_{nn} \end{bmatrix}$ (15.88)

These elimination steps generate the $ R$ matrix because $ Q$ is orthonormal, i.e.

$\displaystyle A = Q\cdot R \;\;\; \rightarrow \;\;\; Q^T\cdot A = Q^T\cdot Q\cdot R = R \;\;\; \rightarrow \;\;\; H_n \cdot \ldots \cdot H_2 \cdot H_1 \cdot A = R$ (15.89)

After $ n$ elimination steps the original matrix $ A$ contains the upper triangular matrix $ R$, except for the diagonal elements which can be stored in some vector. The lower triangular matrix contains the Householder vectors $ u_1 \ldots u_n$.

$\displaystyle A = \begin{bmatrix}u_{11} & r_{12} & \ldots & r_{1n}\\ u_{21} & u_{22} & \ldots & r_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ u_{n1} & u_{n2} & \ldots & u_{nn} \end{bmatrix} \;\;\;\; R_{diag} = \begin{bmatrix}r_{11}\\ r_{22}\\ \vdots\\ r_{nn} \end{bmatrix}$ (15.90)

With $ Q^T = H_1 \cdot H_2 \cdot \ldots \cdot H_n$ this representation contains both the $ Q$ and $ R$ matrix, in a packed form, of course: $ Q$ as a composition of Householder vectors and $ R$ in the upper triangular part and its diagonal vector $ R_{diag}$.

Step 2: Forming the new right hand side

In order to form the right hand side $ Q^T z$ recall eq. (15.84) denoting the reflection matrices used to compute $ Q^T$.

$\displaystyle H_n\cdot \ldots \cdot H_2\cdot H_1 = Q^T$ (15.91)

Thus it is possible to replace the original right hand side vector $ z$ by

$\displaystyle H_n\cdot \ldots \cdot H_2\cdot H_1\cdot z = Q^T \cdot z$ (15.92)

which yields for each $ k = 1\ldots n$ the following expression:

$\displaystyle H_k \cdot z = \left(I - 2\cdot u_k \cdot u_k^T\right)\cdot z = z - 2\cdot u_k \cdot u_k^T\cdot z$ (15.93)

The latter $ u_k^T\cdot z$ is a simple scalar product of two vectors. Performing eq. (15.93) for each Householder vector finally results in the new right hand side vector $ Q^T z$.

Step 3: Backward substitution

The solutions in the vector $ x$ are obtained by backward substituting into the triangulated $ R$ matrix. The elements of the solution vector $ x$ are computed by the following recursive equation.

$\displaystyle x_{i}$ $\displaystyle = \dfrac{z_{i}}{r_{ii}} - \sum_{k=i+1}^{n} x_{k}\cdot \dfrac{r_{ik}}{r_{ii}}$ $\displaystyle i = n,\ldots,1$ (15.94)


Though the QR decomposition has an operation count of $ 2n^3 + 3n^2$ (which is about six times more than the LU decomposition) it has its advantages. The QR factorization method is unconditionally stable and more accurate. Also it can be used to obtain the minimum-norm (least squares) solution of under-determined equation systems.
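The three QR steps can be condensed into one illustrative Python sketch (names invented here; square, full-rank A assumed). Each reflector is applied to A and z without ever forming H explicitly, as discussed above:

```python
import math

def qr_solve(A, z):
    """Solve A x = z via Householder QR, then backward substitution.

    A is overwritten with R in its upper triangle; z is overwritten
    with Q^T z.  Illustrative sketch for a square, full-rank matrix.
    """
    n = len(A)
    for k in range(n - 1):
        # v_k: k-th column on and below the diagonal
        norm = math.sqrt(sum(A[i][k] ** 2 for i in range(k, n)))
        if norm == 0.0:
            continue
        # sign opposite to a_kk improves numerical stability
        alpha = -norm if A[k][k] > 0 else norm
        u = [0.0] * n
        for i in range(k, n):
            u[i] = A[i][k]
        u[k] -= alpha
        unorm = math.sqrt(sum(ui ** 2 for ui in u))
        u = [ui / unorm for ui in u]
        # apply H = I - 2 u u^T to A and z as rank-1 updates
        for j in range(k, n):
            s = sum(u[i] * A[i][j] for i in range(k, n))
            for i in range(k, n):
                A[i][j] -= 2.0 * u[i] * s
        s = sum(u[i] * z[i] for i in range(k, n))
        for i in range(k, n):
            z[i] -= 2.0 * u[i] * s
    # backward substitution R x = Q^T z, eq. (15.94)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (z[i] - s) / A[i][i]
    return x
```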

Figure 15.3: circuit with singular modified nodal analysis matrix

The circuit in fig. 15.3 has the following MNA representation:

$\displaystyle A x = \begin{bmatrix}\frac{1}{R_{2}} & 0 & 0\\ 0 & \frac{1}{R_{1}} & -\frac{1}{R_{1}}\\ 0 & -\frac{1}{R_{1}} & \frac{1}{R_{1}} \end{bmatrix} \cdot \begin{bmatrix}V_1\\ V_2\\ V_3 \end{bmatrix} = \begin{bmatrix}I_1\\ -I_1\\ I_2 \end{bmatrix} = \begin{bmatrix}0.1\\ -0.1\\ 0.1 \end{bmatrix} = z$ (15.95)

The second and third row of the matrix $ A$ are linearly dependent and the matrix is singular because its determinant is zero. Depending on the right hand side $ z$, the equation system has either no solution or infinitely many solutions. This is called an under-determined system. The discussed QR decomposition easily computes a valid solution without reducing accuracy. The LU decomposition would probably fail because of the singularity.

QR decomposition with column pivoting

Least norm problem

With some more effort it is possible to obtain the minimum-norm solution of this problem. The algorithm as described here would probably yield the following solution:

$\displaystyle x = \begin{bmatrix}V_1\\ V_2\\ V_3 \end{bmatrix} = \begin{bmatrix}1\\ 0\\ 1 \end{bmatrix}$ (15.96)

This is one out of infinitely many solutions. The following short description shows how it is possible to obtain the minimum-norm solution. When decomposing the transposed problem

$\displaystyle A^T = Q\cdot R$ (15.97)

the minimum-norm solution $ \hat{x}$ is obtained by forward substitution of

$\displaystyle R^T\cdot x = z$ (15.98)

and multiplying the result with $ Q$.

$\displaystyle \hat{x} = Q\cdot x$ (15.99)

In the example above this algorithm results in a solution vector with the least vector norm possible:

$\displaystyle \hat{x} = \begin{bmatrix}V_1\\ V_2\\ V_3 \end{bmatrix} = \begin{bmatrix}1\\ -0.5\\ 0.5 \end{bmatrix}$ (15.100)

This algorithm outline is sometimes also called LQ decomposition, because $ R^T$ is a lower triangular matrix used by the forward substitution.
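The two candidate solutions can be checked numerically. In this sketch the values $ R_1 = R_2 = 10\,\Omega$ and $ I_1 = I_2 = 0.1\,$A are assumptions chosen to be consistent with the numeric right hand side of eq. (15.95); the helper names are invented for the example:

```python
import math

# assumed component values consistent with z = (0.1, -0.1, 0.1)
A = [[0.1, 0.0, 0.0],
     [0.0, 0.1, -0.1],
     [0.0, -0.1, 0.1]]
z = [0.1, -0.1, 0.1]

x_qr = [1.0, 0.0, 1.0]    # solution delivered by the plain QR algorithm
x_ln = [1.0, -0.5, 0.5]   # minimum-norm solution via the LQ approach

def residual(A, x, z):
    """Residual A*x - z of a candidate solution."""
    return [sum(a * xi for a, xi in zip(row, x)) - zi
            for row, zi in zip(A, z)]

def norm(v):
    """Euclidean vector norm."""
    return math.sqrt(sum(vi * vi for vi in v))
```

Both vectors solve the singular system, but the LQ result has the smaller Euclidean norm ($\sqrt{1.5}$ versus $\sqrt{2}$), which is exactly the minimum-norm property.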

Singular value decomposition

Very badly conditioned (high ratio between the largest and smallest eigenvalue) matrices, i.e. nearly singular matrices, or even singular matrices (over- or under-determined equation systems) can be handled by the singular value decomposition (SVD). This type of decomposition is defined by

$\displaystyle A = U\cdot \Sigma\cdot V^H$ (15.101)

where the $ U$ matrix consists of the orthonormalized eigenvectors associated with the eigenvalues of $ A\cdot A^H$, $ V$ consists of the orthonormalized eigenvectors of $ A^H\cdot A$ and $ \Sigma$ is a matrix with the singular values of $ A$ (non-negative square roots of the eigenvalues of $ A^H\cdot A$) on its diagonal and zeros otherwise.

$\displaystyle \Sigma = \begin{bmatrix}\sigma_1 & 0 & \cdots & 0\\ 0 & \sigma_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \sigma_n \end{bmatrix}$ (15.102)

The singular value decomposition can be used to solve linear equation systems by simple substitutions

$\displaystyle A\cdot x$ $\displaystyle = z$ (15.103)
$\displaystyle U\cdot \Sigma\cdot V^H\cdot x$ $\displaystyle = z$ (15.104)
$\displaystyle \Sigma\cdot V^H\cdot x$ $\displaystyle = U^H\cdot z$ (15.105)


since

$\displaystyle U^H\cdot U = V^H\cdot V = V\cdot V^H = I$ (15.106)

To obtain the decomposition stated in eq. (15.101), Householder vectors are computed and their transformations are applied from the left-hand side and the right-hand side to obtain an upper bidiagonal matrix $ B$. This matrix has the same singular values as the original matrix $ A$ because all of the transformations introduced are orthogonal.

$\displaystyle U_B^{H\,(n)}\cdot\ldots\cdot U_B^{H\,(1)}\cdot A\cdot V_B^{(1)}\cdot\ldots\cdot V_B^{(n-2)} = U_B^H\cdot A\cdot V_B = B^{(0)}$ (15.107)

Specifically, $ U_B^{H\,(i)}$ annihilates the subdiagonal elements in column $ i$ and $ V_B^{(j)}$ zeros out the appropriate elements in row $ j$.

$\displaystyle B^{(0)} = \begin{bmatrix}\beta_1 & \delta_2 & 0 & \cdots & 0\\ 0 & \beta_2 & \delta_3 & & \vdots\\ \vdots & & \ddots & \ddots & 0\\ 0 & 0 & 0 & \beta_{n-1} & \delta_n\\ 0 & 0 & 0 & 0 & \beta_n\\ \end{bmatrix}$ (15.108)

Afterwards an iterative process (which turns out to be a QR iteration) is used to transform the bidiagonal matrix $ B$ into a diagonal form by applying successive Givens transformations (therefore orthogonal as well) to the bidiagonal matrix. This iteration is said to have cubic convergence and yields the final singular values of the matrix $ A$.

$\displaystyle B^{(0)} \rightarrow B^{(1)}\rightarrow \ldots \rightarrow \Sigma$ (15.109)

$\displaystyle B^{(k+1)} = \left(\tilde{U}^{(k)}\right)^H\cdot B^{(k)}\cdot \tilde{V}^{(k)}$ (15.110)

Each of the transformations applied to the bidiagonal matrix is also applied to the matrices $ U_B$ and $ V_B^H$ which finally yield the $ U$ and $ V^H$ matrices after convergence.

So much for the algorithm outline. Without going into full detail the following sections briefly describe each part of the singular value decomposition.


Beforehand, some necessary notations and tools are defined.

Householder reflector

A Householder matrix is an elementary unitary matrix that is Hermitian. Its fundamental use is its ability to transform a vector $ x$ into a multiple of $ \vec{e}_1$, the first column of the identity matrix. The elementary Hermitian (i.e. the Householder matrix) is defined as

$\displaystyle H = I - 2\cdot u\cdot u^H \;\;\;\; \textrm{ where } \;\;\;\; u^H\cdot u = 1$ (15.111)

Besides excellent numerical properties, their application is efficient. If $ A$ is a matrix, then

$\displaystyle H\cdot A$ $\displaystyle = A - 2\cdot u\cdot u^H\cdot A$ (15.112)
  $\displaystyle = A - 2\cdot u\cdot \left(A^H\cdot u\right)^H$    

and hence explicit formation and storage of $ H$ is not required. Also columns (or rows) can be transformed individually, exploiting the fact that $ u^H\cdot A$ yields a scalar product for a single column or row.

Specific case

In order to reduce a 4$ \times$4 matrix $ A$ to upper triangular form successive Householder reflectors must be applied.

$\displaystyle A = \begin{bmatrix}a_{11} & a_{12} & a_{13} & a_{14}\\ a_{21} & a_{22} & a_{23} & a_{24}\\ a_{31} & a_{32} & a_{33} & a_{34}\\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix}$ (15.113)

In the first step the diagonal element $ a_{11}$ gets replaced and the elements below it get annihilated by the multiplication with an appropriate Householder vector; the remaining right-hand columns get modified as well.

$\displaystyle u_1 = \begin{bmatrix}u_{11}\\ u_{21}\\ u_{31}\\ u_{41} \end{bmatrix} \;\;\;\; H_1^H\cdot A = \begin{bmatrix}\beta_1 & a_{12}^{(1)} & a_{13}^{(1)} & a_{14}^{(1)}\\ 0 & a_{22}^{(1)} & a_{23}^{(1)} & a_{24}^{(1)}\\ 0 & a_{32}^{(1)} & a_{33}^{(1)} & a_{34}^{(1)}\\ 0 & a_{42}^{(1)} & a_{43}^{(1)} & a_{44}^{(1)} \end{bmatrix}$ (15.114)

This process must be repeated

$\displaystyle u_2 = \begin{bmatrix}0\\ u_{22}\\ u_{32}\\ u_{42} \end{bmatrix} \;\;\;\; H_2^H\cdot H_1^H\cdot A = \begin{bmatrix}\beta_1 & a_{12}^{(2)} & a_{13}^{(2)} & a_{14}^{(2)}\\ 0 & \beta_2 & a_{23}^{(2)} & a_{24}^{(2)}\\ 0 & 0 & a_{33}^{(2)} & a_{34}^{(2)}\\ 0 & 0 & a_{43}^{(2)} & a_{44}^{(2)} \end{bmatrix}$ (15.115)

$\displaystyle u_3 = \begin{bmatrix}0\\ 0\\ u_{33}\\ u_{43} \end{bmatrix} \;\;\;\; H_3^H\cdot H_2^H\cdot H_1^H\cdot A = \begin{bmatrix}\beta_1 & a_{12}^{(3)} & a_{13}^{(3)} & a_{14}^{(3)}\\ 0 & \beta_2 & a_{23}^{(3)} & a_{24}^{(3)}\\ 0 & 0 & \beta_3 & a_{34}^{(3)}\\ 0 & 0 & 0 & a_{44}^{(3)} \end{bmatrix}$ (15.116)

$\displaystyle u_4 = \begin{bmatrix}0\\ 0\\ 0\\ u_{44} \end{bmatrix} \;\;\;\; H_4^H\cdot H_3^H\cdot H_2^H\cdot H_1^H\cdot A = \begin{bmatrix}\beta_1 & a_{12}^{(4)} & a_{13}^{(4)} & a_{14}^{(4)}\\ 0 & \beta_2 & a_{23}^{(4)} & a_{24}^{(4)}\\ 0 & 0 & \beta_3 & a_{34}^{(4)}\\ 0 & 0 & 0 & \beta_4 \end{bmatrix}$ (15.117)

until the matrix $ A$ contains an upper triangular matrix $ R$. The matrix $ Q$ can be expressed as the product of the Householder vectors. The performed operations deliver

$\displaystyle H^H_4\cdot H^H_3\cdot H^H_2\cdot H^H_1 \cdot A = Q^H\cdot A = R \;\;\;\; \rightarrow \;\;\;\; A = Q\cdot R$ (15.118)

since $ Q$ is unitary. The matrix $ Q$ itself can be expressed in terms of $ H_i$ using the following transformation.

$\displaystyle Q^H$ $\displaystyle = H^H_4\cdot H^H_3\cdot H^H_2\cdot H^H_1$ (15.119)
$\displaystyle \left(Q^H\right)^H$ $\displaystyle = \left(H^H_4\cdot H^H_3\cdot H^H_2\cdot H^H_1\right)^H$ (15.120)
$\displaystyle Q$ $\displaystyle = H_1\cdot H_2\cdot H_3\cdot H_4$ (15.121)

The eqs. (15.119)-(15.121) need to be mentioned only in case $ Q$ is not Hermitian, but still unitary. Otherwise there is no difference in computing $ Q$ or $ Q^H$ using the Householder vectors, and no care must be taken in choosing forward or backward accumulation.

General case

In the general case it is necessary to find an elementary unitary matrix

$\displaystyle H = I - \tau\cdot u\cdot u^H$ (15.122)

that satisfies the following three conditions.

$\displaystyle \left\vert\tau\right\vert^2\cdot u^H\cdot u = \tau + \tau^* = 2\cdot\textrm{Re}\left\{\tau\right\} \;\;\;\; , \;\;\;\; H^H\cdot x = \gamma\cdot \lVert x \rVert \cdot \vec{e}_1 \;\;\;\; , \;\;\;\; \lvert \gamma \rvert = 1$ (15.123)

When choosing the elements $ u_{ii} = 1$ it is possible to store the Householder vectors as well as the upper triangular matrix $ R$ in the same storage as the matrix $ A$. The Householder matrices $ H_i$ can be completely restored from the Householder vectors.

$\displaystyle A = \begin{bmatrix}\beta_1 & a_{12} & a_{13} & a_{14}\\ u_{21} & \beta_2 & a_{23} & a_{24}\\ u_{31} & u_{32} & \beta_3 & a_{34}\\ u_{41} & u_{42} & u_{43} & \beta_4 \end{bmatrix}$ (15.124)

There exist several approaches to meet the conditions expressed in eq. (15.123). To reduce the computational effort it may be convenient to choose $ \gamma$ to be real valued. With the notation

$\displaystyle H^H\cdot x = H^H\cdot \begin{bmatrix}\alpha\\ x_2\\ x_3\\ x_4\\ \end{bmatrix} = \begin{bmatrix}\beta\\ 0\\ 0\\ 0\\ \end{bmatrix}$ (15.125)

one possibility is to define the following calculation rules.

$\displaystyle \nu$ $\displaystyle = sign\left(Re\left\{\alpha\right\}\right)\cdot \lVert x \rVert$ (15.126)
$\displaystyle \tau$ $\displaystyle = \dfrac{\alpha + \nu}{\nu}$ (15.127)
$\displaystyle \gamma$ $\displaystyle = -1$ (15.128)
$\displaystyle \beta$ $\displaystyle = \gamma \cdot \lVert x \rVert = - \lVert x \rVert \;\;\;\; \rightarrow \;\;\;\; \textrm{real valued}$ (15.129)
$\displaystyle u$ $\displaystyle = \dfrac{x + \nu\cdot \vec{e}_1}{\alpha + \nu} \;\;\;\; \rightarrow \;\;\;\; u_{ii} = 1$ (15.130)

These definitions yield a complex $ \tau$, thus $ H$ is no longer Hermitian but still unitary.

$\displaystyle H = I - \tau\cdot u\cdot u^H \;\;\;\; \rightarrow \;\;\;\; H^H = I - \tau^*\cdot u\cdot u^H$ (15.131)

Givens rotation

A Givens rotation is a plane rotation matrix. Such a plane rotation matrix is an orthogonal matrix that is different from the identity matrix only in four elements.

$\displaystyle M = \left[ \begin{array}{ccccccc} 1 & 0 & & \cdots & & & 0\\ 0 & \ddots & & & & & \vdots\\ & & c & \cdots & s & & \\ \vdots & & \vdots & \ddots & \vdots & & \vdots\\ & & -s & \cdots & c & & \\ \vdots & & & & & \ddots & 0\\ 0 & \cdots & & & & 0 & 1 \end{array} \right]$ (15.132)

The elements are usually chosen so that

$\displaystyle R = \left[\begin{array}{rl} c & s\\ -s & c\\ \end{array}\right] \;\;\;\; \textrm{with} \;\;\;\; R^H\cdot R = I \;\;\;\; \rightarrow \;\;\;\; \left\vert c\right\vert^2 + \left\vert s\right\vert^2 = 1$ (15.133)

The most common use of such a plane rotation is to choose $ c$ and $ s$ such that for a given $ a$ and $ b$

$\displaystyle R = \left[\begin{array}{rl} c & s\\ -s & c\\ \end{array}\right] \cdot \begin{bmatrix}a\\ b\\ \end{bmatrix} = \begin{bmatrix}d\\ 0\\ \end{bmatrix}$ (15.134)

the multiplication annihilates the lower vector entry $ b$. In such an application the matrix $ R$ is often termed ``Givens rotation'' matrix. The following equations satisfy eq. (15.134) for a given $ a$ and $ b$, exploiting the conditions given in eq. (15.133).

$\displaystyle c = \dfrac{\pm a}{\sqrt{\left\vert a\right\vert^2 + \left\vert b\right\vert^2}} \;\;\;\; , \;\;\;\; s = \dfrac{\pm b}{\sqrt{\left\vert a\right\vert^2 + \left\vert b\right\vert^2}}$ (15.135)

$\displaystyle d = \sqrt{\left\vert a\right\vert^2 + \left\vert b\right\vert^2}$ (15.136)
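Eqs. (15.135)-(15.136) translate directly into a few lines of Python (real-valued case; the function name is invented for this sketch):

```python
import math

def givens(a, b):
    """Return (c, s, d) per eqs. (15.135)-(15.136), real case.

    The rotation [[c, s], [-s, c]] applied to (a, b) yields (d, 0).
    """
    d = math.hypot(a, b)          # sqrt(a^2 + b^2), overflow-safe
    if d == 0.0:
        return 1.0, 0.0, 0.0      # degenerate case: identity rotation
    return a / d, b / d, d
```

`math.hypot` is used instead of a naive square root because it avoids overflow of the intermediate squares for large entries.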

Eigenvalues of a 2-by-2 matrix

The eigenvalues of a 2-by-2 matrix

$\displaystyle A = \begin{bmatrix}a & b\\ c & d \end{bmatrix}$ (15.137)

can be obtained directly from the quadratic formula. The characteristic polynomial is

$\displaystyle det\left(A -\mu I\right) = det \begin{bmatrix}a - \mu & b\\ c & d - \mu \end{bmatrix} = \mu^2 - \left(a + d\right)\cdot\mu + \left(ad - bc\right)$ (15.138)

The polynomial yields the two eigenvalues.

$\displaystyle \mu_{1,2} = \dfrac{a + d}{2} \pm \sqrt{\left(\dfrac{a + d}{2}\right)^2 + bc - ad}$ (15.139)

For a symmetric matrix $ A$ (i.e. $ b = c$) eq. (15.139) can be rewritten as:

$\displaystyle \mu_{1,2} = e + d \pm \sqrt{e^2 + b^2} \;\;\;\; \textrm{ with } \;\;\;\; e = \dfrac{a - d}{2}$ (15.140)
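Eq. (15.140) is compact enough to verify numerically; a minimal Python sketch (the function name is invented for this example):

```python
import math

def eig2_sym(a, b, d):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, d]].

    Direct implementation of eq. (15.140) with e = (a - d) / 2;
    note that e + d equals the mean (a + d) / 2 of the diagonal.
    """
    e = (a - d) / 2.0
    r = math.sqrt(e * e + b * b)
    return e + d + r, e + d - r
```

For the diagonal matrix with entries 2 and 3 this reproduces the eigenvalues 3 and 2, and for [[0, 1], [1, 0]] it yields 1 and -1.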

Step 1: Bidiagonalization

In the first step the original matrix $ A$ is bidiagonalized by the application of Householder reflections from the left and right hand side. The matrices $ U_B^H$ and $ V_B$ can each be determined as a product of Householder matrices.

$\displaystyle \underbrace{U_B^{H\,(n)}\cdot\ldots\cdot U_B^{H\,(1)}}_{U_B^H}\cd...
...{V_B^{(1)}\cdot\ldots\cdot V_B^{(n-2)}}_{V_B} = U_B^H\cdot A\cdot V_B = B^{(0)}$ (15.141)

Each of the required Householder vectors is created and applied as previously defined. Supposing an $ n\times n$ matrix, applying the first Householder vector from the left hand side eliminates the first column and yields

$\displaystyle U_B^{H\,(1)} \cdot A$ $\displaystyle = \begin{bmatrix}\boldsymbol{\beta_1} & a_{12}^{(1)} & a_{13}^{(1)} & \cdots & a_{1n}^{(1)} \\ u_{21} & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} \\ \vdots & \vdots & \vdots & & \vdots \\ u_{n1} & a_{n2}^{(1)} & a_{n3}^{(1)} & \cdots & a_{nn}^{(1)} \\ \end{bmatrix}$ (15.142)

Next, a Householder vector is applied from the right hand side to annihilate the first row.

$\displaystyle U_B^{H\,(1)} \cdot A \cdot V_B^{(1)}$ $\displaystyle = \begin{bmatrix}\beta_1 & \boldsymbol{\delta_2} & v_{13} & \cdots & v_{1n} \\ u_{21} & a_{22}^{(2)} & a_{23}^{(2)} & \cdots & a_{2n}^{(2)} \\ \vdots & \vdots & \vdots & & \vdots \\ u_{n1} & a_{n2}^{(2)} & a_{n3}^{(2)} & \cdots & a_{nn}^{(2)} \\ \end{bmatrix}$ (15.143)

Again, a Householder vector is applied from the left hand side to annihilate the second column.

$\displaystyle U_B^{H\,(2)} \cdot U_B^{H\,(1)} \cdot A \cdot V_B^{(1)}$ $\displaystyle = \begin{bmatrix}\beta_1 & \delta_2 & v_{13} & \cdots & v_{1n} \\ u_{21} & \boldsymbol{\beta_2} & a_{23}^{(3)} & \cdots & a_{2n}^{(3)} \\ \vdots & \vdots & \vdots & & \vdots \\ u_{n1} & u_{n2} & a_{n3}^{(3)} & \cdots & a_{nn}^{(3)} \\ \end{bmatrix}$ (15.144)

This process is continued until

$\displaystyle U_B^H \cdot A \cdot V_B$ $\displaystyle = \begin{bmatrix}\beta_1 & \delta_2 & v_{13} & \cdots & v_{1n} \\ u_{21} & \beta_2 & \delta_3 & \ddots & \vdots \\ \vdots & u_{32} & \ddots & \ddots & v_{n-2,n} \\ \vdots & \vdots & \ddots & \beta_{n-1} & \delta_n \\ u_{n1} & u_{n2} & u_{n3} & & \boldsymbol{\beta_n} \\ \end{bmatrix}$ (15.145)

For each of the Householder transformations from the left and right hand side the appropriate $ \tau$ values must be stored in separate vectors.

Step 2: Matrix reconstructions

Using the Householder vectors stored in place of the original $ A$ matrix and the appropriate $ \tau$ value vectors it is now necessary to unpack the $ U_B$ and $ V_B^H$ matrices. The diagonal vector $ \beta$ and the super-diagonal vector $ \delta$ are saved in separate vectors beforehand. Thus the $ U_B$ matrix can be unpacked in place of the $ A$ matrix and the $ V_B^H$ matrix is unpacked into a separate matrix.

There are two possible algorithms for computing the Householder product matrices, i.e. forward accumulation and backward accumulation. Both start with the identity matrix which is successively multiplied by the Householder matrices either from the left or right.

$\displaystyle U_B^H$ $\displaystyle = H^H_{Un}\cdot \ldots \cdot H^H_{U2}\cdot H^H_{U1}\cdot I$ (15.146)
$\displaystyle \rightarrow\;\;\;\; U_B\;$ $\displaystyle = I\cdot H_{U1}\cdot H_{U2}\cdot \ldots \cdot H_{Un}$ (15.147)

Recall that the leading portion of each Householder matrix except the first is the identity. Thus, at the beginning of backward accumulation, $ U_B$ is ``mostly the identity'' and it gradually becomes full as the iteration progresses. This pattern can be exploited to reduce the number of required flops. In contrast, in forward accumulation $ U_B^H$ is full already after the first step. For this reason, backward accumulation is cheaper and the strategy of choice. When unpacking the $ U_B$ matrix in place of the original $ A$ matrix it is necessary to choose backward accumulation anyway.

$\displaystyle V_B\;$ $\displaystyle = I\cdot H^H_{V1}\cdot H^H_{V2}\cdot \ldots \cdot H^H_{Vn}$ (15.148)
$\displaystyle \rightarrow\;\;\;\; V_B^H$ $\displaystyle = I\cdot H_{Vn}\cdot \ldots \cdot H_{V2}\cdot H_{V1}$ (15.149)

Unpacking the $ V_B^H$ matrix is done in a similar way also performing successive Householder matrix multiplications using backward accumulation.
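As an illustration of backward accumulation, the following Python sketch rebuilds the product of the stored Householder reflectors starting from the identity. It is not the actual implementation; the function name is hypothetical and it assumes the real-valued case with a LAPACK-style packing where the $j$-th Householder vector has an implicit unit entry at position $j$.

```python
import numpy as np

def accumulate_householder(V, tau):
    """Accumulate Q = H_1 * H_2 * ... * H_k by backward accumulation.

    V[:, j] holds the j-th Householder vector v_j (zeros above entry j,
    v_j[j] = 1 by convention); tau[j] is the corresponding scaling factor,
    so that H_j = I - tau[j] * v_j * v_j^T.
    """
    n, k = V.shape
    Q = np.eye(n)
    # Backward accumulation: start with the identity and apply H_k, ..., H_1.
    # Early in the loop Q differs from the identity only in its trailing
    # block, which is what makes this ordering cheap.
    for j in range(k - 1, -1, -1):
        v = V[:, j]
        Q -= tau[j] * np.outer(v, v @ Q)
    return Q
```

With $ \tau_j = 2 / (v_j^T v_j)$ each reflector is orthogonal, so the accumulated $ Q$ is orthogonal as well.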

Step 3: Diagonalization - shifted QR iteration

At this stage the matrices $ U_B$ and $ V_B^H$ exist in unfactored form. Also there are the diagonal vector $ \beta$ and the super-diagonal vector $ \delta$. Both vectors are real valued. Thus the following algorithm can be applied even when solving a complex equation system.

$\displaystyle B^{(0)} = \begin{bmatrix}\beta_1 & \delta_2 & 0 & \cdots & 0\\ 0 & \beta_2 & \delta_3 & \ddots & \vdots\\ \vdots & \ddots & \ddots & \ddots & 0\\ 0 & 0 & 0 & \beta_{n-1} & \delta_n\\ 0 & 0 & 0 & 0 & \beta_n\\ \end{bmatrix}$ (15.150)

The remaining problem is thus to compute the SVD of the matrix $ B$. This is done by applying an implicit-shift QR step to the tridiagonal matrix $ T=B^T B$, which is symmetric. The matrix $ T$ is not explicitly formed; that is why a QR iteration with implicit shifts is applied.

After bidiagonalization we have a bidiagonal matrix $ B^{(0)}$:

$\displaystyle B^{(0)} = U_B^H \cdot A \cdot V_B$ (15.151)

The presented method turns $ B^{(k)}$ into a matrix $ B^{(k+1)}$ by applying a set of orthogonal transforms

$\displaystyle B^{(k+1)} = \tilde{U}^H\cdot B^{(k)} \cdot \tilde{V}$ (15.152)

The orthogonal matrices $ \tilde{U}$ and $ \tilde{V}$ are chosen so that $ B^{(k+1)}$ is again a bidiagonal matrix, but with super-diagonal elements smaller than those of $ B^{(k)}$. Eq. (15.152) is repeated until the off-diagonal elements of $ B^{(k+1)}$ become smaller than $ \varepsilon$ and can be disregarded.

The matrices $ \tilde{U}$ and $ \tilde{V}$ are constructed as

$\displaystyle \tilde{U} = \tilde{U}_1 \cdot \tilde{U}_2 \cdot \tilde{U}_3\cdot \dots \cdot \tilde{U}_{n-1}$ (15.153)

and similarly $ \tilde{V}$ where $ \tilde{V}_i$ and $ \tilde{U}_i$ are matrices of simple rotations as given in eq.(15.132). Both $ \tilde{V}$ and $ \tilde{U}$ are products of Givens rotations and thus perform orthogonal transforms.
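Such a Givens rotation is fully determined by its coefficients $ c$ and $ s$. The following Python sketch (an illustration, not the actual implementation) computes them under the convention used in the text, i.e. the transposed rotation annihilates the second component of a vector:

```python
import math

def givens(x, y):
    """Return (c, s) such that, with G = [[c, s], [-s, c]],
    G^T applied to the vector (x, y) yields (r, 0)."""
    r = math.hypot(x, y)
    if r == 0.0:
        return 1.0, 0.0
    return x / r, -y / r
```

Applying such a rotation from the left mixes exactly two rows of a matrix; applying it from the right mixes exactly two columns, which is the property used in the bulge chasing below.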

Single shifted QR step.

The left multiplication of $ B^{(k)}$ by $ \tilde{U}_i^H$ replaces two rows of $ B^{(k)}$ by their linear combinations. The rest of $ B^{(k)}$ is unaffected. Right multiplication of $ B^{(k)}$ by $ \tilde{V}_i$ similarly changes only two columns of $ B^{(k)}$.

The matrix $ \tilde{V}_1$ is chosen such that

$\displaystyle B^{(k)}_1 = B^{(k)}_0 \cdot \tilde{V}_1$ (15.154)

is a QR transform with a shift. Note that multiplying $ B^{(k)}$ by $ \tilde{V}_1$ gives rise to a non-zero element which is below the main diagonal.

$\displaystyle B^{(k)}_0 \cdot \tilde{V}_1 = \begin{bmatrix}\times & \times & 0 & 0 & 0 & 0\\ + & \times & \times & 0 & 0 & 0\\ 0 & 0 & \times & \times & 0 & 0\\ 0 & 0 & 0 & \times & \times & 0\\ 0 & 0 & 0 & 0 & \times & \times\\ 0 & 0 & 0 & 0 & 0 & \times\\ \end{bmatrix}$ (15.155)

A new rotation angle is then chosen so that multiplication by $ \tilde{U}_1^H$ gets rid of that element. But this will create a non-zero element which is right beside the super-diagonal.

$\displaystyle \tilde{U}_1^H \cdot B^{(k)}_1 = \begin{bmatrix}\times & \times & + & 0 & 0 & 0\\ 0 & \times & \times & 0 & 0 & 0\\ 0 & 0 & \times & \times & 0 & 0\\ 0 & 0 & 0 & \times & \times & 0\\ 0 & 0 & 0 & 0 & \times & \times\\ 0 & 0 & 0 & 0 & 0 & \times\\ \end{bmatrix}$ (15.156)

Then $ \tilde{V}_2$ is chosen to make it disappear, but this leads to another non-zero element below the diagonal, and so on.

$\displaystyle B^{(k)}_2 \cdot \tilde{V}_2 = \begin{bmatrix}\times & \times & 0 & 0 & 0 & 0\\ 0 & \times & \times & 0 & 0 & 0\\ 0 & + & \times & \times & 0 & 0\\ 0 & 0 & 0 & \times & \times & 0\\ 0 & 0 & 0 & 0 & \times & \times\\ 0 & 0 & 0 & 0 & 0 & \times\\ \end{bmatrix}$ (15.157)

In the end, the matrix $ \tilde{U}^H B \tilde{V}$ becomes bidiagonal again. However, because of the special choice of $ \tilde{V}_1$ (QR algorithm), its off-diagonal elements are smaller than those of $ B$.

Please note that each of the transforms must also be applied to the unfactored $ U_B^H$ and $ V_B$ matrices, which turns them successively into $ U^H$ and $ V$.

Computation of the Wilkinson shift.

For a single QR step the eigenvalue $ \mu$ of the trailing 2-by-2 submatrix of $ T_n = B_n^T\cdot B_n$ that is closer to the matrix element $ t_{22}$ must be computed.

$\displaystyle T_n = \begin{bmatrix}t_{11} & t_{12}\\ t_{21} & t_{22}\\ \end{bmatrix} = B_n^T\cdot B_n$ $\displaystyle = \begin{bmatrix}\delta_{n-1} & \beta_{n-1} & 0\\ 0 & \delta_{n} & \beta_{n}\\ \end{bmatrix} \cdot \begin{bmatrix}\delta_{n-1} & 0\\ \beta_{n-1} & \delta_{n}\\ 0 & \beta_{n}\\ \end{bmatrix}$ (15.158)
  $\displaystyle = \begin{bmatrix}\beta_{n-1}^2 + \delta_{n-1}^2 & \delta_{n}\cdot \beta_{n-1}\\ \delta_{n}\cdot \beta_{n-1} & \beta_{n}^2 + \delta_{n}^2\\ \end{bmatrix}$ (15.159)

The required eigenvalue is called the Wilkinson shift; see eq. (15.140) for details. The sign for the eigenvalue is chosen such that it is closer to $ t_{22}$.

$\displaystyle \mu$ $\displaystyle = t_{22} + d - sign(d)\cdot \sqrt{d^2 + t_{12}^2}$ (15.160)
  $\displaystyle = t_{22} + d - t_{12}\cdot sign\left(\dfrac{d}{t_{12}}\right)\cdot \sqrt{\left(\dfrac{d}{t_{12}}\right)^2 + 1}$ (15.161)
  $\displaystyle = t_{22} - \dfrac{t_{12}^2}{d + t_{12}\cdot sign\left(\dfrac{d}{t_{12}}\right)\cdot \sqrt{\left(\dfrac{d}{t_{12}}\right)^2 + 1}}$ (15.162)


$\displaystyle d = \dfrac{t_{11} - t_{22}}{2}$ (15.163)
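The shift computation of eqs. (15.159)-(15.163) can be written down directly. The following Python sketch (the function name is illustrative) uses the cancellation-free form of eq. (15.162):

```python
import math

def wilkinson_shift(beta_n1, beta_n, delta_n1, delta_n):
    """Wilkinson shift for the trailing 2x2 block of T = B^T B,
    following eqs. (15.159)-(15.163); returns the eigenvalue of
    [[t11, t12], [t12, t22]] that is closer to t22."""
    t11 = beta_n1 ** 2 + delta_n1 ** 2
    t22 = beta_n ** 2 + delta_n ** 2
    t12 = delta_n * beta_n1
    d = (t11 - t22) / 2.0
    if t12 == 0.0:
        return t22  # block is already diagonal
    # eq. (15.162): avoids the cancellation hidden in eq. (15.160)
    u = d / t12
    return t22 - t12 ** 2 / (d + t12 * math.copysign(1.0, u)
                             * math.sqrt(u * u + 1.0))
```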

The Givens rotation $ \tilde{V}_1$ is chosen such that

$\displaystyle \begin{bmatrix}c_1 & s_1\\ -s_1 & c_1\\ \end{bmatrix}^T \cdot \be...
...) / \beta_1 \\ \delta_2 \end{bmatrix} = \begin{bmatrix}\times\\ 0 \end{bmatrix}$ (15.164)

The special choice of this first rotation in the single QR step ensures that the super-diagonal matrix entries get smaller. Typically, after a few of these QR steps, the super-diagonal entry $ \delta_n$ becomes negligible.

Zeros on the diagonal or super-diagonal.

The QR iteration described above only holds if the underlying bidiagonal matrix is unreduced, i.e. has no zeros on either the diagonal or the super-diagonal.

When there is a zero on the diagonal, premultiplication by a sequence of Givens transformations can zero the super-diagonal entry to its right as well. The inverse rotations must be applied to the $ U_B^H$ matrix.

$\displaystyle B =$ $\displaystyle \begin{bmatrix}\times & \times & 0 & 0 & 0 & 0\\ 0 & \times & \times & 0 & 0 & 0\\ 0 & 0 & 0 & \times & 0 & 0\\ 0 & 0 & 0 & \times & \times & 0\\ 0 & 0 & 0 & 0 & \times & \times\\ 0 & 0 & 0 & 0 & 0 & \times\\ \end{bmatrix}$ $\displaystyle \rightarrow$ $\displaystyle \begin{bmatrix}\times & \times & 0 & 0 & 0 & 0\\ 0 & \times & \times & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & + & 0\\ 0 & 0 & 0 & \times & \times & 0\\ 0 & 0 & 0 & 0 & \times & \times\\ 0 & 0 & 0 & 0 & 0 & \times\\ \end{bmatrix}$    
$\displaystyle \rightarrow$ $\displaystyle \begin{bmatrix}\times & \times & 0 & 0 & 0 & 0\\ 0 & \times & \times & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & +\\ 0 & 0 & 0 & \times & \times & 0\\ 0 & 0 & 0 & 0 & \times & \times\\ 0 & 0 & 0 & 0 & 0 & \times\\ \end{bmatrix}$ $\displaystyle \rightarrow$ $\displaystyle \begin{bmatrix}\times & \times & 0 & 0 & 0 & 0\\ 0 & \times & \times & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & \times & \times & 0\\ 0 & 0 & 0 & 0 & \times & \times\\ 0 & 0 & 0 & 0 & 0 & \times\\ \end{bmatrix}$    

Thus the problem decouples into two smaller matrices $ B_1$ and $ B_2$. The diagonal matrix $ B_3$ successively grows for each super-diagonal entry that is neglected after the QR iterations.

$\displaystyle \begin{bmatrix}B_1 & 0 & 0\\ 0 & B_2 & 0\\ 0 & 0 & B_3\\ \end{bmatrix}$ (15.165)

Matrix $ B_2$ has non-zero super-diagonal entries. If there is any zero diagonal entry in $ B_2$, then the super-diagonal entry can be annihilated as just described. Otherwise the QR iteration algorithm can be applied to $ B_2$.

When only $ B_3$ matrix entries are left (diagonal entries only) the algorithm is finished; the $ B$ matrix has then been transformed into the singular value matrix $ \Sigma$.
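The bookkeeping of this decoupling can be sketched in Python as follows (a hypothetical helper, not the actual implementation; `beta` holds the diagonal, `delta` the super-diagonal entries, and a super-diagonal entry is treated as negligible relative to its neighbouring diagonal entries):

```python
def split_blocks(beta, delta, eps=1e-12):
    """Find index ranges of unreduced bidiagonal blocks.

    beta[i] are the diagonal entries, delta[i] is the super-diagonal
    entry coupling row i-1 to row i (delta[0] is unused).  Whenever a
    super-diagonal entry is negligible the problem decouples; each
    returned (start, stop) range can be iterated on independently.
    """
    n = len(beta)
    blocks, start = [], 0
    for i in range(1, n):
        if abs(delta[i]) <= eps * (abs(beta[i - 1]) + abs(beta[i])):
            blocks.append((start, i))
            start = i
    blocks.append((start, n))
    return blocks
```

A block of size one is a finished $ B_3$ entry; the QR iteration continues on the remaining blocks.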

Step 4: Solving the equation system

It is straight-forward to solve a given equation system once the singular value decomposition has been computed.

$\displaystyle A\cdot x$ $\displaystyle = z$ (15.166)
$\displaystyle U\Sigma V^H\cdot x$ $\displaystyle = z$ (15.167)
$\displaystyle \Sigma V^H\cdot x$ $\displaystyle = U^H\cdot z$ (15.168)
$\displaystyle V^H\cdot x$ $\displaystyle = \Sigma^{-1} U^H \cdot z$ (15.169)
$\displaystyle x$ $\displaystyle = V\Sigma^{-1} U^H \cdot z$ (15.170)

The inverse of the diagonal matrix $ \Sigma$ is

$\displaystyle \Sigma^{-1} = \begin{bmatrix}1/\sigma_1 & 0 & \cdots & 0\\ 0 & 1/\sigma_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1/\sigma_n \end{bmatrix}$ (15.171)

With $ v_i$ being the i-th column of the matrix $ V$, $ u_i$ the i-th column of the matrix $ U$ and $ \sigma_i$ the i-th singular value, eq. (15.170) can be rewritten as

$\displaystyle x = \sum_{i=1}^{n} \dfrac{u_i^H\cdot z}{\sigma_i}\cdot v_i$ (15.172)

It must be mentioned that very small singular values $ \sigma_i$ corrupt the complete result. Such values indicate (nearly) singular (ill-conditioned) matrices $ A$. In such cases, the solution vector $ x$ obtained by zeroing the small $ \sigma_i$'s and then using eq. (15.170) is better than both the direct-method solutions (such as LU decomposition or Gaussian elimination) and the SVD solution with the small $ \sigma_i$'s left non-zero. It may seem paradoxical that this can be so, since zeroing a singular value corresponds to throwing away one linear combination of the set of equations to be solved. The resolution of the paradox is that a combination of equations so corrupted by roundoff error is thrown away precisely because it is at best useless; usually it is worse than useless, since it "pulls" the solution vector way off towards infinity along some direction that is almost a nullspace vector.
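The complete solution step including the zeroing of small singular values can be sketched in Python using numpy's SVD. The cutoff relative to the largest singular value is one common choice and is assumed here, not prescribed by the text:

```python
import numpy as np

def svd_solve(A, z, rcond=1e-12):
    """Solve A x = z via eq. (15.172), zeroing singular values below
    rcond * sigma_max to cope with ill-conditioned matrices."""
    U, sigma, Vh = np.linalg.svd(A)
    cutoff = rcond * sigma.max()
    x = np.zeros(A.shape[1], dtype=np.result_type(A, z))
    for i, s in enumerate(sigma):
        if s > cutoff:  # terms with negligible sigma_i are discarded
            # (u_i^H z) / sigma_i * v_i, with v_i the i-th column of V
            x += (U[:, i].conj() @ z) / s * Vh[i, :].conj()
    return x
```

For a singular $ A$ this returns a solution of minimal norm instead of a vector blown up along the near-nullspace direction.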

Jacobi method

This method quite simply involves rearranging each equation to make each variable a function of the other variables, then making an initial guess for each solution and iterating. For this method it is necessary to ensure that all the diagonal matrix elements $ a_{ii}$ are non-zero. This is given for the nodal analysis and almost always given for the modified nodal analysis. If the linear equation system is solvable this can always be achieved by row exchanges.

The algorithm for performing the iteration step $ k + 1$ is as follows.

$\displaystyle x_{i}^{(k+1)} = \dfrac{1}{a_{ii}}\left(z_i - \sum_{j=1}^{i-1} a_{ij}x_{j}^{(k)} - \sum_{j=i+1}^{n} a_{ij}x_{j}^{(k)}\right) \;\;\;\; \textrm{ for } i = 1, \ldots, n$ (15.173)

This has to be repeated until the new solution vector's $ x^{(k+1)}$ deviation from the previous one $ x^{(k)}$ is sufficiently small.
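A minimal Python sketch of the Jacobi iteration of eq. (15.173); the infinity-norm stopping criterion and the tolerance are illustrative choices:

```python
import numpy as np

def jacobi(A, z, x0=None, tol=1e-10, max_iter=500):
    """Jacobi iteration, eq. (15.173): every x_i^(k+1) is computed
    from the previous iterate x^(k) only."""
    A = np.asarray(A, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    D = np.diag(A)               # diagonal entries a_ii (must be non-zero)
    R = A - np.diagflat(D)       # off-diagonal part of A
    for _ in range(max_iter):
        x_new = (z - R @ x) / D
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x  # may not have converged
```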

The initial guess has no effect on whether the iterative method converges or not, but with a good initial guess (as possibly given in consecutive Newton-Raphson iterations) it converges faster (if it converges at all). To ensure convergence the condition

$\displaystyle \sum_{j = 1, j \ne i}^{n} \left\vert a_{ij}\right\vert \le \left\vert a_{ii}\right\vert \;\;\;\; \textrm{ for } i = 1, \ldots, n$ (15.174)

and in at least one case

$\displaystyle \sum_{j = 1, j \ne i}^{n} \left\vert a_{ij}\right\vert < \left\vert a_{ii}\right\vert$ (15.175)

must apply. If these conditions are not met, the iterative equations may still converge; if they are met, they will definitely converge.

Another simple approach to a convergence criterion for iterative algorithms is the criterion of Schmidt and v. Mises.

$\displaystyle \sqrt{\sum_{i = 1}^n \sum_{j = 1, j \ne i}^n \left\vert\dfrac{a_{ij}}{a_{ii}}\right\vert^2} < 1$ (15.176)
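Both convergence criteria can be checked mechanically. The following Python sketch implements the row-sum conditions of eqs. (15.174)/(15.175) and the Schmidt and v. Mises criterion of eq. (15.176); the function names are illustrative:

```python
import numpy as np

def row_sum_criterion(A):
    """Eqs. (15.174)/(15.175): weak row diagonal dominance with
    strict dominance in at least one row."""
    A = np.abs(np.asarray(A, dtype=float))
    off = A.sum(axis=1) - np.diag(A)     # sum of |a_ij|, j != i
    return np.all(off <= np.diag(A)) and np.any(off < np.diag(A))

def schmidt_von_mises_criterion(A):
    """Eq. (15.176): sqrt(sum_{i != j} |a_ij / a_ii|^2) < 1."""
    A = np.asarray(A, dtype=float)
    Q = A / np.diag(A)[:, None]          # divide each row by its diagonal
    np.fill_diagonal(Q, 0.0)             # exclude the j = i terms
    return np.sqrt(np.sum(np.abs(Q) ** 2)) < 1.0
```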

Gauss-Seidel method

The Gauss-Seidel algorithm is a modification of the Jacobi method. It uses the values of the solution vector already computed within the same iteration step. That is why this iterative method is expected to converge faster than the Jacobi method.

The slightly modified algorithm for performing the $ k + 1$ iteration step is as follows.

$\displaystyle x_{i}^{(k+1)} = \dfrac{1}{a_{ii}}\left(z_i - \sum_{j=1}^{i-1} a_{ij}x_{j}^{(k+1)} - \sum_{j=i+1}^{n} a_{ij}x_{j}^{(k)}\right) \;\;\;\; \textrm{ for } i = 1, \ldots, n$ (15.177)

The remarks about the initial guess $ x^{(0)}$ as well as the convergence criteria noted in the section about the Jacobi method apply to the Gauss-Seidel algorithm as well.
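A minimal Python sketch of the Gauss-Seidel iteration of eq. (15.177); note that the first partial sum uses the entries of the current iterate as soon as they have been updated:

```python
import numpy as np

def gauss_seidel(A, z, x0=None, tol=1e-10, max_iter=500):
    """Gauss-Seidel iteration, eq. (15.177): entries x_j^(k+1) with
    j < i are used as soon as they are available."""
    A = np.asarray(A, dtype=float)
    z = np.asarray(z, dtype=float)
    n = len(z)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s1 = A[i, :i] @ x[:i]             # already updated entries
            s2 = A[i, i + 1:] @ x_old[i + 1:]  # entries from step k
            x[i] = (z[i] - s1 - s2) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x
```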

A comparison

There are direct and iterative methods (algorithms) for solving linear equation systems. Equation systems with large and sparse matrices should rather be solved with iterative methods.

The properties of the discussed algorithms, i.e. programming effort, computing complexity and general remarks, compare as follows.

Laplace expansion: straight forward programming effort, computing complexity $ n!$, prone to numerical errors, very time consuming.

Gaussian elimination: computing complexity $ n^3/3 + n^2/2$, prone to numerical errors.

Gauss-Jordan elimination: computing complexity $ n^3/3 + n^2 - n/3$, prone to numerical errors, computes the inverse besides.

LU decomposition: computing complexity $ n^3/3 + n^2 - n/3$, prone to numerical errors, useful for consecutive solutions.

QR decomposition: computing complexity $ 2n^3 + 3n^3$.

Singular value decomposition: very high programming effort, computing complexity $ 2n^3 + 4n^3$, ill-conditioned matrices can be handled.

Jacobi method: very good for diagonally dominant systems, computing complexity $ n^2$ in each iteration step, possibly no convergence.

Gauss-Seidel method: very good for diagonally dominant systems, computing complexity $ n^2$ in each iteration step, possibly no convergence.

This document was generated by Stefan Jahn on 2007-12-30 using latex2html.