Choice 1: ...; Choice 2: ... (5)

In practice we took v_4 = sigma_2 for computing s_{k+2}, since it leads to Choice 2, which needs fewer computations. With this strategy we hope to save work in the computation of s_{k+2}.

Proposition 5.1. When s_{k+2} is computed by Choice 2,

p*_{k+1} B r_{k+1} = (v_4/mu) r*_{k+2} B* p_k.

Proof. Since r*_{k+2} B* s_k = 0 and by (4) we have

p*_{k+1} B r_{k+1} = v_4 r*_{k+2} B* s_{k+1} = (v_4/mu) r*_{k+2} B* (p_k - v_1 s_k) = (v_4/mu) r*_{k+2} B* p_k. □
5.2. Computation of A_i

We have

p*_k B q_k = u_1 p*_k B r_k = u_1 p*_k r_{k+1},

and

p*_{k+1} B q_k = u_1 p*_{k+1} B r_k = u_1 p*_{k+1} r_{k+1} = u_1 (u_3 omega_k + u_4 omega_{k+1}).

M. Khelifi / Lanczos maximal algorithm, 187
And in the same way we obtain p*_k B q_{k+1} and p*_{k+1} B q_{k+1} in terms of quantities already available and of q*_{k+1} B* p_k. Therefore, the matrices B_i, Gamma_i and A_i may be computed with only one supplementary inner product, q*_{k+1} B* p_k.
6. The Lanczos maximal algorithm

Let us define the matrices P_i and Q_i by:

Q_i = q_k, P_i = p_k, if step i is single;
Q_i = (q_k, q_{k+1}), P_i = (p_k, p_{k+1}), if step i is double.
Moreover, we set R_i = (r_k, r_{k+1}) and S_i = (s_k, s_{k+1}) at step i. Since step i may be single or double, we note that i is not the dimension of the matrix T produced at step i.

6.1. Step i of LM

We know: P_{i-1}, Q_{i-1} (both are null if i = 1); Z_i, a 2 x 2 matrix if step (i - 1) is double (a 1 x 1 matrix if not), Z_1 = 1; r_k, s_k, omega = omega_k = s*_k r_k, ||r_k||, ||s_k||.

(1) Compute r_{k+1} = B r_k - Q_{i-1} Z_i w and s_{k+1} = B* s_k - P_{i-1} Z*_i w.
(2) Compute the inner products r*_k r_{k+1}, s*_k s_{k+1}, s*_k r_{k+1} = s*_{k+1} r_k, sigma_3 = s*_{k+1} r_{k+1}, ||r_{k+1}||, ||s_{k+1}||.
(3) phi_1 = omega/(||r_k|| ||s_k||).
(4) u = r_{k+1} - (r*_k r_{k+1}/||r_k||^2) r_k, v = s_{k+1} - (s*_k s_{k+1}/||s_k||^2) s_k.
(5) phi_2 = min{|chi_1|, |chi_2|}, where chi_1 and chi_2 are the two cosines that a double step would produce (use Choice 3.4).
(6) If |phi_1| < Tol and phi_2 < Tol, then STOP with an error message.
(7) If |phi_1| >= 2 phi_2, then take a single step, otherwise take a double step.
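As a reading aid, the decision logic of steps (3)-(7) can be sketched in a few lines. The function and variable names are assumptions, and phi_2 is simplified here to a single cosine rather than the exact min{|chi_1|, |chi_2|} of step (5):

```python
import numpy as np

def choose_step(r_k, s_k, r_k1, s_k1, tol=1e-12):
    """Sketch of the step-selection test (3)-(7); names are assumptions,
    and phi2 is a simplified stand-in for the paper's min{|chi1|, |chi2|}."""
    omega = s_k @ r_k
    phi1 = omega / (np.linalg.norm(r_k) * np.linalg.norm(s_k))   # step (3)
    # Step (4): components of r_{k+1}, s_{k+1} orthogonal to r_k, s_k.
    u = r_k1 - (r_k @ r_k1) / (r_k @ r_k) * r_k
    v = s_k1 - (s_k @ s_k1) / (s_k @ s_k) * s_k
    phi2 = abs(v @ u) / (np.linalg.norm(u) * np.linalg.norm(v))  # simplified step (5)
    if abs(phi1) < tol and phi2 < tol:                           # step (6)
        raise RuntimeError("breakdown: both cosines below tolerance")
    return "single" if abs(phi1) >= 2 * phi2 else "double"       # step (7)
```

A large |phi_1| keeps the cheap single step; a small |phi_1| with a usable phi_2 triggers the double step instead of a breakdown.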
(8) (a) Factorization of W.
Single step: beta_k = sqrt(|omega|), gamma_k = omega/beta_k.
Double step: Delta = diag(||r_k||, ||r_{k+1} - lambda r_k||), with lambda = r*_k r_{k+1}/||r_k||^2 (so that r_{k+1} - lambda r_k = u), and W receives its maximal factorization.

(b) Compute P_i and Q_i.
Single step: q_k = r_k/beta_k, p_k = s_k/gamma_k.
Double step: Q_i = R_i U and P_i = S_i V, where U and V are the triangular factors of W.

(c) Compute Gamma_i and B_i.
Single step: Gamma_i = Z_i gamma_k.
Double step: Gamma_i and B_i are obtained from Z_i, omega and Delta.

(d) Compute the new vectors r_{k+2} and s_{k+2} from B Q_i and B* P_i.

(e) Compute A_i: form A_i by Computation 5.2.

(f) Biorthogonalization: r_{k+2} and s_{k+2} are biorthogonalized against q_k and p_k with a coefficient alpha_k, the formulas depending on whether |u_2| > 2|u_4| or not.

(g) Inner products for step i + 1: compute

||s_{k+1} + alpha_k s_k||^2 = ||s_{k+1}||^2 + 2 alpha_k s*_k s_{k+1} + alpha_k^2 ||s_k||^2,

and similarly for r.

(h) Compute Z_{i+1}.
Single step: Z_{i+1} is a scalar.
Double step: Z_{i+1} is the 2 x 2 matrix with unit diagonal and off-diagonal entries u_4/u_2 and u_2/u_4, the choice depending on whether |u_2| > 2|u_4|.
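The single-step branch of (8)(a)-(b) amounts to a scaling of r_k and s_k. A minimal sketch, with assumed names:

```python
import numpy as np

def single_step_normalize(r_k, s_k):
    """Single-step branch of (8)(a)-(b): with omega = s_k* r_k,
    beta_k = sqrt(|omega|) and gamma_k = omega/beta_k, the Lanczos
    vectors q_k = r_k/beta_k and p_k = s_k/gamma_k satisfy p_k* q_k = 1.
    A sketch; variable names are assumptions."""
    omega = s_k @ r_k
    beta = np.sqrt(abs(omega))
    gamma = omega / beta
    return s_k / gamma, r_k / beta   # (p_k, q_k)
```

By construction p*_k q_k = omega/(beta_k gamma_k) = 1, which is exactly the biorthogonality the factorization is meant to enforce.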
6.2. Total cost of step i

Let us call VOP (vectorial operation) a vectorial addition or a scalar-vector product. With this notation an inner product between two n-vectors is equivalent to two VOPs, because u*v = u_1 v_1 + ... + u_n v_n (n multiplications and (n - 1) additions). Suppose that step i - 1 is single; then the total cost for step i is:

LAL algorithm: 4 matrix-vector products and 44 VOPs;
LM algorithm: 4 matrix-vector products and 46 VOPs, if |v_2| > 2|v_4|,
4 matrix-vector products and 44 VOPs, if |v_2| <= 2|v_4|;

because we have V = [v_1 v_2; v_3 v_4] and usually none of the v_i is null.

Remarks. (1) The matrix-vector products have been counted separately because their cost depends on the number of nonzero elements of the matrix B and not on the LAL or LM algorithm. (The same subroutine for a matrix-vector product is used in the two algorithms.)
(2) When two single steps are made successively, the two algorithms LAL and LM require 4 matrix-vector products and 48 VOPs.
(3) The cost of one double step of the LM algorithm is slightly higher than that of LAL. But since LM makes a maximal factorization (the factorization of LAL is a particular case of it), it makes the value of min{cos(p_j, q_j), cos(p_{j+1}, q_{j+1})} the largest possible. Therefore LM will take double steps more often than LAL, and generally the number of operations will be equal to or less than with the LAL algorithm.
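The counts above can be packaged into a small cost helper. The double-step figures come from the text; the single-step figures (2 products, 24 VOPs) are inferred from Remark (2) and are an assumption of this sketch:

```python
def step_cost(kind, algorithm="LM", big_v2=False):
    """(matrix-vector products, VOPs) for one step, per Section 6.2.
    Double step: LAL costs 4 products + 44 VOPs; LM costs 46 VOPs when
    |v2| > 2|v4| and 44 otherwise.  Single step: 2 products + 24 VOPs,
    inferred from Remark (2) (two successive single steps cost
    4 products and 48 VOPs); this inference is an assumption."""
    if kind == "single":
        return (2, 24)
    if algorithm == "LM" and big_v2:
        return (4, 46)
    return (4, 44)
```

Summing `step_cost` over a step sequence gives the per-run comparison used in the numerical examples below.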
7. Eigenelement approximations

At each step i, the LM algorithm produces left and right Lanczos vectors P_i, Q_i which are n x 1 or n x 2. We can write Q~_j = (Q_1, ..., Q_j) and P~_j = (P_1, ..., P_j), with dimension of P~_i = dimension of Q~_i, i = 1, ..., j. Then we can generalize the relations (1) to the LM algorithm.

Proposition 7.1. In the LM algorithm, if we note k + 1 = order(P~_j) = order(Q~_j), then

B Q~_j = Q~_j T_j + r_{k+2} e*_{k+1},

and

P~*_j B = T_j P~*_j + e_{k+1} s*_{k+2}, if step j is single,
P~*_j B = T_j P~*_j + (e_k, e_{k+1}) Z_j s*_{k+2}, if step j is double.
Proof. See [3, Proposition III.7.2]. □

Therefore the computable error estimates of Ritz eigenelements in the LM algorithm become as follows.

Proposition 7.2. Let (mu, u, u*) be an ...
Proof. As we prove the first and the second equalities as in the standard algorithm, let us consider the case when step j is double. We have

B* P~_j u = P~_j T*_j u + s_{k+2} Z*_j (e_k, e_{k+1})* u.

Since T*_j u = mu u, we have the result. □

Remark. As in the standard Lanczos algorithm, we can obtain a residue approximation without calculating it explicitly.

8. Connection with Arnoldi's method

If the two Krylov matrices

R = [r_1, B r_1, ..., B^{n-1} r_1] and S = [s_1, B* s_1, ..., (B*)^{n-1} s_1] (6)

are nonsingular, we can use a natural generalization of the Lanczos maximal algorithm to obtain a maximal factorization of the n x n matrix W = S*R, which maximizes min{cos(p_j, q_j): 1 <= j <= n}.

Proposition 8.1. With the notation of (6), if W = S*R is nonsingular, then the Lanczos maximal algorithm and Arnoldi's method give the same result.

Proof. Since span(q_1, q_2, ..., q_j) = span(r_1, B r_1, ..., B^{j-1} r_1) for all j <= n, it is sufficient to show that there exists a factorization such that p_j = q_j for all j <= n. Since R is nonsingular, by the Gram-Schmidt process there exists an upper triangular matrix U such that Q = R U^{-1} is an orthogonal matrix, thus Q*Q = I_n. But we have P*Q = I_n with P = S W^{-*} U*, and therefore we have P = Q. □
9. Numerical examples

We give three examples which compare, for the first one, the performance of the LAL and LM algorithms and, for the others, the LM and the standard Lanczos algorithms for large unsymmetric matrices. All results were obtained on the IBM PC computer, using double precision (mantissa of 52 digits).

Example 9.1 (LAL and LM algorithms). B is a 100 x 100 upper triangular matrix with distinct eigenvalues

lambda_i = {approximation of 10(1 - 1/i) with 4 digits}, i = 1, ..., 100,

for example lambda_1 = 0, lambda_2 = 5.0, lambda_3 = 6.667. The superdiagonal elements were randomly chosen numbers in [-10, +10], r_1 = s_1 = [1, 1, ..., 1]* and the number of steps taken is 20. A few of the twenty computed eigenvalues are complex, but in Table 1 we give the five significant eigenvalues, which are real.

Table 1
Significant eigenvalues of matrix B

LAL algorithm              LM algorithm
-0.1243449787580175D-13    0.1776356839400250D-14
0.5000000000004762D+01     0.4999999999999988D+01
0.6667000074960642D+01     0.6666000000000784D+01
0.7500013762325628D+01     0.7499999987407891D+01
0.7999476587787392D+01     0.8000054937628480D+01

We see that the LM algorithm gives one to five digits more than the LAL algorithm. Moreover we have:

min cos(p_i, q_i) = 0.11D-02 with LAL,    min cos(p_i, q_i) = 0.67D-02 with LM.
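The test matrix of Example 9.1 is easy to reproduce; in this sketch the random seed is an arbitrary choice, so the superdiagonal differs from the paper's:

```python
import numpy as np

# Upper triangular test matrix of Example 9.1: diagonal entries are
# 10(1 - 1/i) rounded to 4 significant digits, superdiagonal entries
# random in [-10, 10] (the seed below is an arbitrary choice).
rng = np.random.default_rng(1)
n = 100
diag = np.array([float(f"{10 * (1 - 1 / i):.4g}") for i in range(1, n + 1)])
B = np.diag(diag)
i = np.arange(n - 1)
B[i, i + 1] = rng.uniform(-10.0, 10.0, n - 1)
# B is triangular, so its eigenvalues are exactly the diagonal entries.
assert diag[0] == 0.0 and diag[1] == 5.0 and diag[2] == 6.667
assert np.allclose(np.tril(B, -1), 0.0)
```

Because the matrix is triangular, the exact eigenvalues are known in advance, which is what makes the digit-by-digit comparison of Table 1 possible.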
Example 9.2 [7]. The matrix B represents the transition matrix of the Markov chain describing a random walk on a (k + 1) x (k + 1) triangular grid. The probability of jumping from the node (i, j) to either of the nodes (i - 1, j) or (i, j - 1) is given by

pd(i, j) = (1/2)(i + j)/k.

This probability is doubled if either of i or j is 0. The probability of jumping from the node (i, j) to either of the nodes (i + 1, j) or (i, j + 1) is given by

pu(i, j) = 1/2 - (1/2)(i + j)/k.

(This transition can occur only for i + j < k.) The nodes are numbered in the order (0, 0), (1, 0), (2, 0), ..., (k, 0), (0, 1), (1, 1), ..., and the dimension of B is (1/2)(k + 1)(k + 2). We are interested in the computation of the left eigenvector corresponding to the eigenvalue unity because it represents the steady-state probabilities of the chain. The LM and LANC algorithms are used with a technique of partial reorthogonalization of the Lanczos vectors described in [3]. The dimension of the matrix B is 496 (k = 30), the maximum step before restart is 50 and the stopping criterion is that the residual norms of the left and right eigenvectors (computed by Proposition 7.2) are less than 10^-5 (see Table 2). We took r_1 = s_1 = (|sin(2i + 1)|)_i, 1 <= i <= 496.
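The transition matrix of Example 9.2 can be assembled directly from the probabilities above. The function name in this sketch is an assumption; the assertions check that B is stochastic (so the eigenvalue unity and its left eigenvector exist) and that the dimension is 496 for k = 30:

```python
import numpy as np

def walk_matrix(k):
    """Transition matrix of the random walk of Example 9.2 on the
    triangular grid {(i, j): i + j <= k}, nodes numbered
    (0,0), (1,0), ..., (k,0), (0,1), (1,1), ... as in the text."""
    nodes = [(i, j) for j in range(k + 1) for i in range(k + 1 - j)]
    idx = {node: t for t, node in enumerate(nodes)}
    n = len(nodes)                       # = (k+1)(k+2)/2
    B = np.zeros((n, n))
    for (i, j), t in idx.items():
        pd = 0.5 * (i + j) / k           # down/left move probability
        pu = 0.5 - 0.5 * (i + j) / k     # up/right move probability
        if i > 0 and j > 0:
            B[t, idx[(i - 1, j)]] = pd
            B[t, idx[(i, j - 1)]] = pd
        elif i > 0:                      # j == 0: probability doubled
            B[t, idx[(i - 1, j)]] = 2 * pd
        elif j > 0:                      # i == 0: probability doubled
            B[t, idx[(i, j - 1)]] = 2 * pd
        if i + j < k:                    # up/right moves only below the diagonal
            B[t, idx[(i + 1, j)]] = pu
            B[t, idx[(i, j + 1)]] = pu
    return B

B = walk_matrix(30)
assert B.shape == (496, 496)             # (1/2)(k+1)(k+2) = 496 for k = 30
assert np.allclose(B.sum(axis=1), 1.0)   # stochastic: row sums are 1
```

Since every row sums to one, B has the eigenvalue 1 with right eigenvector of all ones, and the sought steady-state distribution is the corresponding left eigenvector.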
Table 2

Algorithm   Number of steps   Total cost (VOPs)   Max. left residual norm   Max. right residual norm
LANC        82                5346                0.98 x 10^-6              0.23 x 10^-5
LM          85                5254                0.14 x 10^-5              0.25 x 10^-6

Table 3

Algorithm   Number of steps   Total cost (VOPs)   Max. left residual norm   Max. right residual norm
LANC        59                3452                0.36 x 10^-6              0.31 x 10^-6
LM          55                3302                0.62 x 10^-6              0.41 x 10^-6

Although LM requires more steps, its total cost is less. The explanation is that the standard Lanczos algorithm LANC requires more orthogonalizations than LM.
Example 9.3 [3]. Consider the matrix arising from the discretization by centered differences of the elliptic partial differential operator on a uniform n x n grid, with h = 1/85, on the unit square, with tau(x, y) = (x - 3y)^2 e^{x-y} and u(x, y) = 0 on the boundary. Taking n = 84 yields a matrix B of dimension 7056, and we took r_1 = s_1 = (sin(2i + 1))_i, 1 <= i <= 7056. We computed the five greatest eigenvalues of B by the LANC and LM algorithms and obtained the following results (see Table 3):

lambda_1 = 0.110107D+04, lambda_2 = 0.954296D+03, lambda_3 = 0.944085D+03,
lambda_4 = -0.842304D+03, lambda_5 = -0.833378D+03.
10. Conclusions

In comparison with the LAL algorithm, the LM algorithm is not more complicated and it presents the advantage of having the best possible value for min{cos(p_i, q_i): 1 <= i <= j}.
References

[1] G.H. Golub and C.F. Van Loan, Matrix Computations (North Oxford Academic, Oxford, 1983).
[2] W. Kahan, B.N. Parlett and E. Jiang, Residual bounds on approximate eigensystems of nonnormal matrices, SIAM J. Numer. Anal. 19 (1982) 470-484.
[3] M. Khelifi, Algorithme de Lanczos pour le calcul des valeurs et vecteurs propres de matrices non symétriques de grande taille, Thesis, University of Lille, France (1989).
[4] R.S. Martin, G. Peters and J.H. Wilkinson, The QR algorithm for real Hessenberg matrices, in: Handbook for Automatic Computation 2 (Springer, New York, 1971).
[5] B.N. Parlett, D.R. Taylor and Z.A. Liu, A look-ahead Lanczos algorithm for unsymmetric matrices, Math. Comp. 44 (1985) 105-124.
[6] Y. Saad, Variations on Arnoldi's method for computing eigenelements of large unsymmetric matrices, Linear Algebra Appl. 34 (1980) 269-295.
[7] Y. Saad, Chebyshev acceleration techniques for solving nonsymmetric eigenvalue problems, Math. Comp. 42 (1984) 567-588.
[8] J.H. Wilkinson, The Algebraic Eigenvalue Problem (Clarendon Press, Oxford, 1965).