
On block minimal residual methods

R. Bouyouli (a), K. Jbilou (b), A. Messaoudi (c), H. Sadok (b,*)

(a) Faculté des sciences Agdal, Département de mathématiques et d'informatique, Rabat, Maroc
(b) Université du Littoral, zone universitaire de la Mi-voix, bâtiment H. Poincaré, 50 rue F. Buisson, BP 699, F-62280 Calais Cedex, France
(c) École Normale Supérieure Takaddoum, Département d'Informatique, B.P. 5118, Av. Oued Akreuch, Takaddoum, Rabat, Maroc

Received 27 January 2006; accepted 13 April 2006

Abstract

In the present work, we give some new results for block minimal residual methods when applied to multiple linear systems. Using the Schur complement, we develop new expressions for the approximation obtained, for the corresponding residual and for the Frobenius residual norm. These results could be used to derive new convergence properties for the block minimal residual methods.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Block Arnoldi; Krylov subspace; Minimal residual; Iterative methods; Nonsymmetric linear systems; Multiple right-hand sides; Schur complement

1. Introduction

We consider the multiple linear system
$$AX = B, \tag{1.1}$$

where A is an n × n real and large matrix, and B and X are n × s rectangular matrices with s ≪ n. For small matrices A, problem (1.1) can be solved by direct methods, computing the LU decomposition of A. For large problems, many iterative methods have been proposed in recent years; among them are the block Krylov subspace methods such as the block Arnoldi and block GMRES methods [14,17,19]. In the present work we are interested in block minimal residual methods, which include block GMRES. Here we exploit only the structure of the block Krylov matrix and ignore the algorithm that implements the method. Using some properties of the Schur complement, we give new expressions for the approximate solution and the corresponding residual. These results will be used to derive new convergence properties for block minimal residual methods without reference to any particular algorithm. The case s = 1 has been treated by many authors in recent years; see [11,15,16] and the references therein. Our aim is to generalize some of these results to the block case, and this is done partially in this work.

Notation: vec(X) denotes the vector of R^{ns} obtained by stacking the columns of the n × s matrix X, det(Z) is the determinant of the square matrix Z, and tr(Z) denotes the trace of Z. For any matrices X and Y of dimensions n × p and q × l respectively, the Kronecker product X ⊗ Y is the nq × pl matrix defined by X ⊗ Y = [x_{i,j} Y]. Finally, the notation C ≥ D means that the matrix C − D is positive semidefinite, where C and D are symmetric matrices of the same dimension.

* Corresponding author.
E-mail addresses: [email protected] (K. Jbilou), [email protected] (H. Sadok).
Applied Mathematics Letters 20 (2007) 284–289. 0893-9659 © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.aml.2006.04.009

2. Some Schur complement and Kronecker product identities

We first recall the definition of Schur complements [18] and give some of their properties.

Definition 1. Let M be a matrix partitioned into four blocks:
$$M = \begin{pmatrix} A & B \\ C & D \end{pmatrix},$$
where the submatrix D is assumed to be square and nonsingular. The Schur complement of D in M, denoted by (M/D), is defined by
$$(M/D) = A - B D^{-1} C.$$
If D is not a square matrix, then a pseudo-Schur complement of D in M can still be defined [3,6,10]. Generalizations and properties of Schur complements can be found in [1,2,4,5,7–9,12,13].

Proposition 1. Assume that the submatrix D is nonsingular; then
$$\left(\begin{pmatrix} A & B \\ C & D \end{pmatrix}\Big/ D\right) = \left(\begin{pmatrix} B & A \\ D & C \end{pmatrix}\Big/ D\right) = \left(\begin{pmatrix} C & D \\ A & B \end{pmatrix}\Big/ D\right) = \left(\begin{pmatrix} D & C \\ B & A \end{pmatrix}\Big/ D\right),$$
i.e., the Schur complement of the block D is unchanged when the block rows and/or block columns of M are interchanged.

If E is a matrix such that the product E A is well defined, then
$$\left(\begin{pmatrix} E A & E B \\ C & D \end{pmatrix}\Big/ D\right) = E\left(\begin{pmatrix} A & B \\ C & D \end{pmatrix}\Big/ D\right).$$
The proofs of these statements are easily derived from the definition of the Schur complement. The following propositions give expressions for the inverse of the matrix M and for the trace of the Schur complement (M/D).

Proposition 2 ([21, p. 165]). If the matrices M and D are square and nonsingular, then
$$M^{-1} = \begin{pmatrix} (M/D)^{-1} & -(M/D)^{-1} B D^{-1} \\ -D^{-1} C (M/D)^{-1} & D^{-1} + D^{-1} C (M/D)^{-1} B D^{-1} \end{pmatrix}.$$

Proposition 3. Assume that M ∈ R^{n×n} and that D is a nonsingular matrix in R^{m×m}. Then
$$\operatorname{tr}(M/D) = \left(\begin{pmatrix} \operatorname{tr}(A) & \operatorname{vec}(B^T)^T \\ \operatorname{vec}(C) & I_{n-m} \otimes D \end{pmatrix}\Big/ \big(I_{n-m} \otimes D\big)\right) = \frac{\det\begin{pmatrix} \operatorname{tr}(A) & \operatorname{vec}(B^T)^T \\ \operatorname{vec}(C) & I_{n-m} \otimes D \end{pmatrix}}{\det(I_{n-m} \otimes D)}.$$

Proof. The result is directly obtained by using the fact that if A ∈ R^{n×m}, B ∈ R^{m×p} and C ∈ R^{p×n}, then
$$\operatorname{tr}(ABC) = \operatorname{vec}(A^T)^T (I_n \otimes B)\,\operatorname{vec}(C) = \operatorname{vec}(C)^T (I_n \otimes B^T)\,\operatorname{vec}(A^T).$$
So we have
$$\operatorname{tr}(M/D) = \operatorname{tr}(A - B D^{-1} C) = \operatorname{tr}(A) - \operatorname{vec}(B^T)^T (I_{n-m} \otimes D)^{-1} \operatorname{vec}(C) = \left(\begin{pmatrix} \operatorname{tr}(A) & \operatorname{vec}(B^T)^T \\ \operatorname{vec}(C) & I_{n-m} \otimes D \end{pmatrix}\Big/ \big(I_{n-m} \otimes D\big)\right),$$
and the determinant ratio follows since, for a scalar Schur complement, det(N) = (N/D̃) det(D̃). □
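The identities above are easy to check numerically. The following sketch (not from the paper; an illustrative NumPy example on a random matrix, with the block sizes chosen arbitrarily) verifies the block-inverse formula of Proposition 2 and the trace/determinant identity of Proposition 3:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3                                   # M is n x n, D is the trailing m x m block
M = rng.standard_normal((n, n))
A, B = M[:n - m, :n - m], M[:n - m, n - m:]
C, D = M[n - m:, :n - m], M[n - m:, n - m:]

S = A - B @ np.linalg.inv(D) @ C              # Schur complement (M/D)

# Proposition 2: block formula for the inverse of M
Si, Di = np.linalg.inv(S), np.linalg.inv(D)
Minv = np.block([[Si,            -Si @ B @ Di],
                 [-Di @ C @ Si,   Di + Di @ C @ Si @ B @ Di]])
assert np.allclose(Minv, np.linalg.inv(M))

# Proposition 3: tr(M/D) as a ratio of determinants
vec = lambda X: X.ravel(order="F")            # column-stacking vec operator
N = np.block([[np.atleast_2d(np.trace(A)), vec(B.T)[None, :]],
              [vec(C)[:, None],            np.kron(np.eye(n - m), D)]])
ratio = np.linalg.det(N) / np.linalg.det(np.kron(np.eye(n - m), D))
assert np.isclose(np.trace(S), ratio)
```

The column-stacking convention for vec matches the Notation paragraph; in NumPy this is `ravel(order="F")`.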


3. Block minimal residual methods

Let V be an n × s rectangular matrix and consider the block Krylov subspace
$$\mathcal{K}_k(A, V) = \operatorname{span}\{V, AV, \ldots, A^{k-1}V\}$$
generated by the columns of the matrices V, AV, ..., A^{k−1}V. Note that K_k(A, V) is a subspace of R^n corresponding to the sum of the s simple Krylov subspaces K_j(A, v_j), where v_j, j = 1, ..., s, is the jth column of the matrix V. If the columns of an n × s matrix Z are in K_k(A, V), then Z is a linear function of the matrix V:
$$Z = \sum_{i=1}^{k} A^{i-1} V \alpha_i,$$
where α_i ∈ R^{s×s}. We denote this linear operator as Z = Q_k(A) ∘ V, where Q_k is the matrix-valued polynomial defined by
$$Q_k(t) = \sum_{i=1}^{k} t^{i-1} \alpha_i, \quad t \in \mathbb{R}.$$
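To make the operator Z = Q_k(A) ∘ V concrete, here is a small NumPy sketch (illustrative only; the matrix sizes and the coefficients α_i are arbitrary, not from the paper) that forms Σ A^{i−1} V α_i without building powers of A explicitly, and checks that the result lies in the block Krylov subspace:

```python
import numpy as np

rng = np.random.default_rng(1)
n, s, k = 6, 2, 3
A = rng.standard_normal((n, n))
V = rng.standard_normal((n, s))
alphas = [rng.standard_normal((s, s)) for _ in range(k)]   # the s x s coefficients alpha_i

def apply_block_poly(A, V, alphas):
    """Compute Z = Q_k(A) o V = sum_{i=1}^k A^{i-1} V alpha_i."""
    Z = np.zeros_like(V)
    P = V.copy()                     # P holds A^{i-1} V
    for alpha in alphas:
        Z += P @ alpha
        P = A @ P                    # advance to A^i V
    return Z

Z = apply_block_poly(A, V, alphas)

# Z must lie in span{V, AV, A^2 V}: stacking the alpha_i gives its coordinates
Kk = np.hstack([np.linalg.matrix_power(A, i) @ V for i in range(k)])
assert np.allclose(Z, Kk @ np.vstack(alphas))
```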

Let X_0 be a given n × s matrix and let R_0 = B − AX_0 be the corresponding residual. A block minimal residual (Bl-MR) method for solving (1.1) generates, at step k, the approximation X_k^{MR} such that
$$X_k^{MR} - X_0 = Z_k \in \mathcal{K}_k(A, R_0) \tag{3.1}$$
and
$$R_{k,i}^{MR} \perp A\,\mathcal{K}_k(A, R_0), \quad i = 1, \ldots, s, \tag{3.2}$$
where R_{k,i}^{MR} is the ith column of the residual R_k^{MR} = B − AX_k^{MR}.

Let V_k be the block Krylov matrix defined by V_k = [R_0, AR_0, ..., A^{k−1}R_0] and set W_k = AV_k. Then the relation (3.1) is equivalent to
$$X_k^{MR} = X_0 + V_k \Omega_k, \tag{3.3}$$
where Ω_k = [ω_1, ω_2, ..., ω_k]^T with ω_i ∈ R^{s×s}. The orthogonality relation (3.2) implies
$$(W_k^T W_k)\,\Omega_k = W_k^T R_0. \tag{3.4}$$

Note that, as each column of the residual matrix R_k^{MR} is obtained by orthogonally projecting the corresponding column of R_0 onto the block Krylov subspace A K_k(A, R_0), we are dealing with a minimization problem, and then
$$\|R_k^{MR}\|_F = \min_{Z \in \mathcal{K}_k(A, R_0)} \|R_0 - AZ\|_F.$$
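Conditions (3.2)–(3.4) translate directly into a small dense computation. The sketch below (an illustrative NumPy example on a random system with X_0 = 0, not an efficient implementation; a practical code would build an orthonormal block Arnoldi basis instead of raw Krylov blocks) forms V_k and W_k = AV_k, solves the least-squares problem equivalent to the normal equations (3.4), and checks both the orthogonality (3.2) and the minimal-residual property:

```python
import numpy as np

rng = np.random.default_rng(2)
n, s, k = 20, 2, 4
A = rng.standard_normal((n, n))
A /= np.linalg.norm(A, 2)                      # scale A to keep Krylov blocks comparable
B = rng.standard_normal((n, s))
X0 = np.zeros((n, s))
R0 = B - A @ X0

# Block Krylov matrix V_k = [R0, A R0, ..., A^{k-1} R0] and W_k = A V_k
Vk = np.hstack([np.linalg.matrix_power(A, i) @ R0 for i in range(k)])
Wk = A @ Vk

# min ||R0 - Wk Omega||_F, i.e. the normal equations (3.4) in exact arithmetic
Omega, *_ = np.linalg.lstsq(Wk, R0, rcond=None)
Xk = X0 + Vk @ Omega                           # approximation (3.3)
Rk = B - A @ Xk                                # residual R_k^{MR}

# Orthogonality (3.2): each column of Rk is orthogonal to A K_k(A, R0)
assert np.allclose(Wk.T @ Rk, 0, atol=1e-8)

# Minimal-residual property: any other Z in K_k(A, R0) does no better
Ztrial = Vk @ rng.standard_normal((k * s, s))
assert np.linalg.norm(Rk, "fro") <= np.linalg.norm(R0 - A @ Ztrial, "fro") + 1e-8
```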

Let P_k be the matrix-valued polynomial P_k(t) = I_s − Σ_{i=1}^{k} t^i ω_i; then, using (3.4), the residual R_k is given by
$$R_k = R_0 - \sum_{i=1}^{k} A^i R_0\, \omega_i = P_k(A) \circ R_0.$$
Assuming that the matrix W_k^T W_k is nonsingular and using (3.4), the matrix polynomial P_k(t) can be expressed as
$$P_k(t) = I_s - \begin{pmatrix} t I_s & t^2 I_s & \ldots & t^k I_s \end{pmatrix} (W_k^T W_k)^{-1} W_k^T R_0. \tag{3.5}$$

Let us introduce the following block matrix:
$$M_k(t) = \left(\begin{array}{c|c} I_s & t I_s \;\; t^2 I_s \;\; \ldots \;\; t^k I_s \\ \hline W_k^T R_0 & W_k^T W_k \end{array}\right). \tag{3.6}$$
Then, from (3.5) and (3.6), the matrix-valued polynomial P_k can be expressed as the following Schur complement:
$$P_k(t) = (M_k(t)\,/\,W_k^T W_k).$$
The next result gives the approximate solution X_k^{MR} and the residual R_k^{MR} as Schur complements.

Theorem 1. Let X_k^{MR} be the approximation obtained, at step k, by applying a block MR method to (1.1) and let R_k be the corresponding residual. If the matrix W_k^T W_k is nonsingular, then
$$X_k^{MR} = -\left(\begin{pmatrix} -X_0 & V_k \\ W_k^T R_0 & W_k^T W_k \end{pmatrix}\Big/\, W_k^T W_k\right) \tag{3.7}$$
and
$$R_k^{MR} = \left(\begin{pmatrix} R_0 & W_k \\ W_k^T R_0 & W_k^T W_k \end{pmatrix}\Big/\, W_k^T W_k\right). \tag{3.8}$$

Proof. The results are directly derived from the relations (3.2)–(3.4). □

Theorem 2. Let R_k^{MR} be the residual obtained, at step k, by applying a block MR method to (1.1). If the matrix W_k^T W_k is nonsingular, then
$$(R_k^{MR})^T R_k^{MR} = (V_{k+1}^T V_{k+1}\,/\,W_k^T W_k). \tag{3.9}$$

Proof. Invoking the orthogonality relation (3.2), we obtain (R_k^{MR})^T R_k^{MR} = R_0^T R_k^{MR}. Then, using (3.8) and Proposition 1, it follows that
$$(R_k^{MR})^T R_k^{MR} = \left(\begin{pmatrix} R_0^T R_0 & R_0^T W_k \\ W_k^T R_0 & W_k^T W_k \end{pmatrix}\Big/\, W_k^T W_k\right).$$
Now, as V_{k+1} = [R_0, W_k], the partitioned matrix above is exactly
$$\begin{pmatrix} R_0^T \\ W_k^T \end{pmatrix}\begin{pmatrix} R_0 & W_k \end{pmatrix} = V_{k+1}^T V_{k+1},$$
which gives (3.9). □

We can now state the following result, which generalizes an important result for MR methods applied to a linear system with a single right-hand side (see [11,16]).

Theorem 3. Assume that at step k the matrices W_k^T W_k and V_{k+1}^T V_{k+1} are nonsingular; then the residual R_k^{MR} satisfies the following relation:
$$(R_k^{MR})^T R_k^{MR} = \big(E_1^T (V_{k+1}^T V_{k+1})^{-1} E_1\big)^{-1}, \tag{3.10}$$
where E_1 denotes the first s columns of the identity matrix I_{(k+1)s}.

Proof. The matrix V_{k+1} can be decomposed as V_{k+1} = [R_0, W_k]. Then
$$V_{k+1}^T V_{k+1} = \begin{pmatrix} R_0^T R_0 & R_0^T W_k \\ W_k^T R_0 & W_k^T W_k \end{pmatrix}.$$
Since the matrices V_{k+1}^T V_{k+1} and W_k^T W_k are nonsingular, the Schur complement (V_{k+1}^T V_{k+1}/W_k^T W_k) is also nonsingular. Therefore, using Proposition 2 (the (1,1) block of the inverse), it follows that
$$E_1^T (V_{k+1}^T V_{k+1})^{-1} E_1 = \big(V_{k+1}^T V_{k+1}\,/\,W_k^T W_k\big)^{-1},$$
and the result follows from Theorem 2. □
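Both Theorem 2 and Theorem 3 are easy to confirm numerically. The sketch below (an illustrative NumPy example on a random system with X_0 = 0; A is scaled so that the raw Krylov blocks stay reasonably conditioned, and all sizes are arbitrary choices) checks (3.9) and (3.10) against a directly computed block MR residual:

```python
import numpy as np

rng = np.random.default_rng(3)
n, s, k = 20, 2, 3
A = rng.standard_normal((n, n))
A /= np.linalg.norm(A, 2)                      # scale A for numerical safety
R0 = rng.standard_normal((n, s))

Vk = np.hstack([np.linalg.matrix_power(A, i) @ R0 for i in range(k)])
Wk = A @ Vk
Vk1 = np.hstack([R0, Wk])                      # V_{k+1} = [R0, W_k]

Omega = np.linalg.solve(Wk.T @ Wk, Wk.T @ R0)  # normal equations (3.4)
Rk = R0 - Wk @ Omega                           # block MR residual

# Theorem 2: Rk^T Rk equals the Schur complement (V_{k+1}^T V_{k+1} / W_k^T W_k)
G = Vk1.T @ Vk1
schur = G[:s, :s] - G[:s, s:] @ np.linalg.solve(G[s:, s:], G[s:, :s])
assert np.allclose(Rk.T @ Rk, schur)

# Theorem 3: Rk^T Rk = (E1^T (V_{k+1}^T V_{k+1})^{-1} E1)^{-1}
E1 = np.eye((k + 1) * s)[:, :s]                # first s columns of I_{(k+1)s}
assert np.allclose(Rk.T @ Rk, np.linalg.inv(E1.T @ np.linalg.inv(G) @ E1))
```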

Using the Kantorovich inequality [20], we obtain the next result.

Theorem 4. At step k, the residual R_k^{MR} satisfies the following relation:
$$\frac{2\chi(V_{k+1})}{1 + \chi(V_{k+1})^2} \le \frac{\|R_k^{MR}\|_F}{\|R_0\|_F} \le 1,$$
where χ(V_{k+1}) is the condition number of the matrix V_{k+1}.

Proof. The initial residual R_0 can be written as R_0 = V_{k+1} E_1, where E_1 denotes the first s columns of the identity matrix I_{(k+1)s}. Then R_0^T R_0 = E_1^T V_{k+1}^T V_{k+1} E_1. Therefore, using the Kantorovich inequality in the matrix case [20], we obtain
$$R_0^T R_0 \ge \big(E_1^T (V_{k+1}^T V_{k+1})^{-1} E_1\big)^{-1} \ge \left(\frac{2\chi(V_{k+1})}{1 + \chi(V_{k+1})^2}\right)^2 R_0^T R_0. \tag{3.11}$$
Then, using the result of Theorem 3 and the relation (3.11), we get
$$R_0^T R_0 \ge (R_k^{MR})^T R_k^{MR} \ge \left(\frac{2\chi(V_{k+1})}{1 + \chi(V_{k+1})^2}\right)^2 R_0^T R_0. \tag{3.12}$$
Applying the trace function to each side of (3.12) and taking square roots, we get
$$\|R_0\|_F \ge \|R_k^{MR}\|_F \ge \frac{2\chi(V_{k+1})}{1 + \chi(V_{k+1})^2}\, \|R_0\|_F. \;\Box$$
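The two-sided bound of Theorem 4 can be observed numerically. The following sketch (again an illustrative random NumPy example with X_0 = 0; χ is taken as the 2-norm condition number) computes the Frobenius residual ratio and checks it against the bound:

```python
import numpy as np

rng = np.random.default_rng(4)
n, s, k = 20, 2, 3
A = rng.standard_normal((n, n))
A /= np.linalg.norm(A, 2)                      # scale A for numerical safety
R0 = rng.standard_normal((n, s))

Vk = np.hstack([np.linalg.matrix_power(A, i) @ R0 for i in range(k)])
Wk = A @ Vk
Rk = R0 - Wk @ np.linalg.solve(Wk.T @ Wk, Wk.T @ R0)   # block MR residual

Vk1 = np.hstack([R0, Wk])                      # V_{k+1} = [R0, W_k]
chi = np.linalg.cond(Vk1)                      # 2-norm condition number of V_{k+1}
ratio = np.linalg.norm(Rk, "fro") / np.linalg.norm(R0, "fro")

# Theorem 4: 2*chi/(1 + chi^2) <= ||Rk||_F / ||R0||_F <= 1
assert 2 * chi / (1 + chi**2) <= ratio <= 1 + 1e-12
```

Note that the lower bound is only informative while χ(V_{k+1}) stays moderate; once V_{k+1} becomes ill conditioned the bound degrades, which is consistent with the discussion after the theorem.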

The result of the preceding theorem shows that the residual norm cannot be reduced significantly as long as the block Krylov matrix V_{k+1} is well conditioned. Next, we give an expression for the Frobenius norm of the residual R_k^{MR}. As in the case s = 1 [11,16], this new expression could be used to obtain convergence results for block minimal residual methods.

Theorem 5. Assume that at step k the matrix W_k^T W_k is nonsingular; then we have
$$\|R_k^{MR}\|_F^2 = \frac{\det\begin{pmatrix} \|R_0\|_F^2 & \operatorname{vec}(W_k^T R_0)^T \\ \operatorname{vec}(W_k^T R_0) & I_s \otimes W_k^T W_k \end{pmatrix}}{\det(W_k^T W_k)^s}. \tag{3.13}$$

Proof. From the relation (3.8), we obtain
$$(R_k^{MR})^T R_k^{MR} = \big(R_0 - W_k (W_k^T W_k)^{-1} W_k^T R_0\big)^T R_k^{MR}.$$
Therefore, using the orthogonality relation (3.2), we get
$$(R_k^{MR})^T R_k^{MR} = R_0^T R_k^{MR} = R_0^T R_0 - R_0^T W_k (W_k^T W_k)^{-1} W_k^T R_0.$$
This shows that the matrix (R_k^{MR})^T R_k^{MR} can be expressed as the following Schur complement:
$$(R_k^{MR})^T R_k^{MR} = \left(\begin{pmatrix} R_0^T R_0 & R_0^T W_k \\ W_k^T R_0 & W_k^T W_k \end{pmatrix}\Big/\, W_k^T W_k\right). \tag{3.14}$$
Now, applying Proposition 3 to (3.14), it follows that
$$\|R_k^{MR}\|_F^2 = \operatorname{tr}\big((R_k^{MR})^T R_k^{MR}\big) = \frac{\det\begin{pmatrix} \|R_0\|_F^2 & \operatorname{vec}(W_k^T R_0)^T \\ \operatorname{vec}(W_k^T R_0) & I_s \otimes W_k^T W_k \end{pmatrix}}{\det(I_s \otimes W_k^T W_k)} = \frac{\det\begin{pmatrix} \|R_0\|_F^2 & \operatorname{vec}(W_k^T R_0)^T \\ \operatorname{vec}(W_k^T R_0) & I_s \otimes W_k^T W_k \end{pmatrix}}{\det(W_k^T W_k)^s},$$
which shows the result. □
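The determinant expression (3.13) can also be verified directly. The sketch below (an illustrative random NumPy example with X_0 = 0 and small, arbitrary sizes; A is scaled to keep the determinants well scaled) assembles the bordered matrix of (3.13) and compares the ratio of determinants with the squared Frobenius norm of the computed residual:

```python
import numpy as np

rng = np.random.default_rng(5)
n, s, k = 15, 2, 2
A = rng.standard_normal((n, n))
A /= np.linalg.norm(A, 2)                      # scale A for numerical safety
R0 = rng.standard_normal((n, s))

Vk = np.hstack([np.linalg.matrix_power(A, i) @ R0 for i in range(k)])
Wk = A @ Vk
Rk = R0 - Wk @ np.linalg.solve(Wk.T @ Wk, Wk.T @ R0)   # block MR residual

vec = lambda X: X.ravel(order="F")             # column-stacking vec
g = vec(Wk.T @ R0)                             # vec(W_k^T R0), length k*s*s
Nmat = np.block([[np.atleast_2d(np.linalg.norm(R0, "fro") ** 2), g[None, :]],
                 [g[:, None], np.kron(np.eye(s), Wk.T @ Wk)]])

lhs = np.linalg.norm(Rk, "fro") ** 2
rhs = np.linalg.det(Nmat) / np.linalg.det(Wk.T @ Wk) ** s
assert np.isclose(lhs, rhs, rtol=1e-6)
```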

4. Conclusion

In this work, we presented some results for block minimal residual methods when applied to multiple linear systems. These results will be used to develop convergence properties similar to those obtained recently for the case s = 1. This is the subject of a forthcoming paper.

References

[1] T. Ando, Generalized Schur complements, Linear Algebra Appl. 27 (1979) 173–186.
[2] C. Brezinski, Other manifestations of the Schur complement, Linear Algebra Appl. 111 (1988) 231–247.
[3] C. Brezinski, M. Redivo Zaglia, A Schur complement approach to a general extrapolation algorithm, Linear Algebra Appl. 368 (2003) 279–301.
[4] R.A. Brualdi, H. Schneider, Determinantal identities: Gauss, Schur, Cauchy, Sylvester, Kronecker, Jacobi, Binet, Laplace, Muir and Cayley, Linear Algebra Appl. 52–53 (1983) 769–791.
[5] D. Carlson, What are Schur complements, anyway? Linear Algebra Appl. 74 (1986) 257–275.
[6] D. Carlson, E.V. Haynsworth, T. Markham, A generalization of the Schur complement by means of the Moore–Penrose inverse, SIAM J. Appl. Math. 26 (1974) 169–175.
[7] G. Corach, A. Maestripieri, D. Stojanoff, Generalized Schur complements and oblique projections, Linear Algebra Appl. 341 (2002) 259–272.
[8] D.E. Crabtree, E.V. Haynsworth, An identity for the Schur complement of a matrix, Proc. Amer. Math. Soc. 22 (1969) 364–366.
[9] R.W. Cottle, Manifestations of the Schur complement, Linear Algebra Appl. 8 (1974) 189–211.
[10] G. Marsaglia, G.P.H. Styan, Equalities and inequalities for ranks of matrices, Linear Multilinear Algebra 2 (1974) 269–292.
[11] I.C.F. Ipsen, Expressions and bounds for the residual in GMRES, BIT 40 (2000) 524–533.
[12] A. Messaoudi, Matrix recursive projection and interpolation algorithms, Linear Algebra Appl. 202 (1994) 71–89.
[13] D.V. Ouellette, Schur complements and statistics, Linear Algebra Appl. 36 (1981) 187–295.
[14] Y. Saad, M.H. Schultz, GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comput. 7 (1986) 856–869.
[15] Y. Saad, Iterative Methods for Sparse Linear Systems, PWS Publishing, New York, 1995.
[16] H. Sadok, Analysis of the convergence of the minimal and the orthogonal residual methods, Numer. Algorithms 40 (2005) 201–216.
[17] M. Sadkane, Block Arnoldi and Davidson methods for unsymmetric large eigenvalue problems, Numer. Math. 64 (1993) 687–706.
[18] I. Schur, Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind, J. Reine Angew. Math. 147 (1917) 205–232.
[19] B. Vital, Étude de quelques méthodes de résolution de problèmes linéaires de grande taille sur multiprocesseur, Ph.D. Thesis, Université de Rennes, Rennes, France, 1990.
[20] F. Zhang, Matrix Theory, Springer-Verlag, New York, 1999.
[21] F. Zhang, The Schur Complement and its Applications, Springer, New York, 2005.
