# Tensor ranks for the inversion of tensor-product binomials


Journal of Computational and Applied Mathematics 234 (2010) 3170–3174 · doi:10.1016/j.cam.2010.02.006


Eugene Tyrtyshnikov, Institute of Numerical Mathematics, Russian Academy of Sciences, Gubkin Street 8, Moscow 119333, Russian Federation

Article history: received 8 February 2008. Dedicated to the memory of Gene Golub.

**Abstract.** The main result reads: if a nonsingular matrix A of order n = pq is a tensor-product binomial with two factors, then the tensor rank of A^{-1} is bounded from above by min{p, q}. The estimate is sharp, and in the worst case it amounts to √n.

MSC: 15A12; 65F10; 65F15

Keywords: tensor ranks; Kronecker product; low-rank matrices; inverse matrices; multilevel matrices; Toeplitz matrices

## 1. Introduction

Tensor decompositions are becoming a subject accumulating research interest in several important fields. Among them one should mention complexity theory for polynomial and matrix computations [1,2], data analysis supported by the Kruskal uniqueness property and the Tucker reduction [3,2], and numerous multidimensional applications (see [4]). Despite an ever-increasing interest and several significant results (for some bibliography see [3,5,2,6,4]), many natural questions about seemingly ''simple'' cases are still not answered. In this paper we present some nontrivial tensor rank estimates for inverse matrices.

Consider a matrix A of order n = pq and its tensor-product decomposition

$$
A = \sum_{s=1}^{r} U_s \otimes V_s \tag{1}
$$

with the minimal possible number of terms. In this case we call r the tensor rank of A and write r = tRank(A). The matrices U_s and V_s are of order p and q, respectively, and U ⊗ V means the Kronecker (tensor) product of the matrices U and V. By definition, if U = [u_{ij}] is of order p and V = [v_{kl}] is of order q, then U ⊗ V is a matrix of order n = pq with the following block structure:

$$
U \otimes V = \begin{bmatrix} u_{11} V & \cdots & u_{1p} V \\ \vdots & \ddots & \vdots \\ u_{p1} V & \cdots & u_{pp} V \end{bmatrix}.
$$
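The block structure above pins down how an entry of U ⊗ V is indexed; a minimal NumPy sketch (not from the paper — the sizes and random matrices are illustrative, and indices here are 0-based) confirms that row (i, k) and column (j, l) of the Kronecker product hold U[i, j] · V[k, l]:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 3, 4
U = rng.standard_normal((p, p))
V = rng.standard_normal((q, q))

# Row pair (i, k) and column pair (j, l) map to flat indices i*q + k and j*q + l.
K = np.kron(U, V)
i, j, k, l = 1, 2, 0, 3
assert np.isclose(K[i * q + k, j * q + l], U[i, j] * V[k, l])
```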


Of course, the tensor rank of A is calculated for some fixed p and q whose product is the order of A, so a better notation might be tRank(A) = tRank_{p,q}(A); here, however, we still skip p and q for the sake of brevity. It is obvious that a pair of indices (i, k) with 1 ≤ i ≤ p, 1 ≤ k ≤ q naturally points to a row in A (and in U ⊗ V), and similarly, another pair (j, l) indicates a column in A. It follows that

$$
(A)_{(i,k),(j,l)} = \sum_{s=1}^{r} (U_s)_{ij} (V_s)_{kl}. \tag{2}
$$

This suggests reshaping A into a matrix B = reshape(A) in the following way:

$$
(B)_{(i,j),(k,l)} = (A)_{(i,k),(j,l)}, \tag{3}
$$

the sizes of A and B being (pq) × (pq) and p² × q², respectively. Then Eqs. (2) and (3) show that B is the sum of r column-by-row matrices. Consequently (see [7]),

$$
\mathrm{tRank}(A) = \mathrm{rank}(B). \tag{4}
$$
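Identity (4) is straightforward to observe numerically; the sketch below (random factors with illustrative sizes, not from the paper) builds A as a sum of r Kronecker products, performs the reshape of (3), and checks that the matrix rank of B recovers r:

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, r = 4, 5, 3
Us = rng.standard_normal((r, p, p))
Vs = rng.standard_normal((r, q, q))
A = sum(np.kron(Us[s], Vs[s]) for s in range(r))

# reshape(A): B[(i,j),(k,l)] = A[(i,k),(j,l)], of size p^2 x q^2.
B = A.reshape(p, q, p, q).transpose(0, 2, 1, 3).reshape(p * p, q * q)

# Generic factors give exactly r independent column-by-row terms.
assert np.linalg.matrix_rank(B) == r
```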

We are interested in sharp estimates for the tensor rank of A^{-1}, provided that A is nonsingular. A trivial bound emanating from (4) reads tRank(A^{-1}) ≤ n and does not reveal any dependence on r, or on p and q. The results presented below assume that the entries of all matrices are complex numbers or belong to a subfield of the complex numbers; whether they are valid for finite fields is an open question.

## 2. Main result

Let n = pq; the tensor decompositions are considered for fixed p and q. Assume that A = U ⊗ V is nonsingular. Then both U and V are nonsingular and, as is easy to check,

$$
(U \otimes V)^{-1} = U^{-1} \otimes V^{-1}.
$$

Therefore, if tRank(A) = 1, then tRank(A^{-1}) = 1. All the cases with tRank(A) ≥ 2 are significantly more intricate.

**Theorem 2.1.** Let a matrix A of order n = pq be nonsingular, and assume that tRank(A) = 2. Then tRank(A^{-1}) ≤ min{p, q}.

**Proof.** Under the premises of the theorem we can write

$$
A = U_1 \otimes V_1 + U_2 \otimes V_2.
$$

Let us assume first that all the matrices U_1, V_1, U_2, V_2 are nonsingular. Moreover, assume that all the eigenvalues of U_1 U_2^{-1} and V_2 V_1^{-1} are simple. Then both matrices can be diagonalized by similarity transformations:

$$
U_1 U_2^{-1} = X \Lambda X^{-1}, \qquad \Lambda = \mathrm{diag}\{\lambda_1, \ldots, \lambda_p\},
$$

$$
V_2 V_1^{-1} = Y M Y^{-1}, \qquad M = \mathrm{diag}\{\mu_1, \ldots, \mu_q\}.
$$

Hence,

$$
A \, (U_2^{-1} \otimes V_1^{-1}) = (X \otimes Y) \, Z \, (X^{-1} \otimes Y^{-1}), \qquad \text{where } Z = \Lambda \otimes I + I \otimes M.
$$
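The factorization above, and the resulting bound on the tensor rank of A^{-1}, can be verified numerically. The sketch below (random factors with illustrative sizes; for generic real random matrices the eigenvalue assumptions hold almost surely, though the eigendecompositions come out complex) checks the identity and then checks that the reshaped inverse has rank min{p, q}:

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 3, 5
U1, U2 = rng.standard_normal((2, p, p))
V1, V2 = rng.standard_normal((2, q, q))
A = np.kron(U1, V1) + np.kron(U2, V2)

# Diagonalize U1 U2^{-1} and V2 V1^{-1} (generically the eigenvalues are simple).
lam, X = np.linalg.eig(U1 @ np.linalg.inv(U2))
mu, Y = np.linalg.eig(V2 @ np.linalg.inv(V1))
Z = np.kron(np.diag(lam), np.eye(q)) + np.kron(np.eye(p), np.diag(mu))

# A (U2^{-1} x V1^{-1}) = (X x Y) Z (X^{-1} x Y^{-1})
lhs = A @ np.kron(np.linalg.inv(U2), np.linalg.inv(V1))
rhs = np.kron(X, Y) @ Z @ np.kron(np.linalg.inv(X), np.linalg.inv(Y))
assert np.allclose(lhs, rhs)

# The reshaped inverse generically attains the bound min{p, q}.
Ainv = np.linalg.inv(A)
B = Ainv.reshape(p, q, p, q).transpose(0, 2, 1, 3).reshape(p * p, q * q)
assert np.linalg.matrix_rank(B) == min(p, q)
```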


In these circumstances, it is easy to deduce that tRank(A^{-1}) = tRank(Z^{-1}) = rank(reshape(Z^{-1})). To better understand the structure of reshape(Z^{-1}), take p = 2 and q = 3. Then

$$
Z^{-1} = \begin{bmatrix}
(\lambda_1+\mu_1)^{-1} & 0 & 0 & 0 & 0 & 0 \\
0 & (\lambda_1+\mu_2)^{-1} & 0 & 0 & 0 & 0 \\
0 & 0 & (\lambda_1+\mu_3)^{-1} & 0 & 0 & 0 \\
0 & 0 & 0 & (\lambda_2+\mu_1)^{-1} & 0 & 0 \\
0 & 0 & 0 & 0 & (\lambda_2+\mu_2)^{-1} & 0 \\
0 & 0 & 0 & 0 & 0 & (\lambda_2+\mu_3)^{-1}
\end{bmatrix},
$$

$$
\mathrm{reshape}(Z^{-1}) = \begin{bmatrix}
(\lambda_1+\mu_1)^{-1} & 0 & 0 & 0 & (\lambda_1+\mu_2)^{-1} & 0 & 0 & 0 & (\lambda_1+\mu_3)^{-1} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
(\lambda_2+\mu_1)^{-1} & 0 & 0 & 0 & (\lambda_2+\mu_2)^{-1} & 0 & 0 & 0 & (\lambda_2+\mu_3)^{-1}
\end{bmatrix}.
$$

Evidently, in this case, rank reshape(Z^{-1}) = 2 = min{p, q}.
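This worked example is easy to reproduce numerically; in the sketch below the eigenvalues are arbitrary illustrative choices (any values with all λ_i + µ_k nonzero would do):

```python
import numpy as np

lam = np.array([0.7, -1.3])       # eigenvalues lambda_i (illustrative)
mu = np.array([0.2, 1.1, 2.5])    # eigenvalues mu_k (illustrative)
p, q = len(lam), len(mu)

# Z = Lambda x I + I x M is diagonal, hence trivially invertible here.
Z = np.kron(np.diag(lam), np.eye(q)) + np.kron(np.eye(p), np.diag(mu))
Zinv = np.linalg.inv(Z)

# reshape(Z^{-1}) has only two nonzero rows, so its rank is min{p, q} = 2.
B = Zinv.reshape(p, q, p, q).transpose(0, 2, 1, 3).reshape(p * p, q * q)
assert np.linalg.matrix_rank(B) == 2
```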



However, for arbitrary values of p and q, the reshaped inverse to Z has a similar structure: all the entries are equal to zero except those forming a p × q submatrix H of the form

$$
H = \begin{bmatrix}
(\lambda_1+\mu_1)^{-1} & \cdots & (\lambda_1+\mu_q)^{-1} \\
\vdots & \ddots & \vdots \\
(\lambda_p+\mu_1)^{-1} & \cdots & (\lambda_p+\mu_q)^{-1}
\end{bmatrix}. \tag{5}
$$

Thus, we obtain tRank(A^{-1}) ≤ min{p, q} under certain restrictions imposed on U_1, V_1, U_2, V_2. If these restrictions are not fulfilled, then, given an arbitrary ε > 0, we can approximate the factors by matrices U_1(ε), V_1(ε), U_2(ε), V_2(ε) that satisfy the restrictions and lie in the ε-vicinity of the corresponding factors (say, in the spectral norm). Thus,

$$
A \approx A(\varepsilon) = U_1(\varepsilon) \otimes V_1(\varepsilon) + U_2(\varepsilon) \otimes V_2(\varepsilon),
$$

and for all sufficiently small ε > 0 we have tRank(Z^{-1}(ε)) = rank H(ε) ≤ min{p, q}. It is well known that the limit matrix of a sequence of matrices whose ranks do not exceed some bound cannot have rank greater than this bound. Hence, rank H ≤ min{p, q}. So we complete the proof by transition to the limit ε → 0. □



**Corollary 2.1.** Let A be a matrix of order n = pq and tRank(A) = 2. If A is nonsingular, then tRank(A^{-1}) ≤ √n.

Our proof highlights that the estimate of Theorem 2.1 is sharp. Indeed, any square submatrix of H is nonsingular. It follows that rank H = min{p, q}, so we have tRank(A^{-1}) = min{p, q} as long as the above restrictions on the factors hold true.
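The matrix H in (5) is a Cauchy matrix, whose square submatrices are nonsingular whenever all λ_i are distinct and all µ_j are distinct; a small numerical sketch with randomly drawn values (illustrative, not from the paper) shows it has full rank:

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 4, 7
lam = rng.uniform(0.1, 1.0, p)   # distinct lambda_i almost surely
mu = rng.uniform(0.1, 1.0, q)    # distinct mu_j almost surely

# Cauchy matrix H[i, j] = 1 / (lambda_i + mu_j); full rank min{p, q}.
H = 1.0 / (lam[:, None] + mu[None, :])
assert np.linalg.matrix_rank(H) == min(p, q)
```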


## 3. An extension to more factors

Assume that n = p_1 ⋯ p_m, and consider a tensor-product binomial with m factors (the m-adic case),

$$
A = B_1 \otimes B_2 \otimes \cdots \otimes B_m + C_1 \otimes C_2 \otimes \cdots \otimes C_m,
$$

where B_s and C_s are of order p_s, 1 ≤ s ≤ m. As previously, the minimal number of m-factor tensor products in representations of a matrix is called its tensor rank or, to stress the number of factors, its m-adic tensor rank. We will use the same notation, tRank(A).

If tRank(A) = 2, then Theorem 2.1 gives a sharp estimate on tRank(A^{-1}) in the case m = 2. By similar means, we can try to produce an estimate for m > 2. As before, assume additionally that the matrices B_s, C_s and B_s C_s^{-1} have only simple eigenvalues. Denote by D_s the diagonal eigenvalue matrix obtained from B_s C_s^{-1} by a similarity transformation. As is readily seen,

$$
\mathrm{tRank}(A^{-1}) = \mathrm{tRank}\big( (I + D_1 \otimes \cdots \otimes D_m)^{-1} \big).
$$

The diagonal matrix in the right-hand side can be viewed as an m-dimensional array (tensor) with mode sizes p_1, …, p_m. The tensor rank of such an array cannot exceed the minimum over s of p_1 ⋯ p_m / p_s (see [2]). Since it is equal to tRank(A^{-1}), we have established the following result.

**Theorem 3.1.** Let tensor ranks be defined for tensor-product representations with m factors of orders p_1, …, p_m, and assume that A is a nonsingular matrix of order n = p_1 ⋯ p_m with tRank(A) = 2. Then, for any ε > 0, there exists a nonsingular matrix A_ε with the following properties:

$$
\|A - A_\varepsilon\|_2 \le \varepsilon, \qquad \mathrm{tRank}(A_\varepsilon) = 2, \qquad \mathrm{tRank}(A_\varepsilon^{-1}) \le \min_{1 \le s \le m} n/p_s.
$$

In all cases,

$$
\mathrm{tRank}(A_\varepsilon^{-1}) \le n^{(m-1)/m}.
$$
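The reduction to I + D_1 ⊗ ⋯ ⊗ D_m behind this result can be checked numerically for m = 3; the sketch below (random factors with illustrative orders, eigendecompositions generically complex) verifies A(C_1^{-1} ⊗ C_2^{-1} ⊗ C_3^{-1}) = (⊗ X_s)(I + ⊗ D_s)(⊗ X_s^{-1}):

```python
import numpy as np
from functools import reduce

def kron_all(mats):
    """Kronecker product of a list of matrices, left to right."""
    return reduce(np.kron, mats)

rng = np.random.default_rng(4)
ps = [2, 3, 2]  # orders p_1, p_2, p_3 (illustrative)
Bs = [rng.standard_normal((p, p)) for p in ps]
Cs = [rng.standard_normal((p, p)) for p in ps]
A = kron_all(Bs) + kron_all(Cs)

# Diagonalize each B_s C_s^{-1} = X_s D_s X_s^{-1}.
eigs = [np.linalg.eig(B @ np.linalg.inv(C)) for B, C in zip(Bs, Cs)]
Ds = [np.diag(w) for w, _ in eigs]
Xs = [X for _, X in eigs]

n = int(np.prod(ps))
lhs = A @ kron_all([np.linalg.inv(C) for C in Cs])
rhs = kron_all(Xs) @ (np.eye(n) + kron_all(Ds)) @ kron_all([np.linalg.inv(X) for X in Xs])
assert np.allclose(lhs, rhs)
```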

It is well known that the m-adic case with m ≥ 3 is essentially different from the case m = 2 [3,2,6]. For example, when m ≥ 3, a convergent sequence of matrices, each of tensor rank r, may have a limit matrix whose tensor rank is greater than r. As a corollary, in the case m ≥ 3, when pursuing the same scheme of proof as for m = 2, we are generally not allowed to pass to the limit.

## 4. Discussion

We should emphasize that the estimate

$$
\mathrm{tRank}(A^{-1}) \le n^{(m-1)/m} \tag{6}
$$

is valid in the case m = 2 and still in question for arbitrary nonsingular m-adic binomials with m ≥ 3. In the latter case, we maintain the same estimate only under some additional assumptions (which are fulfilled, anyway, with probability 1) or only for some approximations to the original matrix. A principal observation is that the trivial upper bound tRank(A^{-1}) ≤ n is never attained, at least in the case m = 2.

Some questions, naturally in addition to the ones considered, seem to be open. For instance, we are not aware of any nontrivial estimate on the tensor rank of the inverse of a matrix of the form A = U_1 ⊗ V_1 + U_2 ⊗ V_2 + U_3 ⊗ V_3.

In a certain respect, our estimates may look quite pessimistic even in the case m = 2: with O(n) parameters entirely defining A, we need O(n^{3/2}) parameters in a tensor representation of A^{-1}. A major remark on this account is that tensor approximations to A^{-1} may exist with far fewer parameters: for many interesting classes of matrices the number of ε-approximation parameters turns out to behave as O(n log^α n log^β ε^{-1}) with some positive constants α and β [8,9]. Some refined estimates are available for more special and practically important cases [10]. Moreover, various classes of structured matrices can benefit from a complementary tensor structure: a good justifying example is given by two-level Toeplitz matrices [11–13].

The same type of remark concerns the Hilbert matrices H of the form (5) that appeared in our proof of Theorem 2.1. Let p = q and 0 < δ ≤ λ_i, µ_j ≤ 1. In this case H can be approximated by a matrix of rank r = r(ε, δ) with entry-wise accuracy ε > 0 so that r = O(log ε^{-1} log δ^{-1}), and this estimate on r is asymptotically sharp [14].

Approximations of low tensor rank provide a ''complexity-breaking'' tool. In applications, we are especially interested in rank estimates for approximations rather than for the exact data. This makes the difference between the cases m = 2 and m ≥ 3 less important, at least from a practical point of view.
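As an illustration of the last remark, the singular values of Hilbert-type matrices decay exponentially, so the rank of the best ε-approximation grows only logarithmically in ε^{-1}. A sketch for the classical Hilbert matrix 1/(i + j − 1) (the size n = 200, threshold, and rank bound are illustrative choices, not figures from the paper):

```python
import numpy as np

n = 200
i = np.arange(1, n + 1)
H = 1.0 / (i[:, None] + i[None, :] - 1.0)  # classical Hilbert matrix

# Rank of the best 2-norm eps-approximation = number of singular values
# above eps * ||H||_2; it grows like log(1/eps), not like n.
s = np.linalg.svd(H, compute_uv=False)
eps = 1e-8
r = int(np.sum(s > eps * s[0]))
assert r < 25  # far below n = 200
```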


## Acknowledgements

The author was supported by RFBR grant 08-01-00115, RFBR/DFG grant 09-01-91332, and the Priority Research Programme of the Department of Mathematical Sciences of the Russian Academy of Sciences.

## References

[1] D. Bini, V. Pan, Matrix and Polynomial Computations, Birkhäuser, Boston, 1994.
[2] J.B. Kruskal, Three-way arrays: rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics, Linear Algebra Appl. 18 (1977) 95–138.
[3] L. De Lathauwer, B. De Moor, J. Vandewalle, A multilinear singular value decomposition, SIAM J. Matrix Anal. Appl. 21 (4) (2000) 1253–1278.
[4] W. Hackbusch, B.N. Khoromskij, E.E. Tyrtyshnikov, Hierarchical Kronecker tensor-product approximations, J. Numer. Math. 13 (2005) 119–156.
[5] T.G. Kolda, Orthogonal tensor decompositions, SIAM J. Matrix Anal. Appl. 23 (1) (2001) 243–255.
[6] V. de Silva, L.-H. Lim, Tensor rank and the ill-posedness of the best low-rank approximation problem, SIAM J. Matrix Anal. Appl. 30 (3) (2008) 1084–1127.
[7] C.F. Van Loan, N.P. Pitsianis, Approximation with Kronecker products, in: NATO Adv. Sci. Inst. Ser. E Appl. Sci., vol. 232, Kluwer, Dordrecht, 1993, pp. 293–314.
[8] E.E. Tyrtyshnikov, Tensor approximations of matrices generated by asymptotically smooth functions, Sb. Math. 194 (5–6) (2003) 941–954.
[9] E.E. Tyrtyshnikov, Kronecker-product approximations for some function-related matrices, Linear Algebra Appl. 379 (2004) 423–437.
[10] W. Hackbusch, B.N. Khoromskij, Low-rank Kronecker-product approximation to multi-dimensional nonlocal operators. Part I. Separable approximation of multi-variate functions, Computing 76 (2006) 177–202.
[11] J.M. Ford, E.E. Tyrtyshnikov, Combining Kronecker product approximation with discrete wavelet transforms to solve dense, function-related systems, SIAM J. Sci. Comput. 25 (3) (2003) 961–981.
[12] J. Kamm, J.G. Nagy, Optimal Kronecker product approximations of block Toeplitz matrices, SIAM J. Matrix Anal. Appl. 22 (1) (2000) 155–172.
[13] V. Olshevsky, I. Oseledets, E. Tyrtyshnikov, Tensor properties of multilevel Toeplitz and related matrices, Linear Algebra Appl. 412 (2006) 1–21.
[14] I.V. Oseledets, Lower estimates for separable approximations of the Hilbert kernel, Sb. Math. 198 (3) (2006) 137–144.