Expectations for Nonreversible Markov Chains


JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS 220, 585–596 (1998)
ARTICLE NO. AY975850

I. H. Dinwoodie
Department of Mathematics, Tulane University, New Orleans, Louisiana 70118

Submitted by N. H. Bingham

Received September 27, 1996

Bounds are given for an irreducible Markov chain on the probability that the time average of a functional on the state space exceeds its stationary expectation, without assuming reversibility. The bounds are in terms of the singular values of the discrete generator. © 1998 Academic Press

1. INTRODUCTION

We are interested in formulating concrete inequalities on large deviation probabilities for nonreversible Markov chains. Recent work on nonreversible chains has concentrated on convergence of the law of the coordinate process $X_n$ to its stationary distribution (Fill [10], Meyn and Tweedie [16], and Diaconis and Saloff-Coste [6]). We are concerned here with time averages, which have applications in estimation problems in Markov Monte Carlo methods (see Anantharam [1], Smith and Roberts [19], and Besag and Green [2]). For these applications, one needs nonasymptotic bounds involving computable quantities in order to evaluate the performance of the method (see Motwani and Raghavan [17]).

The state space of the chain will be a compact separable metric space $S$ with Borel field $\mathcal{B}$. Let $P$ be a Markov transition kernel on $S$, and let $\mu$ be the stationary distribution. Let $f : S \to [0, 1]$ be continuous and let $\mu_f = \int f \, d\mu$. We will give bounds uniform in the starting point $x$ on the probability of the deviation $\{\int f \, dL_n - \mu_f \ge \varepsilon\}$, where $\int f \, dL_n$ is the time average $(f(X_1) + \cdots + f(X_n))/n$ of the Markov chain $x = X_1, X_2, \ldots$ in $S$. The probability of such deviations tends to zero by the weak law of large numbers, assuming that $P$ is irreducible.

The papers by Dinwoodie [8] and Gillman [12] were concerned with random walks on undirected graphs, or reversible chains, and the bounds involved only the spectrum of the transition matrix $P$, which can often be estimated. Techniques for estimating the second eigenvalue can be found in Diaconis and Stroock [7], Diaconis and Saloff-Coste [5], and Lawler and Sokal [15]. We discuss this problem for certain examples in Section 4.

The essential quantity in our inequalities is the smallest singular value $\beta$ above zero for the generator $\Delta = P - I$. If $P$ is reversible, then $\beta$ is the gap between 1 and the second largest eigenvalue. Our results say roughly that $P_x\{\int f \, dL_n - \mu_f \ge \varepsilon\}$ decays at least as fast as $e^{-n\beta\varepsilon^2}$ for small $\varepsilon > 0$, so the rough conclusion is that on the order of $1/(\beta\varepsilon^2)$ steps are sufficient for accuracy $\varepsilon > 0$. A minorization condition (3.2) on the kernel $P$ gives a computable bound uniform in the starting point in Corollary 3.1. When specialized to the reversible case, we have nearly doubled the exponent compared to Dinwoodie [8] to get the coefficient of 1 on $\beta\varepsilon^2$. The constants in front of the decay term are not directly comparable, since the present technique is more general. The technique we use for the necessary perturbation theory in Lemma 2.1 is adapted from Rellich [18], but is applied to a nonnegative operator which is not necessarily self-adjoint.

Other bounds of a different nature can be found in the literature. In the i.i.d. case, a bound on the probability is $\exp(-n\lambda(\varepsilon))$, where $\lambda$ is the convex conjugate of the logarithm of the moment generating function for the random variable $f(X_i) - \mu_f$. The method of types (see Dembo and Zeitouni [4, p. 12]) can also be used to get bounds for Markov chains on a finite state space (see Csiszár, Cover, and Choi [3]) in terms of the large deviation rate function. Our bounds hold on a compact space, and are not formulated in terms of the large deviation rate function but rather minimal spectral properties of the transition kernel. This means that the traditional large deviation theorems do not follow from our bounds, but the bounds can be used in practical problems where transition kernels are quite complicated and only estimates of the second largest eigenvalue are possible.

Let $P$ be a Markov kernel on the compact state space $S$. We assume the condition of irreducibility on $P$ that for each open set $U \subset S$ and each $x \in S$, there exists $k \ge 1$ such that $P^k(x, U) > 0$. We assume that the probability measure $\mu$ on $S$ is invariant for our Markov kernel $P$, which means that $\int \mu(dx) \, Pv(x) = \int \mu(dx) \, v(x)$ for bounded measurable $v$. We also assume that $P : L^2(\mu) \to C(S)$ is a completely continuous operator. The irreducibility implies that for $v \in C(S)$ with $v \ge 0$ and not identically zero, there exist $k \ge 1$ and $\delta > 0$ such that

$$\sum_{i=1}^{k} P^i(v) \ge \delta > 0. \tag{1.1}$$

From (1.1) a simple argument shows that the eigenspace corresponding to the largest eigenvalue 1 of the restricted operator $P : C(S) \to C(S)$ has dimension one. Thus $\dim \ker(I - P) = 1$, and standard Fredholm theory says that $\dim \ker(I - P^T) = 1$, where $P^T$ is the adjoint operator on measures. Thus there is a unique invariant measure, which is $\mu$.

The relevant Perron–Frobenius theory for the operator $P$ on $C(S)$ can be found in Krein and Rutman [14, Sect. 6], which generalizes the matrix theory in Gantmacher [11, Chap. 13]. For our purposes, the following points are relevant. Our operators and their perturbations may have a finite number of eigenvalues of the same magnitude as the spectral radius. However, one of these eigenvalues is real and positive, and is simple in the sense that it corresponds to the subspace of $C(S)$ spanned by a single eigenfunction, which is positive and continuous. Basic continuity properties of the spectrum are established in Kato [13].

Define an inner product on $C(S)$ by $\langle v, w \rangle = \int v w \, d\mu$. Let $1^\perp$ denote the subspace of functions in $L^2(\mu)$ which are orthogonal to 1 for $\langle \cdot, \cdot \rangle$. The norm notation $\|\cdot\|$ will be for this inner product: $\|v\|^2 = \langle v, v \rangle$. Let $\varphi_f = f - \mu_f \in 1^\perp \cap C(S)$.

Now $\Delta = P - I : 1^\perp \cap C(S) \to 1^\perp \cap C(S)$, and is both injective and surjective. First, $\Delta$ maps $1^\perp$ into itself, since if $v \in 1^\perp$, then $\langle 1, \Delta v \rangle = \langle 1, Pv \rangle - \langle 1, v \rangle = 0$, where we use the fact that $\mu$ is invariant for $P$. If it were not injective, there would be two linearly independent eigenfunctions corresponding to the value 1. To see that $\Delta$ is surjective, apply the standard Fredholm theory to $-\Delta$ defined on $C(S)$. The dual space $M$ is the space of finite signed Borel measures on $S$ which act by integration, and the adjoint $P^T : M \to M$ is given by $P^T\nu(A) = \int \nu(dx) \, P(x, A)$. Since $\ker(-\Delta)$ is spanned by 1, $\dim \ker(-\Delta) = 1 = \dim \ker(-\Delta^T)$ and so $\ker(\Delta^T)$ is spanned by the invariant measure $\mu$. Thus the range of $-\Delta$ defined on $C(S)$ is $\{v \in C(S) : \int v \, d\mu = 0\} = 1^\perp \cap C(S)$. But if $\Delta(v) = w \in 1^\perp$ for $v \in C(S)$, then $v - \langle v, 1 \rangle 1 \in 1^\perp \cap C(S)$ and $\Delta(v - \langle v, 1 \rangle 1) = \Delta(v) = w$. So there is in fact an element in $1^\perp \cap C(S)$ whose image is $w$, and $\Delta$ is surjective.

2. PERTURBATION THEORY

The essential quantity $\beta$ in the inequalities is the first singular value of $\Delta$. More precisely, define $\beta \ge 0$ by

$$\beta^2 = \inf\{\langle \Delta v, \Delta v \rangle : v \in 1^\perp \cap C(S), \ \|v\| = 1\}. \tag{2.1}$$
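For a finite state space, the quantity in (2.1) can be computed numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it assumes NumPy is available, the helper names `stationary_distribution` and `first_singular_value` are hypothetical, and the two transition matrices are made-up examples. It represents the generator $P - I$ in an orthonormal basis of $L^2(\mu)$ and reads off the second-smallest singular value, the smallest one being the zero that belongs to the constants.

```python
import numpy as np

def stationary_distribution(P):
    """Solve mu P = mu with sum(mu) = 1 by least squares (P assumed irreducible)."""
    n = P.shape[0]
    A = np.vstack([(P - np.eye(n)).T, np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

def first_singular_value(P):
    """beta of (2.1): smallest singular value of P - I on 1-perp in L2(mu)."""
    mu = stationary_distribution(P)
    s = np.sqrt(mu)
    # T is the generator in an orthonormal basis of L2(mu): T = M^{1/2} (P - I) M^{-1/2}.
    T = s[:, None] * (P - np.eye(len(mu))) / s[None, :]
    sv = np.linalg.svd(T, compute_uv=False)  # sorted in decreasing order
    # sv[-1] is the zero singular value of the constants; sv[-2] is beta.
    return sv[-2]

# Made-up example 1: lazy clockwise walk on Z/3Z (nonreversible).
P_cycle = np.array([[0.5, 0.5, 0.0],
                    [0.0, 0.5, 0.5],
                    [0.5, 0.0, 0.5]])
# Made-up example 2: reversible two-state chain with second eigenvalue 0.7.
P_two = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

beta_cycle = first_singular_value(P_cycle)
beta_two = first_singular_value(P_two)
print(beta_cycle, beta_two)
```

In the reversible two-state example the result agrees with the spectral gap, as the text states for reversible chains; in the cyclic example it equals the modulus of one minus the nontrivial eigenvalues, because that matrix is normal.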

Let $\rho = 1/\beta$. Then for any $w, v \in 1^\perp \cap C(S)$, $\langle \Delta^{-1}v, \Delta^{-1}w \rangle \le \rho^2 \|v\| \, \|w\|$. Let $\Delta^*$ be the adjoint for $\Delta$ on $1^\perp$ for the inner product $\langle \cdot, \cdot \rangle$. If $S$ is finite, then $P^*(x, y) = \mu(y)P(y, x)/\mu(x)$ gives the reversed chain. Then $\beta$ satisfies

$$\beta^2 \ge \inf\{\langle v, \Delta^*\Delta(v) \rangle : v \in 1^\perp, \ \|v\| = 1\},$$

which shows that essentially $\beta$ is the square root of the smallest eigenvalue of $\Delta^*\Delta$ restricted to $1^\perp$, or the first singular value of $\Delta$, which would be the first eigenvalue above zero for $\Delta^*\Delta$ defined on $L^2(\mu)$. Clearly, $\beta \in [0, 2]$, but in practical examples $\beta$ will be small. If $P$ is reversible, then $P = P^*$ and $\beta = 1 - \lambda_2$, where $\lambda_2 < 1$ is the second largest eigenvalue in the real spectrum of $P$ denoted $1 > \lambda_2 > \cdots > \lambda_s \ge -1$.

Let $D$ be the operator on $C(S)$ given by $D(h)(x) = \varphi_f(x)h(x)$. Let $A_t$ for $t \ge 0$ be the perturbation of $P$ given by

$$A_t = (I + tD + t^2D^2)P,$$

which is a completely continuous operator on the Banach space $C(S)$ with the uniform metric.

LEMMA 2.1. Suppose $\beta \le 1$. For $t \in [0, \beta/8]$, $A_t$ has largest eigenvalue $\alpha_t = 1 + \alpha_2t^2 + \alpha_3t^3 + \cdots$ and corresponding eigenvector $\omega_t = 1 + t\psi_1 + t^2\psi_2 + \cdots \in C(S)$, where

$$\alpha_n = \langle 1, D^2P\psi_{n-2} \rangle + \langle 1, DP\psi_{n-1} \rangle \tag{2.2}$$

for $n \ge 2$, $\psi_0 = 1$, $\psi_1 = -\Delta^{-1}(\varphi_f)$, and $\psi_n \in 1^\perp$ is given by

$$\psi_n = \sum_{i=2}^{n-1} \alpha_i \Delta^{-1}(\psi_{n-i}) + \Delta^{-1}(\alpha_n 1 - D^2P\psi_{n-2} - DP\psi_{n-1}).$$

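Furthermore, $|1 + \alpha_2t^2 - \alpha_t| \le 2^5\rho^2t^3$, $\|1 - \omega_t\| \le 2^3\rho t$, and $\langle 1, \omega_t \rangle = 1$.

Remark. It is easily computed that $\alpha_2 = \langle \varphi_f, -\Delta^{-1}(\varphi_f) \rangle$.

Proof. Consider the sequence $(1 = \|\psi_0\|, \|\psi_1\|, \|\psi_2\|, \ldots)$. From (2.2), with $n \ge 2$,

$$|\alpha_n| \le \tfrac{1}{2}\left(\|\psi_{n-1}\| + \|\psi_{n-2}\|\right),$$

$$\begin{aligned}
\|\psi_n\| &\le \Big\| \sum_{i=2}^{n-1} \alpha_i \Delta^{-1}(\psi_{n-i}) \Big\| + \big\| \Delta^{-1}(\alpha_n 1 - D^2P\psi_{n-2} - DP\psi_{n-1}) \big\| \\
&= \Big\langle \sum_{i=2}^{n-1} \alpha_i \Delta^{-1}(\psi_{n-i}), \ \sum_{j=2}^{n-1} \alpha_j \Delta^{-1}(\psi_{n-j}) \Big\rangle^{1/2} + \big\| \Delta^{-1}(\Pi_{1^\perp}) \big\|,
\end{aligned}$$

where $\Pi_{1^\perp}$ indicates the projection of $D^2P\psi_{n-2} + DP\psi_{n-1}$ onto the space orthogonal to 1,

$$\begin{aligned}
&= \Big( \sum_{i=2}^{n-1} \sum_{j=2}^{n-1} \alpha_i \alpha_j \langle \Delta^{-1}\psi_{n-i}, \Delta^{-1}(\psi_{n-j}) \rangle \Big)^{1/2} + \langle \Delta^{-1}(\Pi_{1^\perp}), \Delta^{-1}(\Pi_{1^\perp}) \rangle^{1/2} \\
&\le \rho \sum_{i=2}^{n-1} |\alpha_i| \, \|\psi_{n-i}\| + \rho \|\Pi_{1^\perp}\| \\
&\le \rho \Big( \sum_{i=2}^{n-1} |\alpha_i| \, \|\psi_{n-i}\| + \|\psi_{n-2}\| + \|\psi_{n-1}\| \Big),
\end{aligned}$$

since $\|\Pi_{1^\perp}\| \le \|D^2P\psi_{n-2} + DP\psi_{n-1}\| \le \|\psi_{n-2}\| + \|\psi_{n-1}\|$,

$$\begin{aligned}
&\le \rho \Big( \tfrac{1}{2} \sum_{i=2}^{n-1} \left(\|\psi_{i-1}\| + \|\psi_{i-2}\|\right) \|\psi_{n-i}\| + \|\psi_{n-2}\| + \|\psi_{n-1}\| \Big) \\
&= \rho \Big( \tfrac{1}{2} \sum_{i=1}^{n-1} \|\psi_i\| \, \|\psi_{n-1-i}\| + \tfrac{1}{2} \sum_{i=0}^{n-2} \|\psi_i\| \, \|\psi_{n-2-i}\| + \tfrac{1}{2}\|\psi_{n-2}\| + \tfrac{1}{2}\|\psi_{n-1}\| \Big) \\
&= \tfrac{1}{2} \rho \Big( \sum_{i=0}^{n-1} \|\psi_i\| \, \|\psi_{n-1-i}\| + \sum_{i=0}^{n-2} \|\psi_i\| \, \|\psi_{n-2-i}\| + \|\psi_{n-2}\| \Big).
\end{aligned}$$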

Let $d_0 = 1, d_1, d_2, \ldots$ be the increasing ($\rho \ge 1$) sequence of numbers defined recursively by

$$d_n = \rho \sum_{j=0}^{n-1} d_j d_{n-1-j}, \qquad n \ge 1.$$

Then $\|\psi_n\| \le d_n$ by induction, and the sequence $\{d_n\}$ has generating function $g(x) = d_0 + d_1x + \cdots$ which satisfies $g(x) = 1 + \rho x g^2(x)$. Then

$$g(x) = \frac{1 - \sqrt{1 - 4\rho x}}{2\rho x} = 2 \sum_{k=1}^{\infty} \binom{1/2}{k} (-1)^{k+1} (4\rho)^{k-1} x^{k-1} = 2 \sum_{i=0}^{\infty} \binom{1/2}{i+1} (-1)^i (4\rho)^i x^i.$$

Thus $d_n = 2\binom{1/2}{n+1}(-1)^n(4\rho)^n = 2\left|\binom{1/2}{n+1}\right|(4\rho)^n$ and $g$ has radius of convergence $1/(4\rho)$, by the ratio test. In particular, this means that $\omega_t \in L^2(\mu)$ for $t < 1/(4\rho)$ and

$$\|1 - \omega_t\| \le \sum_{n=1}^{\infty} t^n d_n \le \sum_{n=1}^{\infty} (4\rho)^n t^n = 4\rho t (1 - 4\rho t)^{-1} \le 8\rho t$$

if $t \le 1/(8\rho) = \beta/8$. Since $|\alpha_n| \le (\|\psi_{n-1}\| + \|\psi_{n-2}\|)/2 \le (d_{n-1} + d_{n-2})/2 \le (4\rho)^{n-1}$, it follows that

$$|1 + \alpha_2t^2 - \alpha_t| \le \sum_{n=3}^{\infty} |\alpha_n| t^n \le \sum_{n=3}^{\infty} (4\rho)^{n-1} t^n = (4\rho)^2 t^3 (1 - 4\rho t)^{-1} \le 2^5\rho^2t^3$$

for $t \le \beta/8$.

It remains to show that for $t \in [0, \beta/8]$, the convergent series $\alpha_t$ and $\omega_t$ are the desired eigenvalue and eigenvector. Now

$$\begin{aligned}
A_t(\omega_t) &= (I + tD + t^2D^2)P(1 + t\psi_1 + t^2\psi_2 + \cdots) \\
&= 1 + t(D1 + P\psi_1) + t^2(D^21 + DP\psi_1 + P\psi_2) + t^3(D^2P\psi_1 + DP\psi_2 + P\psi_3) + \cdots \\
&= 1 + t(D1 + \psi_1 + \Delta\psi_1) + t^2(D^21 + DP\psi_1 + \Delta\psi_2 + \psi_2) + \cdots \\
&\qquad + t^n(D^2P\psi_{n-2} + DP\psi_{n-1} + \Delta\psi_n + \psi_n) + \cdots \\
&= 1 + t\psi_1 + t^2(D^21 + DP\psi_1 + \alpha_2 1 - D^21 - DP\psi_1 + \psi_2) + \cdots \\
&\qquad + t^n\Big(D^2P\psi_{n-2} + DP\psi_{n-1} + \sum_{i=2}^{n-1} \alpha_i\psi_{n-i} + \alpha_n 1 - D^2P\psi_{n-2} - DP\psi_{n-1} + \psi_n\Big) + \cdots \\
&= 1 + t\psi_1 + t^2(\alpha_2 1 + \psi_2) + \cdots + t^n\Big(\alpha_n 1 + \sum_{i=2}^{n-1} \alpha_i\psi_{n-i} + \psi_n\Big) + \cdots \\
&= 1 + t\psi_1 + t^2(\psi_2 + \alpha_1\psi_1 + \alpha_2 1) + t^3(\psi_3 + \alpha_1\psi_2 + \alpha_2\psi_1 + \alpha_3 1) + \cdots \\
&= (1 + \alpha_1t + \alpha_2t^2 + \cdots)(1 + t\psi_1 + t^2\psi_2 + \cdots) \\
&= \alpha_t\omega_t,
\end{aligned}$$

where $\alpha_1 = 0$.

This proves that $\alpha_t$ is an eigenvalue for $A_t$ corresponding to the eigenfunction $\omega_t \in L^2(\mu)$. Thus $\omega_t \in C(S)$ since $P$ maps $L^2(\mu)$ into $C(S)$. Clearly $\langle 1, \omega_t \rangle = 1$.

Finally we show that $\alpha_t$ is in fact the spectral radius $\rho_t = \lim_n (\|A_t^n(1)\|_\infty)^{1/n}$ for the operator $A_t$ on $C(S)$ when $t \in [0, \beta/8]$. Recall that the spectral radius is a positive eigenvalue for the compact operator $A_t : C(S) \to C(S)$, and corresponds to a one-dimensional eigenspace spanned by a positive continuous function. Let $t_0 = \sup\{t \in [0, \beta/8] : \rho_t = \alpha_t\}$. Clearly $t_0 > 0$, since $\alpha_0 = \rho_0 = 1$, and by the continuity of eigenvalues with respect to the parameter $t$, a small neighborhood of 1 will contain exactly one eigenvalue for sufficiently small $t > 0$. Then this neighborhood will contain the perturbed eigenvalue $\alpha_t$ and the spectral radius $\rho_t$, so they must be the same. Also, $\alpha_{t_0} = \rho_{t_0}$ by continuity, since $\alpha_t = \rho_t$ for $t \in [0, t_0)$. But then there is a neighborhood about $t_0$ in which $\alpha_t$ is the largest eigenvalue, again by continuity and the fact that the eigenspace for the largest eigenvalue has dimension 1. Hence $t_0$ cannot be less than $\beta/8$.

LEMMA 2.2. Suppose $\beta \le 1$ and let $\alpha_t$ be the spectral radius of the operator $A_t$. For $t \in [0, \beta/8]$,

$$\alpha_t \le 1 + (\rho/4)t^2 + (2^5\rho^2)t^3.$$

Proof. By the remark following Lemma 2.1, $\alpha_2 = -\langle \varphi_f, \Delta^{-1}(\varphi_f) \rangle$. It then follows that

$$\alpha_2 = \langle \varphi_f, -\Delta^{-1}(\varphi_f) \rangle \le \|\varphi_f\| \, \|\Delta^{-1}(\varphi_f)\| \le \|\varphi_f\|^2 \rho,$$

by the variational characterization of the singular value $\rho$. But $\|\varphi_f\|^2 \le 1/4$, and hence $\alpha_2 \le \rho/4$. Now for $t \in [0, \beta/8]$,

$$\alpha_t \le 1 + \alpha_2t^2 + 2^5t^3\rho^2 \le 1 + t^2\rho/4 + t^3 2^5\rho^2,$$

from Lemma 2.1.
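The two lemmas can be sanity-checked numerically on a small finite chain. The following is a minimal sketch, not the author's computation: it assumes NumPy, uses a made-up lazy clockwise walk on Z/3Z with uniform stationary law, takes $f$ to be the indicator of one state, and introduces the hypothetical helper `spectral_radius_A`. It compares the largest eigenvalue of $(I + tD + t^2D^2)P$ with the second-order expansion of Lemma 2.1 and with the upper bound of Lemma 2.2 on a grid of $t$ up to one eighth of the first singular value.

```python
import numpy as np

# Made-up example: lazy clockwise walk on Z/3Z; the stationary law is uniform.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
mu = np.full(3, 1.0 / 3.0)
f = np.array([1.0, 0.0, 0.0])        # indicator of state 0, values in [0, 1]
phi = f - mu @ f                     # centered function f - mu_f
Delta = P - np.eye(3)                # generator P - I

# mu is uniform, so beta is the second-smallest ordinary singular value of Delta.
beta = np.linalg.svd(Delta, compute_uv=False)[-2]
rho = 1.0 / beta

# psi1 solves Delta psi1 = -phi, then is centered to be orthogonal to constants.
psi1, *_ = np.linalg.lstsq(Delta, -phi, rcond=None)
psi1 = psi1 - mu @ psi1
alpha2 = mu @ (phi * psi1)           # second-order coefficient from the Remark

def spectral_radius_A(t):
    """Largest |eigenvalue| of A_t = (I + tD + t^2 D^2) P with D = diag(phi)."""
    D = np.diag(phi)
    A = (np.eye(3) + t * D + t**2 * (D @ D)) @ P
    return np.max(np.abs(np.linalg.eigvals(A)))

checks = []
for t in np.linspace(0.0, beta / 8.0, 6):
    alpha_t = spectral_radius_A(t)
    tol = 1e-12                      # slack for floating-point rounding
    ok21 = abs(1 + alpha2 * t**2 - alpha_t) <= 2**5 * rho**2 * t**3 + tol
    ok22 = alpha_t <= 1 + (rho / 4) * t**2 + 2**5 * rho**2 * t**3 + tol
    checks.append((t, alpha_t, ok21, ok22))

for t, alpha_t, ok21, ok22 in checks:
    print(f"t={t:.4f}  alpha_t={alpha_t:.8f}  Lemma 2.1: {ok21}  Lemma 2.2: {ok22}")
```

A check of this kind only illustrates the statements on one example; it says nothing about the sharpness of the constants.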

3. INEQUALITIES

Let $x = X_1, X_2, \ldots$ be our Markov chain in $S$, with transition kernel $P$ and invariant distribution $\mu$. In this section we use the technical results of Section 2 to obtain probability inequalities on the time average. In the remaining part of the paper, let $\varepsilon > 0$ and let $t = 2\varepsilon\beta$, and assume that $\beta \le 1$. Then Lemma 2.2 implies that when $\varepsilon \le 1/16$,

$$\alpha_t \le 1 + \varepsilon^2\beta + \varepsilon^3 2^8\rho^2\beta^3 \le e^{\varepsilon^2\beta + \varepsilon^3 2^8\beta}, \qquad \|\omega_t - 1\| \le 16\varepsilon. \tag{3.1}$$
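As a worked check of the constants in (3.1), substitute $t = 2\varepsilon\beta$ and $\rho = 1/\beta$ into the bounds of Lemmas 2.1 and 2.2. The condition $\varepsilon \le 1/16$ is exactly $t \le \beta/8$, and

$$\begin{aligned}
\frac{\rho}{4}\,t^2 &= \frac{1}{4\beta}\cdot 4\varepsilon^2\beta^2 = \varepsilon^2\beta, \\
2^5\rho^2t^3 &= 2^5\rho^2 \cdot 2^3\varepsilon^3\beta^3 = 2^8\varepsilon^3\rho^2\beta^3 = 2^8\varepsilon^3\beta, \\
2^3\rho t &= 2^3 \cdot \frac{1}{\beta} \cdot 2\varepsilon\beta = 16\varepsilon.
\end{aligned}$$

The exponential form then follows from $1 + u \le e^u$ applied to $u = \varepsilon^2\beta + 2^8\varepsilon^3\beta$.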

THEOREM 3.1. Let $\beta$ be defined at (2.1). For $\varepsilon < 2^{-8}$,

$$\inf_x P_x\Big\{ \int f \, dL_n - \mu_f \ge \varepsilon \Big\} \le e^{-n\beta\varepsilon^2(1 - \varepsilon 2^8)}.$$

Proof. Let $D_t$ be the operator on $C(S)$ given by $D_t(h)(x) = e^{t\varphi_f(x)}h(x)$, and let $P_t = D_tP$. By the Markov inequality and an induction argument,

$$P_x\Big\{ \int f \, dL_n - \mu_f \ge \varepsilon \Big\} \le e^{-nt\varepsilon} E_x \exp\Big( t \sum_i \varphi_f(X_i) \Big) = e^{-nt\varepsilon} \delta_x P_t^n(1) \le e^{-nt\varepsilon} \delta_x A_t^n(1),$$

where for the last inequality we use $D_t(h) \le (I + tD + t^2D^2)(h)$ for $h \ge 0$ when $t \le 1$. Since $\inf_x A_t^n(1) = \min_x A_t^n(1) = A_t^n(1)(x_{n,t})$ is the spectral radius for the operator $B$ given by $B(h)(x) = A_t^n(h)(x_{n,t})$, it is no greater than the spectral radius $\alpha_t^n$ of $A_t^n$. Hence, using (3.1),

$$\begin{aligned}
\inf_x P_x\Big\{ \int f \, dL_n - \mu_f \ge \varepsilon \Big\} &\le e^{-nt\varepsilon} \inf_x \delta_x A_t^n(1) \\
&\le e^{-nt\varepsilon} \alpha_t^n \\
&\le \exp(-n2\beta\varepsilon^2) \exp(n\beta\varepsilon^2 + n\varepsilon^3 2^8\beta) \\
&= \exp(-n\beta\varepsilon^2(1 - \varepsilon 2^8)).
\end{aligned}$$

Let an integer $k \ge 1$ and $c > 0$ satisfy

$$\sum_{i=1}^{k} P^i(x, \cdot) \ge c \sum_{i=1}^{k} P^i(y, \cdot), \qquad x, y \in S. \tag{3.2}$$
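On a finite state space, condition (3.2) can be checked entry by entry, and the largest admissible constant for a given $k$ can be computed directly. The sketch below is illustrative only: it assumes NumPy, the function name `minorization_constant` is hypothetical, and the two matrices are made-up examples. For each target state it compares the smallest and largest entries in the corresponding column of $P + P^2 + \cdots + P^k$.

```python
import numpy as np

def minorization_constant(P, k):
    """Largest c with sum_{i<=k} P^i(x, .) >= c * sum_{i<=k} P^i(y, .) for all x, y.

    Returns 0.0 when the condition fails for this k.
    """
    n = P.shape[0]
    S = np.zeros((n, n))
    Pi = np.eye(n)
    for _ in range(k):
        Pi = Pi @ P          # Pi is now the next power of P
        S += Pi              # accumulate P + P^2 + ... + P^k
    col_min = S.min(axis=0)
    col_max = S.max(axis=0)
    # States not reached from anywhere within k steps impose no constraint.
    reached = col_max > 0
    return float(np.min(col_min[reached] / col_max[reached]))

# Made-up example 1: lazy clockwise walk on Z/3Z.
P_cycle = np.array([[0.5, 0.5, 0.0],
                    [0.0, 0.5, 0.5],
                    [0.5, 0.0, 0.5]])
# Made-up example 2: two-state chain with all entries at least p = 0.1.
P_two = np.array([[0.9, 0.1],
                  [0.2, 0.8]])

c_cycle_k1 = minorization_constant(P_cycle, 1)   # fails: P has zero entries
c_cycle_k2 = minorization_constant(P_cycle, 2)
c_two_k1 = minorization_constant(P_two, 1)
print(c_cycle_k1, c_cycle_k2, c_two_k1)
```

In the second example every entry of the matrix is at least 0.1, and the computed constant for $k = 1$ is no smaller than that lower bound on the entries.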

To find quantities $k$ and $c$ in (3.2) for particular applications, one can apply results on the rate of convergence of the law of $X_n$ to its stationary distribution in total variation distance. If $S$ is finite and $P(x, z) \ge p > 0$ for all $x, z \in S$, then one can take $k = 1$ and $c = p$.

COROLLARY 3.1. Assume (3.2). Then for each $n > k$, and $0 \le \varepsilon < 2^{-8}$,

$$\sup_x P_x\Big\{ \int f \, dL_n - \mu_f \ge \varepsilon + k/n \Big\} \le \frac{1}{c} \, e^{-n\beta\varepsilon^2(1 - \varepsilon 2^8)}.$$


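Proof. Let $x, y \in S$ be arbitrary. Then for $1 \le i \le k$ and using $0 \le f \le 1$,

$$\begin{aligned}
P_x\Big\{ \int f \, dL_n - \mu_f \ge \varepsilon + k/n \Big\}
&\le P_x\Big\{ \sum_{j=i+1}^{n+i-k} f(X_j) \ge (\mu_f + \varepsilon + k/n)n - k \Big\} \\
&= \int P^i(x, dx_{i+1}) \, P_{x_{i+1}}\Big\{ \sum_{j=1}^{n-k} f(X_j) \ge (\mu_f + \varepsilon + k/n)n - k \Big\} \\
&= \int P^i(x, dw) \, P_w\Big\{ \sum_{j=1}^{n-k} f(X_j) \ge (\mu_f + \varepsilon)n \Big\}.
\end{aligned}$$

Therefore for any $y \in S$,

$$\begin{aligned}
k P_x\Big\{ \int f \, dL_n - \mu_f \ge \varepsilon + k/n \Big\}
&\le \int \sum_{i=1}^{k} P^i(x, dw) \, P_w\Big\{ \sum_{j=1}^{n-k} f(X_j) \ge (\mu_f + \varepsilon)n \Big\} \\
&\le \frac{1}{c} \int \sum_{i=1}^{k} P^i(y, dw) \, P_w\Big\{ \sum_{j=1}^{n-k} f(X_j) \ge (\mu_f + \varepsilon)n \Big\} \\
&= \frac{1}{c} \sum_{i=1}^{k} P_y\Big\{ \sum_{j=i+1}^{n-k+i} f(X_j) \ge (\mu_f + \varepsilon)n \Big\} \\
&\le \frac{1}{c} \sum_{i=1}^{k} P_y\Big\{ \sum_{j=1}^{n} f(X_j) \ge (\mu_f + \varepsilon)n \Big\} \\
&= \frac{k}{c} \, P_y\Big\{ \int f \, dL_n - \mu_f \ge \varepsilon \Big\}.
\end{aligned}$$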

Hence, $\sup_x P_x\{\int f \, dL_n - \mu_f \ge \varepsilon + k/n\} \le (1/c) \inf_y P_y\{\int f \, dL_n - \mu_f \ge \varepsilon\}$, and the result follows from Theorem 3.1.

When $S$ is finite, it is easy to show that $\sup_x P_x\{\int f \, dL_n - \mu_f \ge \varepsilon\} \le (1 + O(\varepsilon)) \, e^{-n\beta\varepsilon^2(1 - \varepsilon 2^8)}$ with condition (3.2).

We remark that a slightly more accurate expansion in Lemma 2.1 with $A_t = I + tD + t^2D^2/2 + t^3D^3/2$ leads to a coefficient $\alpha_2$ in the expansion of the largest eigenvalue of

$$-\|\varphi_f\|^2/2 + \langle \varphi_f, -\Delta^{-1}(\varphi_f) \rangle \le \|\varphi_f\|^2(\beta^{-1} - 1/2) = \sigma^2(2\beta^{-1} - 1)/2,$$

where $\sigma^2$ is the variance $\|\varphi_f\|^2$. We could then get an improved second order term in the above bound of $\varepsilon^2\beta/(2\sigma^2(2 - \beta))$. This has the advantage that when the process is i.i.d., $\beta = 1$ and the coefficient becomes the well known $\varepsilon^2/2\sigma^2$. For most interesting applications $\beta$ is small, so $\beta/(2\sigma^2(2 - \beta)) \approx \beta/4\sigma^2 \approx \beta$. Thus our bound is simpler, but is essentially the same for interesting examples and the method is fundamentally precise.
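To make the "order of $1/(\beta\varepsilon^2)$ steps" reading of Theorem 3.1 and Corollary 3.1 concrete, the right-hand sides can be evaluated and inverted for $n$. This is a minimal sketch using only the Python standard library; the function names and the sample values of the singular value, the accuracy, the minorization constant, and the target probability `delta` are all hypothetical choices for illustration.

```python
import math

def theorem_bound(n, beta, eps):
    """Right-hand side of Theorem 3.1: exp(-n * beta * eps^2 * (1 - 2^8 * eps))."""
    if not (0 < eps < 2 ** -8 and 0 < beta <= 1):
        raise ValueError("requires 0 < eps < 2**-8 and 0 < beta <= 1")
    return math.exp(-n * beta * eps ** 2 * (1 - 2 ** 8 * eps))

def corollary_bound(n, beta, eps, c):
    """Right-hand side of Corollary 3.1 (for the deviation eps + k/n, with n > k)."""
    return theorem_bound(n, beta, eps) / c

def steps_needed(beta, eps, c, delta):
    """Smallest n for which corollary_bound(n, beta, eps, c) <= delta."""
    rate = beta * eps ** 2 * (1 - 2 ** 8 * eps)
    return max(0, math.ceil(math.log(1.0 / (c * delta)) / rate))

# Hypothetical inputs: beta = 0.5, eps = 2^-9, c = 0.25, target probability 0.05.
n_req = steps_needed(beta=0.5, eps=2 ** -9, c=0.25, delta=0.05)
print(n_req, corollary_bound(n_req, 0.5, 2 ** -9, 0.25))
```

The required $n$ scales like $\log(1/(c\,\delta))/(\beta\varepsilon^2)$, and the corollary additionally needs $n > k$ for the $k$ used in (3.2).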

4. EXAMPLES

Heuristically, the theorems above show that on the order of $1/(\beta\varepsilon^2)$ steps are sufficient for precision $\varepsilon$ in the estimate of a probability $\mu(A)$ by its time average. The bound is conservative, but gives insight into convergence properties not reflected in other results on rates of convergence.

Consider the nonreversible transition matrix on $\mathbb{Z}/4\mathbb{Z}$, described by a deterministic walk clockwise. The spectrum is $\pm 1$ and $\pm i$. The law $\mu_n$ of $X_n$ does not converge to the uniform stationary distribution, even though the time averages converge very fast. The multiplicative reversibilization, on which many bounds for convergence of $\mu_n$ are based, does not help since its spectral gap is zero. The additive reversibilization is a periodic symmetric random walk with spectrum $\{1, 0, 0, -1\}$. None of these objects bears in any useful way on the convergence of the time averages. The quantity $\beta$, the first singular value of $\Delta$, turns out to be $\sqrt{2}$ (cf. Example 4.2). This implies fast convergence for the time averages.

A class of processes where general and precise statements for $\beta$ are possible is the class of Markov chains on abelian groups, as described below. Two well-known and useful examples are the circle $\mathbb{Z}/m\mathbb{Z}$ and the cube $(\mathbb{Z}/2\mathbb{Z})^d$. The singular value $\beta$ can be found fairly easily, according to Proposition 4.1 below.

If $G$ is a group and $g \in G$, then $g$ can be identified with a transition matrix $\hat{g}$ on $G$ defined by $\hat{g}(h, hg) = 1$, or equivalently $\hat{g}(hg^{-1}, h) = 1$ for each $h \in G$. We consider Markov chains with transition matrices of the form

$$P = \sum_{g \in G} p_g \hat{g}, \tag{4.1}$$

where $(p_g)$ are nonnegative constants which sum to 1, and represent probabilities of using the corresponding group element to make the next step. As an operator on functions on $G$, one can see that $\hat{g}(I_{\{h\}}) = I_{\{hg^{-1}\}}$, where $I$ is the indicator function, and $(I_{\{h\}})\hat{g} = I_{\{hg\}}$. Such chains may or may not be reversible, but the invariant distribution is always uniform if the chain is irreducible.

Let $\sigma(P)$ denote the set of eigenvalues of $P$. If the group $G$ is abelian, these can often be determined exactly by finding the irreducible characters of the group $G$. Recall that a group character $\chi$ of an irreducible representation of an abelian group is in fact an eigenfunction for $\hat{g}$ with eigenvalue $\chi(g)$. This is because an irreducible representation is one-dimensional and therefore is its own character $\chi$, which implies that $\chi(gh) = \chi(g)\chi(h)$. This means that the right eigenvectors are the characters $\chi$ with corresponding eigenvalues $\sum_g p_g \chi(g)$, whereas the left eigenvectors are the characters $\chi$ with corresponding eigenvalues $\sum_g p_g \chi(g^{-1}) = \sum_g p_g \overline{\chi(g)}$.

PROPOSITION 4.1. Let $P$ be an irreducible Markov chain on the group $G$, of the form (4.1). Then the smallest singular value $\beta$ of the generator $\Delta = P - I$ is given by

$$\beta = \min\{|1 - \lambda_i| : \lambda_i \in \sigma(P) - \{1\}\}.$$

Proof. The eigenvectors of $\Delta^*\Delta$ are the characters for the irreducible representations, with eigenvalues $\{(1 - \lambda_i)(1 - \bar{\lambda}_i) = |1 - \lambda_i|^2 : \lambda_i \in \sigma(P)\}$.

EXAMPLE 4.2. Consider an asymmetric random walk on the circle $\mathbb{Z}/m\mathbb{Z}$, with $P(i, i + 1) = p > 0$ and $P(i, i - 1) = q = 1 - p$. Then $P$ is not reversible unless $p = 1/2$. Its spectrum $\sigma(P)$ is the set

$$\sigma(P) = \{pe^{2\pi i h/m} + qe^{-2\pi i h/m} : h = 0, \ldots, m - 1\} = \{\cos(2\pi h/m) + i(p - q)\sin(2\pi h/m) : h = 0, \ldots, m - 1\},$$

which means that if $a = (p - q)^2$,

$$\beta^2 = \min\{(1 - \cos(2\pi h/m))^2 + a\sin^2(2\pi h/m) : h = 1, \ldots, m - 1\} = 1 + a - 2\cos(2\pi/m) + (1 - a)\cos^2(2\pi/m).$$

If $p = 1$, then $\beta^2 = 2 - 2\cos(2\pi/m)$, and $\beta$ takes its minimum value, which corresponds to slowest convergence, at the symmetric walk with $p = 1/2$. Therefore, the reversible choice is the worst choice for convergence of time averages.

A very interesting example of a nonreversible chain is the Gibbs sampler for a distribution on a product space when the coordinates are updated in a fixed order. Estimates on the quantity $\beta$, analogous to the reversible estimates of Dinwoodie [9] for a finite state space, are essential to understanding its convergence.
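The closed form in Example 4.2 can be compared with Proposition 4.1 directly. The sketch below uses only the Python standard library; the function names are hypothetical and the grid of $(m, p)$ values is an arbitrary illustration. It builds the spectrum of the walk on the circle from the characters, takes the minimum of $|1 - \lambda|$ over the eigenvalues other than 1, and compares it with the formula in terms of $a = (p - q)^2$.

```python
import cmath
import math

def circle_spectrum(m, p):
    """Eigenvalues p*e^{2 pi i h/m} + q*e^{-2 pi i h/m}, h = 0, ..., m-1, with q = 1 - p."""
    q = 1.0 - p
    return [p * cmath.exp(2j * math.pi * h / m) + q * cmath.exp(-2j * math.pi * h / m)
            for h in range(m)]

def beta_from_spectrum(m, p):
    """Proposition 4.1: min |1 - lambda| over eigenvalues other than 1 (index h = 0)."""
    return min(abs(1 - lam) for lam in circle_spectrum(m, p)[1:])

def beta_closed_form(m, p):
    """Example 4.2: beta^2 = 1 + a - 2 cos(2 pi/m) + (1 - a) cos^2(2 pi/m)."""
    a = (2 * p - 1) ** 2              # a = (p - q)^2 with q = 1 - p
    c = math.cos(2 * math.pi / m)
    return math.sqrt(1 + a - 2 * c + (1 - a) * c * c)

for m in (4, 5, 12):
    for p in (0.5, 0.7, 1.0):
        print(m, p, beta_from_spectrum(m, p), beta_closed_form(m, p))
```

With $m = 4$ and $p = 1$ this is the deterministic clockwise walk on $\mathbb{Z}/4\mathbb{Z}$ discussed at the start of the section, and for each $m$ the symmetric choice $p = 1/2$ gives the smallest value on the grid.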

ACKNOWLEDGMENT

We thank an anonymous referee for valuable comments.

REFERENCES

1. V. Anantharam, A large deviation approach to error exponents in source coding and hypothesis testing, IEEE Trans. Inform. Theory 36 (1990), 938–943.
2. J. Besag and P. J. Green, Spatial statistics and Bayesian computation, J. Roy. Statist. Soc. Ser. B 55 (1993), 25–37.
3. I. Csiszár, T. M. Cover, and B. Choi, Conditional limit theorems under Markov conditioning, IEEE Trans. Inform. Theory 33 (1987), 788–801.
4. A. Dembo and O. Zeitouni, "Large Deviations Techniques and Applications," Jones & Bartlett, Boston, 1993.
5. P. Diaconis and L. Saloff-Coste, Comparison theorems for reversible Markov chains, Ann. Appl. Probab. 3 (1992), 696–730.
6. P. Diaconis and L. Saloff-Coste, Nash inequalities for finite Markov chains, J. Theoret. Probab. 9 (1996), 459–510.
7. P. Diaconis and D. Stroock, Geometric bounds for eigenvalues of Markov chains, Ann. Appl. Probab. 1 (1991), 36–61.
8. I. H. Dinwoodie, A probability inequality for the occupation measure of a reversible Markov chain, Ann. Appl. Probab. 5 (1995), 37–43.
9. I. H. Dinwoodie, A bound on the rate of convergence for the discrete Gibbs sampler, Probab. Engrg. Inform. Sci. 9 (1995), 211–215.
10. J. A. Fill, Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains, with an application to the exclusion process, Ann. Appl. Probab. 1 (1991), 62–87.
11. F. R. Gantmacher, "Matrix Theory," Vol. II, Chelsea, New York, 1959.
12. D. Gillman, A Chernoff bound for random walks on expander graphs, in "Proceedings of the 34th Symposium on Foundations of Computer Science," IEEE Comput. Soc., Los Alamitos, CA, 1993.
13. T. Kato, "Perturbation Theory for Linear Operators," Springer-Verlag, New York, 1966.
14. M. G. Krein and M. A. Rutman, Linear operators leaving invariant a cone in a Banach space, in Amer. Math. Soc. Transl., Vol. 26, pp. 1–128, Amer. Math. Soc., Providence, 1950.
15. G. F. Lawler and A. D. Sokal, Bounds on the $L^2$ spectrum for Markov chains and Markov processes: A generalization of Cheeger's inequality, Trans. Amer. Math. Soc. 309 (1988), 557–580.
16. S. P. Meyn and R. L. Tweedie, Computable bounds for geometric convergence rates of Markov chains, Ann. Appl. Probab. 4 (1994), 981–1011.
17. R. Motwani and P. Raghavan, "Randomized Algorithms," Cambridge Univ. Press, New York, 1995.
18. F. Rellich, Störungstheorie der Spektralzerlegung, IV, Math. Ann. 117 (1940), 356–382.
19. A. F. M. Smith and G. O. Roberts, Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods, J. Roy. Statist. Soc. Ser. B 55 (1993), 3–23.