Statistics and Probability Letters 157 (2020) 108632
Moderate deviations for nonhomogeneous Markov chains

Mingzhou Xu, Kun Cheng, Yunzheng Ding
School of Information Engineering, Jingdezhen Ceramic Institute, Jingdezhen 333403, Jiangxi, PR China

Article history: Received 3 July 2019; Received in revised form 27 August 2019; Accepted 19 September 2019; Available online 26 September 2019.

Abstract. In this work, we establish moderate deviation principles for bounded functionals and for empirical measures of nonhomogeneous Markov chains with finite state space, under the condition that the transition probability matrices of the chain converge in the Cesàro sense.

MSC: 60F10; 60J10

Keywords: Nonhomogeneous Markov chains; Moderate deviations; Empirical measures; Martingale

1. Introduction and main results

Huang et al. (2013) proved a central limit theorem for nonhomogeneous Markov chains with finite state space by using the martingale central limit theorem (see Brown, 1971) and the strong law of large numbers for nonhomogeneous Markov chains (see Yang, 2002), under the condition that the transition probability matrices of the nonhomogeneous Markov chain converge in the Cesàro sense. For recent references on limit properties of nonhomogeneous Markov chains, we refer to Zhang et al. (2016) and the references therein. Donsker and Varadhan (1975a,b, 1976, 1983) established large deviations for empirical measures, and process-level large deviations, for homogeneous Markov processes. Dietz and Sethuraman (2005) studied large deviations for a class of nonhomogeneous Markov chains under some regularity conditions. Gao (1992, 1996) obtained moderate deviation principles for homogeneous Markov processes and for martingales. Wu (1995) established moderate deviations for dependent random variables, including homogeneous Markov processes. de Acosta (1997) studied moderate deviation lower bounds for homogeneous Markov chains, and de Acosta and Chen (1998) established the corresponding upper bounds. Gao (2017) proved long-time asymptotics of unbounded additive functionals of Markov processes, including moderate deviations for unbounded functionals of homogeneous Markov processes. For more references on large and moderate deviations for Markov processes, we refer to Gao (2017), Wu (2001) and the references therein. Xu et al. (2019a) studied a central limit theorem and moderate deviations for empirical means of countable nonhomogeneous Markov chains. Gao and Xu (2012) established large deviations for empirical measures of independent random variables under sublinear expectation. Xu et al. (2019b) obtained a moderate deviation principle for empirical measures of countable nonhomogeneous Markov chains under the condition of uniform convergence of the transition probability matrices in the Cesàro sense. Motivated


by their work, in this note we give simple proofs of moderate deviation principles for bounded functionals and for empirical measures of nonhomogeneous Markov chains with finite state space, using the Gärtner–Ellis theorem. The moderate deviation principle fills the gap between the central limit theorem and the large deviation principle. Our results complement those in Dietz and Sethuraman (2005). Gao (2017) proved moderate deviations for homogeneous Markov processes under hypercontractivity and $L^p$-integrability of the transition density for some $p > 1$, which are rather general conditions; here we establish moderate deviations for nonhomogeneous Markov chains with finite state space under the much more restrictive conditions (1.5) and (1.7).

Let $\{X_n, n \ge 0\}$, defined on the probability space $(\Omega, \mathcal{F}, P)$, be a nonhomogeneous Markov chain taking values in the state space $S = \{1, 2, \ldots, \kappa\}$ with initial distribution

\[
\mu = (\mu(1), \mu(2), \ldots, \mu(\kappa)), \tag{1.1}
\]
and transition matrices
\[
P_n = (p_n(i,j)), \quad i, j \in S, \ n \ge 1, \tag{1.2}
\]

where $p_n(i,j) = P(X_n = j \mid X_{n-1} = i)$. Then
\[
P(X_0 = x_0, X_1 = x_1, \ldots, X_n = x_n) = \mu(x_0)\prod_{k=1}^{n} p_k(x_{k-1}, x_k).
\]
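For concreteness, the following minimal sketch (assuming NumPy; the helper name sample_path, the 0-based state labels and the toy matrices are our own choices, not taken from the paper) samples a path $X_0, \ldots, X_n$ from an initial distribution $\mu$ and a sequence of transition matrices $P_1, \ldots, P_n$, exactly as specified by (1.1), (1.2) and the product formula above.

# Sample a path of a nonhomogeneous Markov chain (illustrative sketch, not from the paper).
import numpy as np

def sample_path(mu, P_seq, rng=None):
    """mu: initial distribution on S = {0, ..., kappa-1}; P_seq: transition matrices P_1, ..., P_n."""
    rng = np.random.default_rng() if rng is None else rng
    kappa = len(mu)
    x = rng.choice(kappa, p=mu)              # X_0 ~ mu
    path = [x]
    for P in P_seq:                          # X_k ~ P_k(X_{k-1}, .)
        x = rng.choice(kappa, p=P[x])
        path.append(x)
    return np.array(path)

# toy data: kappa = 3, P_k converging to a fixed stochastic matrix A
mu = np.array([0.5, 0.3, 0.2])
A = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]])
P_seq = [(1.0 / k) * np.full((3, 3), 1.0 / 3.0) + (1.0 - 1.0 / k) * A for k in range(1, 101)]
print(sample_path(mu, P_seq)[:10])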

Write $P^{(m,n)} = P_{m+1}P_{m+2}\cdots P_n$, $p^{(m,n)}(i,j) = P(X_n = j \mid X_m = i)$, $\mu^{(k)} = \mu^{(0)}P_1P_2\cdots P_k$ and $\mu^{(k)}(j) = P(X_k = j)$. When the Markov chain is homogeneous, we write $P$ for $P_n$ and $P^k$ for $P^{(m,m+k)}$. If $P$ is a stochastic matrix, then we write

\[
\delta(P) = \sup_{i,k}\sum_{j=1}^{\kappa}\big[p(i,j) - p(k,j)\big]^+,
\]
where $[a]^+ = \max\{0, a\}$. Let $A = (a_{ij})$ be a matrix indexed by $S \times S$. Write
\[
\|A\| = \sup_{i\in S}\sum_{j\in S}|a_{ij}|.
\]
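As an illustration (a sketch of ours, assuming NumPy; the helper names are not from the paper), both quantities can be computed directly from these definitions.

# Direct implementations of delta(P) and the row-sum norm ||A|| (illustrative sketch).
import numpy as np

def delta(P):
    """delta(P) = sup_{i,k} sum_j [p(i,j) - p(k,j)]^+ for a stochastic matrix P."""
    diffs = P[:, None, :] - P[None, :, :]    # diffs[i, k, j] = p(i, j) - p(k, j)
    return np.maximum(diffs, 0.0).sum(axis=2).max()

def row_norm(A):
    """||A|| = sup_i sum_j |a_ij|."""
    return np.abs(A).sum(axis=1).max()

P = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]])
print(delta(P), row_norm(P))                 # for a stochastic matrix, row_norm(P) == 1.0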

Suppose that $R$ is a ‘constant’ stochastic matrix, i.e., one whose rows are all equal. The sequence $\{P_n, n \ge 1\}$ is said to converge in the Cesàro sense (to the constant stochastic matrix $R$) if, for every $m \ge 0$,
\[
\lim_{n\to\infty}\Big\|\frac{1}{n}\sum_{t=1}^{n} P^{(m,m+t)} - R\Big\| = 0. \tag{1.3}
\]
By Theorem 1 in Yang (2002), if (1.5) below holds, then (1.3) holds. Throughout this paper, $E(\cdot)$ is the expectation under the probability measure $P$, and $R$ is the constant stochastic matrix each row of which equals the left eigenvector $\pi = (\pi(1), \ldots, \pi(\kappa))$ of $P$ satisfying $\pi P = \pi$ and $\sum_{i=1}^{\kappa}\pi(i) = 1$. We assume that
\[
\lim_{n\to\infty}\frac{a(n)}{\sqrt{n}} = \infty, \qquad \lim_{n\to\infty}\frac{a(n)}{n} = 0 \tag{1.4}
\]
(for instance, $a(n) = n^{3/4}$ satisfies (1.4)).

In addition, let $f$ be any function defined on $S$. We set
\[
S_n = \sum_{k=1}^{n} f(X_k).
\]

For all $\nu \in M(\mathbb{R})$, we define
\[
I_\mu(\nu) := \sup_{f\in B_b(\mathbb{R})}\Big\{\langle f, \nu\rangle - \frac{1}{2}\sum_{i\in S}\pi(i)\Big[f^2(i) - \Big(\sum_{j\in S} f(j)\,p(i,j)\Big)^2\Big]\Big\}
= \sup_{f\in B_b(\mathbb{R})}\Big\{\langle f, \nu\rangle - \frac{1}{2}\sum_{i\in S}\sum_{j\in S}\pi(i)\,p(i,j)\Big[f(j) - \sum_{k\in S} p(i,k)f(k)\Big]^2\Big\},
\]


where $\langle f, \nu\rangle = \int_{\mathbb{R}} f(x)\,\nu(dx)$, $B_b(\mathbb{R})$ denotes the linear space of bounded measurable functions on $\mathbb{R}$, and $M(\mathbb{R})$ is the space of all real measures of finite variation on $\mathbb{R}$ (the two suprema above coincide because $\pi P = \pi$). The $\tau$-topology on $M(\mathbb{R})$ is induced by the mappings $\nu \in M(\mathbb{R}) \mapsto \langle f, \nu\rangle$, $f \in B_b(\mathbb{R})$, and $M(\mathbb{R})$ is endowed with the $\sigma$-field $D_s$ generated by these mappings. Set
\[
\Gamma_n(B) := \frac{1}{a(n)}\big(\delta_{X_1}(B) + \cdots + \delta_{X_n}(B)\big), \quad B \in \mathcal{B}(\mathbb{R}),
\qquad
c_n(A) := P\big(\Gamma_n - E(\Gamma_n) \in A\big), \quad A \in D_s,
\]
where $\delta_x(\cdot)$ is the Dirac measure.

Definition 1.1. We say that $P = (p(i,j))_{S\times S}$ is irreducible if, for any $i, j \in S$, there is a finite path $i = x_0, x_1, \ldots, x_n = j$ in $S$ with positive weight, that is, $p(x_0, x_1)\cdots p(x_{n-1}, x_n) > 0$.

Our main results are the following.

Theorem 1.1. Let $\{X_n, n \ge 0\}$ be a nonhomogeneous Markov chain taking values in the state space $S = \{1, 2, \ldots, \kappa\}$ with initial distribution (1.1) and transition matrices (1.2), and let $f$ be a bounded function on $S$. Suppose that $P = (p(i,j))_{\kappa\times\kappa}$ is another transition matrix, that $P$ is irreducible, and that $\pi = (\pi(1), \pi(2), \ldots, \pi(\kappa))$ is the unique stationary distribution determined by $P$. Assume that
\[
\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\big|p_k(i,j) - p(i,j)\big| = 0, \quad \forall\, i, j \in S, \tag{1.5}
\]
and
\[
\theta(f) := \sum_{i\in S}\pi(i)\Big[f^2(i) - \Big(\sum_{j\in S} f(j)\,p(i,j)\Big)^2\Big] > 0. \tag{1.6}
\]
Moreover, if the sequence of $\delta$-coefficients satisfies
\[
\lim_{n\to\infty}\frac{1}{\sqrt{n}}\sum_{k=1}^{n}\delta(P_k) = 0, \tag{1.7}
\]
then $\big\{\frac{S_n - E(S_n)}{a(n)}\big\}$ satisfies the moderate deviation principle with rate function $I(x)$; that is, for any open set $G \subset \mathbb{R}^1$,
\[
\liminf_{n\to\infty}\frac{n}{a^2(n)}\log P\Big(\frac{S_n - E(S_n)}{a(n)} \in G\Big) \ge -\inf_{x\in G} I(x),
\]
and for any closed set $F \subset \mathbb{R}^1$,
\[
\limsup_{n\to\infty}\frac{n}{a^2(n)}\log P\Big(\frac{S_n - E(S_n)}{a(n)} \in F\Big) \le -\inf_{x\in F} I(x),
\]
where $I(x) = \frac{x^2}{2\theta(f)}$. In particular, for every $x > 0$,
\[
\lim_{n\to\infty}\frac{n}{a^2(n)}\log P\Big(\Big|\frac{S_n - E(S_n)}{a(n)}\Big| > x\Big) = -\frac{x^2}{2\theta(f)}.
\]
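The quadratic form of the rate function is explained by (2.6) below: $I$ is the Legendre transform of the limiting cumulant $t \mapsto \frac{t^2}{2}\theta(f)$, as in the Gärtner–Ellis theorem. A routine computation gives
\[
I(x) = \sup_{t\in\mathbb{R}}\Big\{tx - \frac{t^2}{2}\,\theta(f)\Big\} = \frac{x^2}{2\theta(f)},
\]
the supremum being attained at $t = x/\theta(f)$ (recall that $\theta(f) > 0$ by (1.6)).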

By the definition of $I_\mu(\nu)$ and Theorem 2.6 in Gänssler (1971), we see that $I_\mu(\nu)$, which is the rate function for the moderate deviations of empirical measures of nonhomogeneous Markov chains in Theorem 1.3, has $\tau$-compact level sets.

Theorem 1.2. For any $l \in (0, \infty)$, the set $\{\nu \in M(\mathbb{R}) : I_\mu(\nu) \le l\}$ is $\tau$-compact.

By Theorem 1.1, we obtain the following moderate deviations for empirical measures of nonhomogeneous Markov chains, which are generalized to the case of countable Markov chains in Xu et al. (2019b). Here, $I_\mu(\cdot)$ is said to be a good rate function if, for every $x \ge 0$, the set $\{\nu \in M(\mathbb{R}) : I_\mu(\nu) \le x\}$ is $\tau$-compact.

Theorem 1.3. Suppose that $\{X_n, n \ge 0\}$ is a nonhomogeneous Markov chain taking values in $S = \{1, \ldots, \kappa\}$ with initial distribution (1.1) and transition matrices (1.2). Suppose that $P = (p(i,j))_{\kappa\times\kappa}$ is irreducible and that $\pi = (\pi(1), \pi(2), \ldots, \pi(\kappa))$ is the unique stationary distribution determined by $P$. Assume that (1.5) and (1.7) hold. Then $\{c_n, n \ge 1\}$ satisfies the moderate deviation principle in $(M(\mathbb{R}), \tau)$ with speed $a^2(n)/n$ and good rate function $I_\mu(\cdot)$. Namely, for any $B \in D_s$,
\[
\limsup_{n\to\infty}\frac{n}{a^2(n)}\log c_n(B) \le -\inf_{\nu\in\bar{B}} I_\mu(\nu);
\qquad
\liminf_{n\to\infty}\frac{n}{a^2(n)}\log c_n(B) \ge -\inf_{\nu\in\mathring{B}} I_\mu(\nu).
\]


We now present two examples, which are applications of the above results.

Example 1.1. As in Theorem 1.1, we set $\kappa = n_0 > 2$ and $\alpha > 1/2$. Let $P = (p(i,j))$ with $p(i,j) = \frac{1}{n_0}$ for $1 \le i, j \le n_0$, and let $P_k = (p_k(i,j))$ be given by
\[
p_k(m,m) = \frac{1}{n_0} - \frac{1}{n_0 k^{\alpha}} \quad (1 \le m \le n_0), \qquad
p_k(l,l+1) = \frac{1}{n_0} + \frac{1}{n_0 k^{\alpha}} \quad (1 \le l \le n_0 - 1),
\]
\[
p_k(l,l-1) = \frac{1}{n_0} \quad (2 \le l \le n_0 - 1), \qquad
p_k(l,m) = \frac{1}{n_0} \quad (1 \le l \le n_0 - 1,\ 1 \le m \le n_0,\ |m-l| > 1),
\]
\[
p_k(n_0,n_0-1) = \frac{1}{n_0} + \frac{1}{n_0 k^{\alpha}}, \qquad
p_k(n_0,j) = \frac{1}{n_0} \quad (1 \le j \le n_0 - 2).
\]
Then
\[
\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\big|p_k(i,j) - p(i,j)\big| \le \lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\frac{1}{n_0 k^{\alpha}} = 0,
\qquad
\lim_{n\to\infty}\frac{1}{\sqrt{n}}\sum_{k=1}^{n}\delta(P_k) \le \lim_{n\to\infty}\frac{1}{\sqrt{n}}\sum_{k=1}^{n}\frac{2}{n_0 k^{\alpha}} = 0,
\]
i.e. (1.5) and (1.7) hold.

Example 1.2. As in Theorem 1.1, we set $\kappa = n_0 > 2$, $\alpha > 1/2$ and $\beta > 0$. Let $P = (p(i,j))$ with $p(i,j) = \frac{2j}{n_0(n_0+1)}$ for $1 \le i, j \le n_0$, and let $P_k = (p_k(i,j))$ be given by
\[
p_k(m,m) = \frac{2m}{n_0(n_0+1)} - \frac{(\log k)^{\beta}}{n_0 k^{\alpha}} \quad (1 \le m \le n_0), \qquad
p_k(l,l+1) = \frac{2(l+1)}{n_0(n_0+1)} + \frac{(\log k)^{\beta}}{n_0 k^{\alpha}} \quad (1 \le l \le n_0 - 1),
\]
\[
p_k(l,l-1) = \frac{2(l-1)}{n_0(n_0+1)} \quad (2 \le l \le n_0 - 1), \qquad
p_k(l,m) = \frac{2m}{n_0(n_0+1)} \quad (1 \le l \le n_0 - 1,\ 1 \le m \le n_0,\ |m-l| > 1),
\]
\[
p_k(n_0,n_0-1) = \frac{2(n_0-1)}{n_0(n_0+1)} + \frac{(\log k)^{\beta}}{n_0 k^{\alpha}}, \qquad
p_k(n_0,j) = \frac{2j}{n_0(n_0+1)} \quad (1 \le j \le n_0 - 2).
\]
Then
\[
\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\big|p_k(i,j) - p(i,j)\big| \le \lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n}\frac{(\log k)^{\beta}}{n_0 k^{\alpha}} = 0,
\qquad
\lim_{n\to\infty}\frac{1}{\sqrt{n}}\sum_{k=1}^{n}\delta(P_k) \le \lim_{n\to\infty}\frac{1}{\sqrt{n}}\sum_{k=1}^{n}\frac{2(\log k)^{\beta}}{n_0 k^{\alpha}} = 0,
\]
i.e. (1.5) and (1.7) hold.
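The conditions in Example 1.1 can also be checked numerically. The following sketch (ours, assuming NumPy; the parameter values and helper names are our own choices, not from the paper) builds the matrices $P_k$ of Example 1.1 and evaluates the quantities appearing in (1.5) and (1.7).

# Numerical check of conditions (1.5) and (1.7) for Example 1.1 (illustrative sketch).
import numpy as np

def example_matrix(k, n0, alpha):
    """P_k of Example 1.1: uniform rows with an off-diagonal perturbation of size 1/(n0*k^alpha)."""
    eps = 1.0 / (n0 * k ** alpha)
    Pk = np.full((n0, n0), 1.0 / n0)
    for l in range(n0 - 1):                  # rows 1, ..., n0 - 1 (0-based indices here)
        Pk[l, l] -= eps
        Pk[l, l + 1] += eps
    Pk[n0 - 1, n0 - 1] -= eps                # last row: mass shifted to the sub-diagonal entry
    Pk[n0 - 1, n0 - 2] += eps
    return Pk

def delta(P):
    """delta(P) = sup_{i,k} sum_j [p(i,j) - p(k,j)]^+ ."""
    diffs = P[:, None, :] - P[None, :, :]
    return np.maximum(diffs, 0.0).sum(axis=2).max()

n0, alpha = 4, 0.75                          # any alpha > 1/2 works
P = np.full((n0, n0), 1.0 / n0)
for n in (10, 100, 1000, 10000):
    Pks = [example_matrix(k, n0, alpha) for k in range(1, n + 1)]
    cesaro_15 = np.mean([np.abs(Pk - P) for Pk in Pks], axis=0).max()   # max_{i,j} (1/n) sum_k |p_k(i,j) - p(i,j)|
    cond_17 = sum(delta(Pk) for Pk in Pks) / np.sqrt(n)                 # (1/sqrt(n)) sum_k delta(P_k)
    print(n, round(cesaro_15, 5), round(cond_17, 5))                    # both columns should tend to 0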

2. Proof of Theorem 1.1

Proof of Theorem 1.1. As in Huang et al. (2013), denote
\[
D_{n,f} := f(X_n) - E[f(X_n)\mid X_{n-1}], \quad n \ge 1, \qquad D_{0,f} = 0. \tag{2.1}
\]
Write
\[
W_{n,f} := \sum_{k=1}^{n} D_{k,f}. \tag{2.2}
\]
Put $\mathcal{F}_n = \sigma(X_k, 0 \le k \le n)$. Then $\{W_{n,f}, \mathcal{F}_n, n \ge 1\}$ is a martingale; hence $\{D_{n,f}, \mathcal{F}_n\}$ is the related martingale difference sequence. Note that
\[
S_n - E(S_n) = W_{n,f} + \sum_{k=1}^{n}\big[E[f(X_k)\mid X_{k-1}] - E[f(X_k)]\big]. \tag{2.3}
\]
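The decomposition (2.3) is an algebraic identity along any path and can be checked numerically. The sketch below (ours, assuming NumPy; the three-state chain and all names are our own choices) computes $D_{k,f}$, $W_{n,f}$ and both sides of (2.3) on one simulated path, using $E[f(X_k)\mid X_{k-1}] = \sum_j p_k(X_{k-1}, j) f(j)$ and the exact marginals $\mu^{(k)} = \mu P_1\cdots P_k$ for $E[f(X_k)]$.

# Check S_n - E(S_n) = W_{n,f} + sum_k (E[f(X_k)|X_{k-1}] - E[f(X_k)])  (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
kappa, n = 3, 2000
f = np.array([1.0, -2.0, 0.5])
mu = np.array([0.5, 0.3, 0.2])
A = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]])
P_seq = [(1.0 / k) * np.full((kappa, kappa), 1.0 / kappa) + (1.0 - 1.0 / k) * A
         for k in range(1, n + 1)]

# simulate one path X_0, ..., X_n
x = rng.choice(kappa, p=mu)
X = [x]
for P in P_seq:
    x = rng.choice(kappa, p=P[x])
    X.append(x)
X = np.array(X)

# exact marginals mu^{(k)}, hence E f(X_k) and E(S_n)
marginals = [mu]
for P in P_seq:
    marginals.append(marginals[-1] @ P)
Ef = np.array([m @ f for m in marginals])                                # E f(X_k), k = 0, ..., n

cond = np.array([P_seq[k - 1][X[k - 1]] @ f for k in range(1, n + 1)])   # E[f(X_k) | X_{k-1}]
D = f[X[1:]] - cond                                                      # martingale differences D_{k,f}
W_nf = D.sum()                                                           # W_{n,f}
lhs = f[X[1:]].sum() - Ef[1:].sum()                                      # S_n - E(S_n)
rhs = W_nf + (cond - Ef[1:]).sum()
print(np.isclose(lhs, rhs))                                              # identity (2.3): prints True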

Huang et al. (2013) have already obtained
\[
\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n} E(D_{k,f}^2) = \theta(f) \tag{2.4}
\]

and
\[
\lim_{n\to\infty}\frac{1}{\sqrt{n}}\sum_{k=1}^{n}\big|E[f(X_k)\mid X_{k-1}] - E[f(X_k)]\big| \le \lim_{n\to\infty}\frac{2M}{\sqrt{n}}\sum_{k=1}^{n}\delta(P_k) = 0, \tag{2.5}
\]

where $M = \sup_{i\in S}|f(i)|$. Informally, (2.4) yields (2.7), which is used to prove (2.6), while (2.5) is what allows us to transfer the moderate deviation principle from $\{W_{n,f}/a(n)\}$ to $\{(S_n - E(S_n))/a(n)\}$. Therefore, we first need to prove that, for any $t \in \mathbb{R}^1$,
\[
\lim_{n\to\infty}\frac{n}{a^2(n)}\log E\Big[\exp\Big(\frac{a(n)}{n}\,t\,W_{n,f}\Big)\Big] = \frac{t^2}{2}\,\theta(f). \tag{2.6}
\]
Indeed, by Theorem 1.1 in Gao (1996), it suffices to check that there exists $\delta > 0$ such that
\[
\sup_{m\ge 0}\big\|E\big(\exp(\delta D_{m+1,f})\mid\mathcal{F}_m\big)\big\|_{L^\infty(P)} < \infty,
\]
which is obvious, and that
\[
\lim_{n\to\infty,\ m/n\to 0}\ \sup_{j\ge 0}\ \Big\|\frac{1}{m}\,t^2\,E\Big(\sum_{i=1}^{m} D_{n+j+i,f}^2\,\Big|\,\mathcal{F}_j\Big) - t^2\theta(f)\Big\|_{L^\infty(P)} = 0. \tag{2.7}
\]
By the proof of (2.13) in Xu et al. (2019a), with $\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}$ in place of $\frac{1}{n}\sum_{k=1}^{n}$ in the argument leading from (2.7) to (2.13) there, we see that
\[
\lim_{m\to\infty}\frac{1}{m}\sum_{k=1}^{m}\Big\{E\big(f^2(X_{k+n+j})\mid\mathcal{F}_{k+n+j-1}\big) - \big(E\big(f(X_{k+n+j})\mid\mathcal{F}_{k+n+j-1}\big)\big)^2\Big\} = \theta(f) \quad \text{a.e.} \tag{2.8}
\]


For the reader's convenience, we give a complete proof of (2.8) here. Write
\[
\frac{1}{m}\sum_{k=1}^{m}\Big\{E\big(f^2(X_{k+n+j})\mid\mathcal{F}_{k+n+j-1}\big) - \big(E\big(f(X_{k+n+j})\mid\mathcal{F}_{k+n+j-1}\big)\big)^2\Big\} =: I_1(m) - I_2(m),
\]
where
\[
I_1(m) = \frac{1}{m}\sum_{k=1}^{m} E\big(f^2(X_{k+n+j})\mid\mathcal{F}_{k+n+j-1}\big) = \sum_{l\in S}\sum_{i\in S} f^2(l)\,\frac{1}{m}\sum_{k=n+j+1}^{n+j+m} p_k(i,l)\,\delta_i(X_{k-1}),
\]
\[
I_2(m) = \frac{1}{m}\sum_{k=1}^{m}\big(E\big(f(X_{k+n+j})\mid\mathcal{F}_{k+n+j-1}\big)\big)^2 = \sum_{i\in S}\sum_{l,\ell\in S} f(l)f(\ell)\,\frac{1}{m}\sum_{k=n+j+1}^{n+j+m} p_k(i,l)\,p_k(i,\ell)\,\delta_i(X_{k-1}). \tag{2.9}
\]

We first use (1.5) and Fubini's theorem to obtain
\[
\lim_{m\to\infty}\sum_{i\in S}\sum_{l\in S}\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\delta_i(X_{k-1})\,\big|p_k(i,l) - p(i,l)\big|
\le \lim_{m\to\infty}\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\sum_{i\in S}\delta_i(X_{k-1})\sum_{l\in S}\big|p_k(i,l) - p(i,l)\big|
\le \lim_{m\to\infty}\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\sum_{i\in S}\sum_{l\in S}\big|p_k(i,l) - p(i,l)\big| = 0. \tag{2.10}
\]

Hence, it follows from (2.10), Corollary 4 in Yang (2009), and $\pi P = \pi$ that
\[
\lim_{m\to\infty} I_1(m) = \lim_{m\to\infty}\sum_{l\in S}\sum_{i\in S} f^2(l)\,\frac{1}{m}\sum_{k=n+j+1}^{n+j+m} p(i,l)\,\delta_i(X_{k-1})
= \sum_{l\in S}\sum_{i\in S} f^2(l)\,p(i,l)\,\pi(i)
= \sum_{l\in S} f^2(l)\,\pi(l) \quad \text{a.e.} \tag{2.11}
\]

We next claim that
\[
\lim_{m\to\infty} I_2(m) = \sum_{i\in S}\pi(i)\Big[\sum_{l\in S} f(l)\,p(i,l)\Big]^2 \quad \text{a.e.} \tag{2.12}
\]

Indeed, we use (1.5) and (2.9) to obtain
\[
\Big|I_2(m) - \frac{1}{m}\sum_{i\in S}\sum_{l,\ell\in S} f(l)f(\ell)\,p(i,l)\,p(i,\ell)\sum_{k=n+j+1}^{n+j+m}\delta_i(X_{k-1})\Big|
\le \Big|\sum_{i\in S}\sum_{l,\ell\in S} f(l)f(\ell)\,\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\delta_i(X_{k-1})\big(p_k(i,l)-p(i,l)\big)\,p_k(i,\ell)\Big|
+ \Big|\sum_{i\in S}\sum_{l,\ell\in S} f(l)f(\ell)\,\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\delta_i(X_{k-1})\,p(i,l)\big(p_k(i,\ell)-p(i,\ell)\big)\Big|
\le M^2\Big(\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\sum_{i\in S}\delta_i(X_{k-1})\,\|P_k - P\| + \frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\sum_{i\in S}\delta_i(X_{k-1})\,\|P_k - P\|\Big)
\le 2M^2\,\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\|P_k - P\| \to 0, \quad \text{as } m\to\infty.
\]


Thus, we use Corollary 4 in Yang (2009) again to obtain
\[
\lim_{m\to\infty} I_2(m) = \sum_{i\in S}\sum_{l,\ell\in S} f(l)f(\ell)\,p(i,l)\,p(i,\ell)\,\lim_{m\to\infty}\frac{1}{m}\sum_{k=n+j+1}^{n+j+m}\delta_i(X_{k-1})
= \sum_{i\in S}\sum_{l,\ell\in S} f(l)f(\ell)\,p(i,l)\,p(i,\ell)\,\pi(i)
= \sum_{i\in S}\pi(i)\Big[\sum_{l\in S} f(l)\,p(i,l)\Big]^2 \quad \text{a.e.}
\]

Hence (2.12) holds. Combining (2.11) and (2.12) yields (2.8). Therefore, by (2.8),
\[
\lim_{n\to\infty,\ m/n\to 0}\ \sup_{j\ge 0}\Big(\frac{1}{m}\,t^2\,E\Big(\sum_{i=1}^{m} D_{n+j+i,f}^2\,\Big|\,\mathcal{F}_j\Big) - t^2\theta(f)\Big)
= \lim_{n\to\infty,\ m/n\to 0}\ \sup_{j\ge 0}\Big(\frac{1}{m}\,t^2\,E\Big(\sum_{i=1}^{m} E\big(D_{n+j+i,f}^2\,\big|\,\mathcal{F}_{n+j+i-1}\big)\,\Big|\,\mathcal{F}_j\Big) - t^2\theta(f)\Big) = 0 \quad \text{a.e.,}
\]

which, by the uniform boundedness of $\frac{1}{m}E\big(\sum_{i=1}^{m} D_{n+j+i,f}^2\mid\mathcal{F}_j\big)$, implies (2.7). Hence, by the Gärtner–Ellis theorem (see Theorem 2.3.6 of Dembo and Zeitouni, 1998, or Theorem 1.4 of Wu, 1997, p. 276), $\{W_{n,f}/a(n)\}$ satisfies the moderate deviation principle with rate function $I(x) = \frac{x^2}{2\theta(f)}$. It follows from (1.4) and (2.5) that, for every $\epsilon > 0$,
\[
\lim_{n\to\infty}\frac{n}{a^2(n)}\log P\Big(\Big|\frac{S_n - E(S_n)}{a(n)} - \frac{W_{n,f}}{a(n)}\Big| > \epsilon\Big)
= \lim_{n\to\infty}\frac{n}{a^2(n)}\log P\Big(\frac{\big|\sum_{k=1}^{n}\big[E[f(X_k)\mid X_{k-1}] - E[f(X_k)]\big]\big|}{a(n)} > \epsilon\Big) = -\infty.
\]
Indeed, by (2.5) the numerator above is at most $2M\sum_{k=1}^{n}\delta(P_k) = o(\sqrt{n})$, which by (1.4) is $o(a(n))$, so the probability vanishes for all large $n$. Thus, by the exponential equivalence method (see Theorem 4.2.13 of Dembo and Zeitouni, 1998), $\big\{\frac{S_n - E(S_n)}{a(n)}\big\}$ satisfies the same moderate deviation principle as $\{W_{n,f}/a(n)\}$, with rate function $I(x) = \frac{x^2}{2\theta(f)}$. This completes the proof of Theorem 1.1. □

3. Proof of Theorem 1.2

Proof of Theorem 1.2. For any $A \in \mathcal{B}(\mathbb{R})$ and $\xi \in \{\nu \in M(\mathbb{R}) : I_\mu(\nu) \le l\}$, we have $|\xi|(A) \le \big(l + \frac{1}{2}L^2\sum_{k\in A}\pi(k)\big)/L$ for all $L > 0$. Hence, for $\xi \in \{\nu \in M(\mathbb{R}) : I_\mu(\nu) \le l\}$ and any $\epsilon > 0$, there exists $\delta > 0$ such that every $A \in \mathcal{B}(\mathbb{R})$ with $\sum_{k\in A}\pi(k) < \delta$ satisfies $|\xi|(A) < \epsilon$. By the definition of $I_\mu(\nu)$, $I_\mu(\cdot)$ is lower semicontinuous under the $\tau$-topology, so $\{\nu \in M(\mathbb{R}) : I_\mu(\nu) \le l\}$ is $\tau$-closed. We conclude that $\{\nu \in M(\mathbb{R}) : I_\mu(\nu) \le l\}$ is $\tau$-compact (cf. Theorem 2.6, Gänssler, 1971). □

4. Proof of Theorem 1.3

Proof of Theorem 1.3. By (2.5), (2.6), and (1.4), we can compute the Cramér functional $\Lambda(f)\colon B_b(\mathbb{R})\to\mathbb{R}$, with $W_{n,f}$ as defined in the proof of Theorem 1.1:
\[
\Lambda(f) := \lim_{n\to\infty}\frac{n}{a^2(n)}\log E\Big[\exp\Big(\frac{a(n)^2}{n}\,\langle f, \Gamma_n - E(\Gamma_n)\rangle\Big)\Big]
= \lim_{n\to\infty}\frac{n}{a^2(n)}\log E\Big[\exp\Big(\frac{a(n)^2}{n}\Big(\frac{W_{n,f}}{a(n)} + \frac{1}{a(n)}\sum_{k=1}^{n}\big[E[f(X_k)\mid X_{k-1}] - E[f(X_k)]\big]\Big)\Big)\Big]
= \theta(f)/2.
\]
$\Lambda(f)\colon B_b(\mathbb{R})\to\mathbb{R}$ is convex and Gateaux differentiable, and its Legendre transform $\Lambda^*\colon M(\mathbb{R})\to[0,+\infty]$ is given by
\[
\Lambda^*(\nu) := \sup_{f\in B_b(\mathbb{R})}\Big\{\langle f, \nu\rangle - \frac{1}{2}\sum_{i\in S}\pi(i)\Big[f^2(i) - \Big(\sum_{j\in S} f(j)\,p(i,j)\Big)^2\Big]\Big\} = I_\mu(\nu).
\]


Theorem 2.7 in Wu (1997, p. 290) says that if $\Lambda(f)\colon B_b(\mathbb{R})\to\mathbb{R}$ is convex and Gateaux differentiable, and for every $f \in B_b(\mathbb{R})$ there exists $\delta > 0$ with $\Lambda(\delta f) < +\infty$ (here all these conditions obviously hold), then for any $B \in D_s$,
\[
\liminf_{n\to\infty}\frac{n}{a^2(n)}\log c_n(B) \ge -\inf_{\nu\in\mathring{B}} I_\mu(\nu).
\]
By Theorem 2.1 in Wu (1997, p. 284), the fact that every $\nu \in B_b(\mathbb{R})^*$ (the algebraic dual) verifying
\[
\bar{\Lambda}^*(\nu) = \sup_{f\in B_b(\mathbb{R})}\Big\{\nu(f) - \frac{1}{2}\sum_{i\in S}\pi(i)\Big[f^2(i) - \Big(\sum_{j\in S} f(j)\,p(i,j)\Big)^2\Big]\Big\} < +\infty
\]
belongs to $M(\mathbb{R})$ implies that, for all $B \in D_s$,
\[
\limsup_{n\to\infty}\frac{n}{a^2(n)}\log c_n(B) \le -\inf_{\nu\in\bar{B}} I_\mu(\nu).
\]
As in the proof of Theorem 2.3 in Wu (1995), to show that such a $\nu$ is a measure we only need to show that $\lim_{n\to\infty}\nu(f_n) = 0$ for every sequence $(f_n)$ in $B_b(\mathbb{R})$ satisfying $\sup_n\sup_{x\in\mathbb{R}}|f_n(x)| < \infty$ and $\lim_{n\to\infty} f_n(x) = 0$ for all $x \in \mathbb{R}$. To this end, by the expression for $\Lambda(f)$, we have $\Lambda(t f_n) \to 0$ for any such sequence $(f_n)$ and every $t \in \mathbb{R}$. Thus
\[
+\infty > \sup_n\big[t\cdot\nu(f_n) - \Lambda(t f_n)\big] \ge t\cdot\limsup_{n\to\infty}\ (\text{or } \liminf_{n\to\infty})\ \nu(f_n).
\]

Since $t$ is arbitrary, we obtain the desired result. This completes the proof of Theorem 1.3. □

Acknowledgments

The authors are very grateful to the anonymous referees for carefully reading the original manuscript and pointing out many errors, as well as for valuable and helpful comments. The first author is also very grateful to Professor Fuqing Gao for enlightening advice and assistance. The research is supported by the Scientific Program of the Department of Education of Jiangxi Province of China, Grant GJJ150894.

References

de Acosta, A., 1997. Moderate deviations for empirical measures of Markov chains: lower bounds. Ann. Probab. 25, 259–284.
de Acosta, A., Chen, X., 1998. Moderate deviations for empirical measures of Markov chains: upper bounds. J. Theoret. Probab. 11, 1075–1110.
Brown, B.M., 1971. Martingale central limit theorems. Ann. Math. Stat. 42, 59–66.
Dembo, A., Zeitouni, O., 1998. Large Deviations Techniques and Applications, second ed. Springer, New York.
Dietz, Z., Sethuraman, S., 2005. Large deviations for a class of nonhomogeneous Markov chains. Ann. Appl. Probab. 15, 421–486.
Donsker, M.D., Varadhan, S.R.S., 1975a. Asymptotic evaluation of certain Markov process expectations for large time, I. Comm. Pure Appl. Math. 28, 1–47.
Donsker, M.D., Varadhan, S.R.S., 1975b. Asymptotic evaluation of certain Markov process expectations for large time, II. Comm. Pure Appl. Math. 28, 279–301.
Donsker, M.D., Varadhan, S.R.S., 1976. Asymptotic evaluation of certain Markov process expectations for large time, III. Comm. Pure Appl. Math. 29, 389–461.
Donsker, M.D., Varadhan, S.R.S., 1983. Asymptotic evaluation of certain Markov process expectations for large time, IV. Comm. Pure Appl. Math. 36, 182–212.
Gänssler, P., 1971. Compactness and sequential compactness in spaces of measures. Z. Wahrscheinlichkeitstheor. Verwandte Geb. 17, 124–146.
Gao, F.Q., 1992. Moderately large deviations for uniformly ergodic Markov processes, research announcements. Adv. Math. 21, 364–365.
Gao, F.Q., 1996. Moderate deviations for martingales and mixing random processes. Stochastic Process. Appl. 61, 263–275.
Gao, F.Q., 2017. Long time asymptotics of unbounded additive functionals of Markov processes. Electron. J. Probab. 44, http://dx.doi.org/10.1214/17-EJP104.
Gao, F.Q., Xu, M.Z., 2012. Relative entropy and large deviations under sublinear expectations. Acta Math. Sci. 32, 1826–1834.
Huang, H.L., Yang, W.G., Shi, Z.Y., 2013. The central limit theorem for nonhomogeneous Markov chains. Chinese J. Appl. Probab. Statist. 29, 337–347.
Wu, L.M., 1995. Moderate deviations of dependent random variables related to CLT. Ann. Probab. 23, 420–445.
Wu, L.M., 1997. An introduction to large deviation. In: Yan, J.A., Peng, S.G., Fang, S.Z., Wu, L.M. (Eds.), Several Topics in Stochastic Analysis. Press of Sciences of China, Beijing, pp. 225–336 (in Chinese).
Wu, L.M., 2001. Uniformly integrable operators and large deviations for Markov processes. J. Funct. Anal. 172, 301–376.
Xu, M.Z., Cheng, K., Ding, Y.Z., 2019b. Moderate deviations for empirical measures for nonhomogeneous Markov chains. Preprint.
Xu, M.Z., Ding, Y.Z., Zhou, Y.Z., 2019a. Central limit theorem and moderate deviation for nonhomogeneous Markov chains. J. Math. (Wuhan) 39, 137–146.
Yang, W.G., 2002. Convergence in the Cesàro sense and strong law of large numbers for nonhomogeneous Markov chains. Linear Algebra Appl. 354, 275–286.
Yang, W.G., 2009. Strong law of large numbers for nonhomogeneous Markov chains. Linear Algebra Appl. 430, 3008–3018.
Zhang, H.Z., Hao, R.L., Ye, Z.X., Yang, W.G., 2016. Some strong limit properties for countable nonhomogeneous Markov chains. Chinese J. Appl. Probab. Statist. 32, 62–68.