Journal of Statistical Planning and Inference 22 (1989) 105-115
North-Holland

TWO-STAGE PROCEDURES FOR PARAMETERS IN A GROWTH CURVE MODEL

Tatsuya KUBOKAWA
Institute of Mathematics, University of Tsukuba, Tsukuba, Ibaraki 305, Japan

Received 7 December 1987
Recommended by S. Zacks

Abstract: For parameters in a growth curve model, the paper gives a two-stage estimation procedure such that its covariance matrix is bounded above by a preassigned matrix and its stopping number is asymptotically efficient. Also a fixed-width confidence region with an asymptotically efficient stopping number is proposed.

AMS Subject Classification: 62L12.

Key words and phrases: Growth curve model; two-stage estimation procedure; asymptotic efficiency.

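1. Introduction

Let $x_1, x_2, \ldots$ be a sequence of mutually independent random vectors, $x_i$ having the $p$-variate normal distribution $N_p(B\xi a_i, \Sigma)$, where, consistent with the dimensions used below, $B$ is a known $p \times q$ matrix, $\xi$ is a $q \times r$ matrix of unknown parameters, $a_i$ is a known $r \times 1$ vector and $A_n = (a_1, \ldots, a_n)$ [...]

(I) Bounded covariance estimator. Given preassigned $\varepsilon > 0$, we want to construct an estimator $\hat\xi$ of $\xi$ such that

$$\operatorname{Cov}(\operatorname{vec}\hat\xi) \le \varepsilon I_{qr} \quad \text{for all } (\xi, \Sigma), \tag{1.1}$$

where the ordering between two positive definite matrices is defined by the nonnegative definiteness of their difference. Here for $\xi = (\xi_1, \xi_2, \ldots, \xi_r)$, the notation $\operatorname{vec}\xi$ denotes the $qr \times 1$ vector $(\xi_1', \xi_2', \ldots, \xi_r')'$ and $I_{qr}$ designates the $qr \times qr$ identity matrix.

0378-375X/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)

(II) Fixed-width confidence region. Given preassigned numbers $0 < \alpha < 1$ and $d > 0$, we want to find a region $R$ of $\operatorname{vec}\xi$ such that

$$(\text{maximum diameter of } R) \le 2d, \tag{1.2}$$

$$P[\operatorname{vec}\xi \in R] \ge 1 - \alpha \quad \text{for all } (\xi, \Sigma). \tag{1.3}$$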

Rao (1973, pp. 486-487) obtained a procedure resolving problem (I) in the univariate case, and the multivariate case was treated by Takada (1988), whose problem is defined as looking for an estimator such that its risk function relative to the quadratic loss is bounded above by a preassigned constant. Problem (II) has been studied by many authors. Of these, Stein (1945) first suggested a two-stage method in the univariate case, and a multivariate extension was developed by Healy (1956). For detailed references, see Ghosh and Mukhopadhyay (1976).

Here, we first investigate the case where $\Sigma$ is known. A procedure satisfying the requirement of problem (I) is given as follows. Throughout the paper, let $m_0$ be the smallest integer $n$ ($\ge r$) such that the rank of $A_n$ is $r$. For integer $n \ge m_0$, the MLE of $\xi$ is of the form

$$\hat\xi_0(n) = (B'\Sigma^{-1}B)^{-1}B'\Sigma^{-1}X_nA_n'(A_nA_n')^{-1},$$

which is equivalently written as in Lee (1974)

$$\operatorname{vec}\hat\xi_0(n) = \{(A_nA_n')^{-1}A_n \otimes (B'\Sigma^{-1}B)^{-1}B'\Sigma^{-1}\}\operatorname{vec}X_n,$$

where $A \otimes B$ stands for the Kronecker product defined by $(a_{ij}B)$ for $A = (a_{ij})$. Here we used the identity $\operatorname{vec}(B\Gamma A) = (A' \otimes B)\operatorname{vec}\Gamma$. A good account of the relation between the Kronecker product and the vec operator can be seen in Muirhead (1982). It is easy to see that $\operatorname{vec}\hat\xi_0(n)$ has a $qr$-variate normal distribution with mean $\operatorname{vec}\xi$ and

$$\operatorname{Cov}(\operatorname{vec}\hat\xi_0(n)) = (A_nA_n')^{-1} \otimes (B'\Sigma^{-1}B)^{-1}.$$

Then, the condition (1.1) is expressed by

$$(A_nA_n')^{-1} \otimes (B'\Sigma^{-1}B)^{-1} \le \varepsilon I_{qr}, \tag{1.4}$$

which is equivalent to

$$\sup_{t \in R^{qr}} \{t'\{(A_nA_n')^{-1} \otimes (B'\Sigma^{-1}B)^{-1}\}t / t't\} \le \varepsilon.$$

The l.h.s. of this inequality is $\lambda_1((A_nA_n')^{-1} \otimes (B'\Sigma^{-1}B)^{-1})$, or $\lambda_1((B'\Sigma^{-1}B)^{-1})/\lambda_r(A_nA_n')$, where $\lambda_1(U)$ and $\lambda_r(U)$ designate the largest and the smallest characteristic roots of an $r \times r$ matrix $U$, respectively. Then the inequality (1.4) is guaranteed by taking $n = n^*$ defined by

$$n^* = \text{smallest integer } n \ge m_0 \text{ such that } \lambda_r(A_nA_n') \ge \lambda_1((B'\Sigma^{-1}B)^{-1})/\varepsilon. \tag{1.5}$$

Thereby $\hat\xi_0(n^*)$ is the estimator solving problem (I).
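The rule (1.5) can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `fixed_sample_size` and the callback `design(i)`, which returns the known vector $a_i$, are hypothetical names introduced only for this illustration and do not come from the paper.

```python
import numpy as np

def fixed_sample_size(design, B, Sigma, eps, n_max=100_000):
    """Sketch of (1.5): smallest n with
    lambda_r(A_n A_n') >= lambda_1((B' Sigma^-1 B)^-1) / eps.

    design(i) is a hypothetical callback returning the known r-vector a_i.
    """
    V = np.linalg.inv(B.T @ np.linalg.solve(Sigma, B))   # (B' Sigma^-1 B)^-1
    threshold = np.linalg.eigvalsh(V)[-1] / eps           # largest root / eps
    r = np.asarray(design(1), dtype=float).size
    AAt = np.zeros((r, r))
    for n in range(1, n_max + 1):
        a = np.asarray(design(n), dtype=float)
        AAt += np.outer(a, a)                             # A_n A_n' updated in place
        # While rank(A_n) < r the smallest root is 0, so the test fails
        # automatically for n < m_0.
        if np.linalg.eigvalsh(AAt)[0] >= threshold:
            return n
    raise ValueError("n_max reached before the rule was satisfied")

# Usage: a_i = 1 (so A_n A_n' = n), B = I_2, Sigma = diag(1, 2), eps = 0.3.
n_star = fixed_sample_size(lambda i: [1.0], np.eye(2), np.diag([1.0, 2.0]), 0.3)
```

In this usage the threshold is $2/0.3 \approx 6.67$, so the rule stops at the first $n$ with $n \ge 6.67$.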

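For problem (II), arguments as in Healy (1956) are available. A region we look at is

$$R_0(n) = \{\operatorname{vec}\xi \mid (\operatorname{vec}\hat\xi_0(n) - \operatorname{vec}\xi)'\{A_nA_n' \otimes B'\Sigma^{-1}B\}(\operatorname{vec}\hat\xi_0(n) - \operatorname{vec}\xi) \le f\},$$

where $f$ is the upper $100\alpha\%$ point of a chi-square distribution with $qr$ degrees of freedom, that is, $P[\chi^2_{qr} \le f] = 1 - \alpha$. Clearly, $P[\operatorname{vec}\xi \in R_0(n)] = 1 - \alpha$, and the maximum diameter of $R_0(n)$ is

$$2[f \cdot \lambda_1(\{A_nA_n' \otimes B'\Sigma^{-1}B\}^{-1})]^{1/2}, \quad \text{or} \quad 2[f\,\lambda_1((B'\Sigma^{-1}B)^{-1})/\lambda_r(A_nA_n')]^{1/2},$$

which gives that the condition (1.2) is represented by $\lambda_1((B'\Sigma^{-1}B)^{-1})f/\lambda_r(A_nA_n') \le d^2$. This inequality holds if we choose $n = n^{**}$ defined by

$$n^{**} = \text{smallest integer } n \ge m_0 \text{ such that } \lambda_r(A_nA_n') \ge \lambda_1((B'\Sigma^{-1}B)^{-1})f/d^2. \tag{1.6}$$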
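As a hedged numerical illustration of (1.6), the sketch below computes $n^{**}$ and the resulting maximum diameter. The names `fixed_sample_size_region`, `chi2_upper_point_df2` and the callback `design(i)` are assumptions made for this example. The chi-square point is computed in closed form only for $qr = 2$, where the distribution is exponential with mean 2; for other degrees of freedom $f$ would have to be supplied from tables or a statistics library.

```python
import math
import numpy as np

def chi2_upper_point_df2(alpha):
    # For 2 degrees of freedom, P[chi2 <= f] = 1 - exp(-f / 2),
    # so the upper 100*alpha% point is f = -2 log(alpha).
    return -2.0 * math.log(alpha)

def fixed_sample_size_region(design, B, Sigma, f, d, n_max=100_000):
    """Sketch of (1.6): returns (n**, maximum diameter of R_0(n**)).

    design(i) is a hypothetical callback returning the known r-vector a_i.
    """
    V = np.linalg.inv(B.T @ np.linalg.solve(Sigma, B))   # (B' Sigma^-1 B)^-1
    lam1 = np.linalg.eigvalsh(V)[-1]
    threshold = lam1 * f / d**2
    r = np.asarray(design(1), dtype=float).size
    AAt = np.zeros((r, r))
    for n in range(1, n_max + 1):
        a = np.asarray(design(n), dtype=float)
        AAt += np.outer(a, a)
        lam_r = np.linalg.eigvalsh(AAt)[0]
        if lam_r >= threshold:
            return n, 2.0 * math.sqrt(f * lam1 / lam_r)
    raise ValueError("n_max reached before the rule was satisfied")

# Usage: q = 2, r = 1 (so qr = 2), a_i = 1, B = I_2, Sigma = I_2.
f = chi2_upper_point_df2(0.05)
n_2star, diameter = fixed_sample_size_region(
    lambda i: [1.0], np.eye(2), np.eye(2), f, d=0.5)
```

Here the threshold is $f/d^2 \approx 23.97$, and the returned diameter stays below $2d = 1$.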

Therefore, $R_0(n^{**})$ is the region satisfying the requirements of problem (II).

When $\Sigma$ is unknown, however, no fixed sample size solves problem (I) or (II). It is therefore desired to obtain estimation procedures resolving (I) or (II). In Section 2, we construct a two-stage estimation rule for problem (I). It is also shown that its stopping number is asymptotically efficient in the sense of Chow and Robbins (1965). Section 3 gives fixed-width confidence regions with an asymptotically efficient stopping number for problem (II). The results obtained in Sections 2 and 3 are applied in Section 4 to two problems: (1) estimation of a multinormal mean and (2) estimation of a normal common mean.

2. Bounded covariance estimator

For a preassigned number $\varepsilon > 0$, we propose the following two-stage estimation rule which resolves problem (I).

(i) Start with $m$ observations $x_1, x_2, \ldots, x_m$ for $m \ge \max(m_0, p+r+2)$, each $x_i$ having $N_p(B\xi a_i, \Sigma)$.

(ii) Define the stopping number by

$$N = \text{smallest integer } n \ge m \text{ such that } \lambda_r(A_nA_n') \ge \frac{m-r-1}{\varepsilon(m-p-r-1)(m-p+q-r-1)}\,\lambda_1((B'S_m^{-1}B)^{-1}), \tag{2.1}$$

where $S_m = X_m(I - A_m'(A_mA_m')^{-1}A_m)X_m'$ with $X_m = (x_1, \ldots, x_m)$ and $A_m = (a_1, \ldots, a_m)$.

(iii) Take a sample of size $N - m$, and estimate $\xi$ by

$$\hat\xi_N = (B'S_m^{-1}B)^{-1}B'S_m^{-1}X_NA_N'(A_NA_N')^{-1} \tag{2.2}$$

for $X_N = (X_m, x_{m+1}, \ldots, x_N)$ and $A_N = (A_m, a_{m+1}, \ldots, a_N)$.
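Steps (i) to (iii) can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `two_stage_estimate` and the callbacks `sample(i)` (drawing the observation $x_i$) and `design(i)` (returning the known $a_i$) are hypothetical. Because the threshold in (2.1) depends only on the first-stage matrix $S_m$, the sketch draws second-stage observations one at a time until the rule is met, which yields the same $N$.

```python
import numpy as np

def two_stage_estimate(sample, design, B, eps, m, n_max=1_000_000):
    """Sketch of the two-stage rule (2.1)-(2.2); returns (N, xi_hat)."""
    p, q = B.shape
    A = np.column_stack([np.asarray(design(i), float) for i in range(1, m + 1)])
    X = np.column_stack([np.asarray(sample(i), float) for i in range(1, m + 1)])
    r = A.shape[0]
    assert m >= p + r + 2, "first-stage size must satisfy m >= p + r + 2"
    # S_m = X_m (I - A_m'(A_m A_m')^-1 A_m) X_m'
    S = X @ (np.eye(m) - A.T @ np.linalg.solve(A @ A.T, A)) @ X.T
    Sinv_B = np.linalg.solve(S, B)                     # S_m^-1 B
    V_hat = np.linalg.inv(B.T @ Sinv_B)                # (B' S_m^-1 B)^-1
    c_m = (m - r - 1) / ((m - p - r - 1) * (m - p + q - r - 1))
    threshold = c_m * np.linalg.eigvalsh(V_hat)[-1] / eps
    AAt, XAt, n = A @ A.T, X @ A.T, m
    # Second stage: extend the sample until lambda_r(A_n A_n') >= threshold.
    while np.linalg.eigvalsh(AAt)[0] < threshold:
        n += 1
        if n > n_max:
            raise ValueError("n_max reached before the rule was satisfied")
        a, x = np.asarray(design(n), float), np.asarray(sample(n), float)
        AAt += np.outer(a, a)
        XAt += np.outer(x, a)
    xi_hat = V_hat @ Sinv_B.T @ XAt @ np.linalg.inv(AAt)   # estimator (2.2)
    return n, xi_hat

# Usage: linear growth over p = 3 time points, two alternating groups (r = 2).
rng = np.random.default_rng(0)
B = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
xi = np.array([[1.0, 2.0], [0.5, 1.0]])
design = lambda i: np.eye(2)[(i - 1) % 2]
sample = lambda i: B @ xi @ design(i) + rng.standard_normal(3)
N, xi_hat = two_stage_estimate(sample, design, B, eps=0.05, m=10)
```

The usage data (the matrix `B`, the true `xi`, and identity covariance) are invented for the example.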

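For this rule we get

Theorem 2.1. For $\varepsilon > 0$ and the estimator $\hat\xi_N$ given by (2.2),

$$\operatorname{Cov}(\operatorname{vec}\hat\xi_N) \le \varepsilon I_{qr}. \tag{2.3}$$

Proof. Since $X_NA_N' = X_mA_m' + x_{m+1}a_{m+1}' + \cdots + x_Na_N'$, the independence of $X_mA_m'$ and $S_m$ gives that the conditional distribution of $X_NA_N'$ given $S_m$ has $N_{p,r}(B\xi A_NA_N'; \Sigma, A_NA_N')$. Hence

$$\operatorname{Cov}(\operatorname{vec}\hat\xi_N) = E[\operatorname{Cov}(\operatorname{vec}\hat\xi_N \mid S_m)] = E[(A_NA_N')^{-1} \otimes G(S_m, \Sigma)],$$

where

$$G(S_m, \Sigma) = (B'S_m^{-1}B)^{-1}B'S_m^{-1}\Sigma S_m^{-1}B(B'S_m^{-1}B)^{-1}$$

and $\operatorname{Cov}(\cdot \mid S_m)$ designates the conditional covariance matrix given $S_m$. Note that $G(S_m, \Sigma)$ is independent of $B'S_m^{-1}B$ by Lemma 2.1 of Sugiura and Kubokawa (1988). Then $G(S_m, \Sigma)$ is independent of $N$, which implies that

$$\operatorname{Cov}(\operatorname{vec}\hat\xi_N) = E[(A_NA_N')^{-1}] \otimes E[G(S_m, \Sigma)]. \tag{2.4}$$

Here, Rao (1967), Williams (1967), Gleser and Olkin (1972) and Sugiura and Kubokawa (1988) showed that

$$E[G(S_m, \Sigma)] = \frac{m-r-1}{m-p+q-r-1}(B'\Sigma^{-1}B)^{-1},$$

which, from (2.4), yields

$$\operatorname{Cov}(\operatorname{vec}\hat\xi_N) = \frac{m-r-1}{m-p+q-r-1}E[(A_NA_N')^{-1}] \otimes (B'\Sigma^{-1}B)^{-1}.$$

Hence, the inequality (2.3) is equivalent to

$$\sup_{t \in R^{qr}} \{E[t'\{(A_NA_N')^{-1} \otimes (B'\Sigma^{-1}B)^{-1}\}t]/t't\} \le \frac{\varepsilon(m-p+q-r-1)}{m-r-1}. \tag{2.5}$$

The l.h.s. of (2.5) is bounded above by

$$E\Big[\sup_{t \in R^{qr}} \{t'\{(A_NA_N')^{-1} \otimes (B'\Sigma^{-1}B)^{-1}\}t/t't\}\Big] = E[\lambda_1((A_NA_N')^{-1} \otimes (B'\Sigma^{-1}B)^{-1})] = E[\{\lambda_q(B'\Sigma^{-1}B)\lambda_r(A_NA_N')\}^{-1}]$$

$$\le \frac{\varepsilon(m-p-r-1)(m-p+q-r-1)}{m-r-1}E[\lambda_q(B'S_m^{-1}B)/\lambda_q(B'\Sigma^{-1}B)], \tag{2.6}$$

where $\lambda_q(\cdot)$ denotes the smallest characteristic root of a $q \times q$ matrix. The inequality in (2.6) follows from the definition of $N$ in (2.1). Since $E[B'S_m^{-1}B] = (m-p-r-1)^{-1}B'\Sigma^{-1}B$, Corollary 2.6 of Cacoullos and Olkin (1965) gives that

$$E[\lambda_q(B'S_m^{-1}B)] \le (m-p-r-1)^{-1}\lambda_q(B'\Sigma^{-1}B). \tag{2.7}$$

From (2.7), the r.h.s. of the inequality in (2.6) is not larger than $\varepsilon(m-p+q-r-1)/(m-r-1)$, which is equal to the r.h.s. in (2.5). Therefore the proof of Theorem 2.1 is complete.

Now we show the asymptotic efficiency of the stopping number in the sense of Chow and Robbins (1965). The method follows that of Mukhopadhyay (1980) for the univariate case.

Theorem 2.2. Assume that there exists an $r \times r$ positive definite matrix $\Omega$ such that

$$\lim_{n \to \infty} n^{-1}(A_nA_n') = \Omega. \tag{2.8}$$

Assume also that the initial sample size $m$ is defined by

$$m = \max\{m_0, p+r+2, [1/\sqrt{\varepsilon}]+1\},$$

where $[u]$ denotes the largest integer less than $u$. Then
(i) $\lim_{\varepsilon \to 0} N/n^* = 1$ a.s.,
(ii) $\lim_{\varepsilon \to 0} E[N]/n^* = 1$ (asymptotic efficiency),
where the stopping numbers $n^*$ and $N$ are defined in (1.5) and (2.1), respectively.

Proof.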

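For simplicity, denote $g(n) = n\lambda_1((A_nA_n')^{-1})$, which converges to $\lambda_1(\Omega^{-1})$ as $n$ tends to infinity. From the definition of $N$,

$$\frac{c(m)}{\varepsilon}g(N)\lambda_1((B'S_m^{-1}B)^{-1}) \le N \le \frac{c(m)}{\varepsilon}\frac{N}{N-1}g(N-1)\lambda_1((B'S_m^{-1}B)^{-1}) + m, \tag{2.9}$$

where $c(m) = (m-r-1)/\{(m-p-r-1)(m-p+q-r-1)\}$. On the other hand, from the definition of $n^*$,

$$\frac{1}{\varepsilon}g(n^*)\lambda_1((B'\Sigma^{-1}B)^{-1}) \le n^* \le \frac{1}{\varepsilon}\frac{n^*}{n^*-1}g(n^*-1)\lambda_1((B'\Sigma^{-1}B)^{-1}) + m_0. \tag{2.10}$$

(2.9) and (2.10) give that

$$c(m)g(N)\lambda_1((B'S_m^{-1}B)^{-1})\Big[\frac{n^*}{n^*-1}g(n^*-1)\lambda_1((B'\Sigma^{-1}B)^{-1}) + \varepsilon m_0\Big]^{-1} \le \frac{N}{n^*} \le \Big[c(m)\frac{N}{N-1}g(N-1)\lambda_1((B'S_m^{-1}B)^{-1}) + \varepsilon m\Big]\cdot[g(n^*)\lambda_1((B'\Sigma^{-1}B)^{-1})]^{-1}. \tag{2.11}$$

Here, we note that if $\varepsilon$ tends to zero, then $m \to \infty$, $\varepsilon m \to 0$, $m\,c(m) \to 1$, $n^* \to \infty$, $N \to \infty$ a.s., and $\lambda_1((B'S_m^{-1}B)^{-1})/m \to \lambda_1((B'\Sigma^{-1}B)^{-1})$ a.s. Hence it is clear that both sides of $N/n^*$ in (2.11) almost surely converge to 1, and we get part (i). Similarly, we can prove part (ii) if the following hold:

$$\lim_{\varepsilon \to 0} E[g(N)\lambda_1((B'S_m^{-1}B)^{-1})/m] = \lambda_1(\Omega^{-1})\lambda_1((B'\Sigma^{-1}B)^{-1}), \tag{2.12}$$

$$\lim_{\varepsilon \to 0} E\Big[\frac{N}{N-1}g(N-1)\lambda_1((B'S_m^{-1}B)^{-1})/m\Big] = \lambda_1(\Omega^{-1})\lambda_1((B'\Sigma^{-1}B)^{-1}). \tag{2.13}$$

For (2.12), it must be shown that $g(N)\lambda_1((B'S_m^{-1}B)^{-1})/m$ is bounded above by an integrable function which is independent of $m$. Since $\lim_{n \to \infty} g(n) = \lambda_1(\Omega^{-1})$, there is some $n_0$ such that for any $n > n_0$, $g(n) < \lambda_1(\Omega^{-1}) + 1$. Hence for $N > n_0$, $g(N) < \lambda_1(\Omega^{-1}) + 1$, so that for any $N \ge 1$,

$$g(N) \le [\ldots] = c_0 \ \text{(say)}. \tag{2.14}$$

For an orthogonal matrix $P$ such that $S_m/m = P'\operatorname{diag}(l_1, \ldots, l_p)P$ with $l_1 \ge \cdots \ge l_p > 0$, we can see that

$$\lambda_1((B'S_m^{-1}B)^{-1})/m = \Big[\inf_u\{u'B'(mS_m^{-1})Bu/u'u\}\Big]^{-1} \le \Big[\frac{1}{l_1}\inf_u\{u'B'Bu/u'u\}\Big]^{-1} = c_1 l_1 \le c_1 \sup_{m \ge 2}\operatorname{tr}(S_m/m), \tag{2.15}$$

where $c_1 = \{\inf_u u'B'Bu/u'u\}^{-1}$. Hence from (2.14) and (2.15),

$$g(N)\lambda_1((B'S_m^{-1}B)^{-1})/m \le c_0c_1\sup_{m \ge 2}\operatorname{tr}(S_m/m). \tag{2.16}$$

Since $E[\sup_{m \ge 2}\operatorname{tr}(S_m/m)] < \infty$ by Takada (1988), the above inequality (2.16) [...] by an integrable function which is [...] dominated convergence theorem and the [...] (2.12). Similarly, we can verify (2.13) and [...] 2.2.

3. Fixed-width confidence region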

For preassigned numbers $0 < \alpha < 1$ and $d > 0$, we propose the following two-stage procedure which solves problem (II).

(i) Start with $m$ observations $x_1, \ldots, x_m$ for $m \ge \max(m_0, p+r+1)$, each $x_i$ having $N_p(B\xi a_i, \Sigma)$.

(ii) Define the stopping number by

$$N = \text{smallest integer } n \ge m \text{ such that } \lambda_r(A_nA_n') \ge \lambda_1((B'S_m^{-1}B)^{-1})f_m/d^2, \tag{3.1}$$

where $f_m$ is a constant satisfying

$$P[\chi^2_{qr}/\lambda_p(W) \le f_m] = 1 - \alpha \tag{3.2}$$

for mutually independent random variables $\chi^2_{qr}$ and $W$, $W$ having $W_p(I, m-r)$.

(iii) Take a sample of size $N - m$ and construct a confidence region of the form

$$R_N = \{\operatorname{vec}\xi \mid (\operatorname{vec}\hat\xi_N - \operatorname{vec}\xi)'\{A_NA_N' \otimes B'S_m^{-1}B\}(\operatorname{vec}\hat\xi_N - \operatorname{vec}\xi) \le f_m\}. \tag{3.3}$$
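The constant $f_m$ in (3.2) has no simple closed form, so one way to get a feel for it is simulation. The sketch below is an assumption-laden illustration, not a method proposed in the paper: it approximates $f_m$ by Monte Carlo using NumPy, and then evaluates the rule (3.1) in the special case $A_nA_n' = n$ (that is, $r = 1$ and $a_i = 1$). The function names are hypothetical.

```python
import math
import numpy as np

def critical_value_fm(p, q, r, m, alpha, n_sim=20_000, seed=0):
    """Monte Carlo approximation of f_m in (3.2):
    P[chi2_{qr} / lambda_p(W) <= f_m] = 1 - alpha, with W ~ W_p(I, m - r)."""
    rng = np.random.default_rng(seed)
    chi2 = rng.chisquare(q * r, size=n_sim)
    Z = rng.standard_normal((n_sim, p, m - r))
    W = Z @ Z.transpose(0, 2, 1)              # n_sim independent Wishart draws
    lam_p = np.linalg.eigvalsh(W)[:, 0]       # smallest characteristic roots
    return float(np.quantile(chi2 / lam_p, 1.0 - alpha))

def stopping_number_unit_design(lam1_V_hat, f_m, d, m):
    """Rule (3.1) when A_n A_n' = n: smallest n >= m with
    n >= lambda_1((B' S_m^-1 B)^-1) f_m / d^2."""
    return max(m, math.ceil(lam1_V_hat * f_m / d**2))

# Usage: p = 2, q = r = 1, first-stage size m = 50, alpha = 0.05.
f_m = critical_value_fm(p=2, q=1, r=1, m=50, alpha=0.05)
N_example = stopping_number_unit_design(3.0, 0.5, 0.25, m=10)
```

The simulated value carries Monte Carlo error, so it should be read as an approximation whose accuracy depends on `n_sim`.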

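For this procedure we get:

Theorem 3.1. For any $0 < \alpha < 1$ and any $d > 0$, the maximum diameter of $R_N$ given by (3.3) does not exceed $2d$ and

$$P[\operatorname{vec}\xi \in R_N] \ge 1 - \alpha. \tag{3.4}$$

Proof. Since the maximum diameter of $u'Au = 1$ is $2[\lambda_1(A^{-1})]^{1/2}$, the maximum diameter of $R_N$ is $2[\lambda_1((A_NA_N')^{-1})\lambda_1((B'S_m^{-1}B)^{-1})f_m]^{1/2}$. From the definition of $N$ given by (3.1), this maximum diameter is bounded above by $2[\lambda_1((A_NA_N')^{-1}) \times \{\lambda_r(A_NA_N')d^2/f_m\}f_m]^{1/2}$, which is equal to $2d$. Hence the first part of Theorem 3.1 holds. To show (3.4), we set

$$U = (B'S_m^{-1}\Sigma S_m^{-1}B)^{-1/2}(B'S_m^{-1}B)(\hat\xi_N - \xi)(A_NA_N')^{1/2}.$$

The conditional distribution of $U$ given $S_m$ has $N_{q,r}(0; I_q, I_r)$, which is independent of $S_m$. Then,

$$(\operatorname{vec}\hat\xi_N - \operatorname{vec}\xi)'\{A_NA_N' \otimes B'S_m^{-1}B\}(\operatorname{vec}\hat\xi_N - \operatorname{vec}\xi)$$
$$= \operatorname{tr}(A_NA_N')^{1/2}(\hat\xi_N - \xi)'B'S_m^{-1}B(\hat\xi_N - \xi)(A_NA_N')^{1/2}$$
$$= \operatorname{tr}UU'(B'S_m^{-1}\Sigma S_m^{-1}B)^{1/2}(B'S_m^{-1}B)^{-1}(B'S_m^{-1}\Sigma S_m^{-1}B)^{1/2}$$
$$= \operatorname{tr}UU'(C'C)^{1/2}(C'\operatorname{diag}(\lambda_1(W), \ldots, \lambda_p(W))C)^{-1}(C'C)^{1/2}$$
$$\le \operatorname{tr}UU'(C'C)^{1/2}(C'\operatorname{diag}(\lambda_p(W), \ldots, \lambda_p(W))C)^{-1}(C'C)^{1/2}$$
$$= \operatorname{tr}UU'/\lambda_p(W), \tag{3.5}$$

where $W = \Sigma^{-1/2}S_m\Sigma^{-1/2}$ and $C = \operatorname{diag}(\lambda_1(W^{-1}), \ldots, \lambda_p(W^{-1}))P\Sigma^{-1/2}B$ for an orthogonal matrix $P$ such that $PW^{-1}P' = \operatorname{diag}(\lambda_1(W^{-1}), \ldots, \lambda_p(W^{-1}))$. Here the inequality in (3.5) follows from the fact that $\operatorname{tr}AB \le \operatorname{tr}AC$ for $A \ge 0$ and $B \le C$. Since $\operatorname{tr}UU'$ has a chi-square distribution with $qr$ degrees of freedom, the definition of $f_m$ gives

$$P[\operatorname{vec}\xi \in R_N] \ge P[\chi^2_{qr}/\lambda_p(W) \le f_m] = 1 - \alpha,$$

which establishes (3.4), and Theorem 3.1 is proved.

Next we can see that $mf_m$ converges to $f$ as $m$ tends to infinity since $\lim_{m \to \infty}\lambda_p(W/m) = 1$ a.s. Then the same arguments as in the proof of Theorem 2.2 give the asymptotic efficiency of the stopping number.

Theorem 3.2. Assume that there exists an $r \times r$ positive definite matrix $\Omega$ such that (2.8) holds. If the initial sample size $m$ is defined by

$$m = \max\{m_0, p+r+1, [1/d]+1\},$$

then
(i) $\lim_{d \to 0} N/n^{**} = 1$ a.s.,
(ii) $\lim_{d \to 0} E[N]/n^{**} = 1$ (asymptotic efficiency),
where the stopping numbers $n^{**}$ and $N$ are defined by (1.6) and (3.1), respectively.

Although the stopping number $N$ given by (3.1) is asymptotically efficient, it may not be easy to find the critical value $f_m$ in (3.2) because it is determined through the density of the smallest characteristic root $\lambda_p(W)$. So, we suggest another stopping rule of the form

$$M = \text{smallest integer } n \ge m \text{ such that } \lambda_r(A_nA_n') \ge \lambda_1((B'S_m^{-1}B)^{-1})f_m^*/d^2, \tag{3.6}$$

where $f_m^*$ satisfies $P[\operatorname{tr}TW^{-1} \le f_m^*] = 1 - \alpha$ for mutually independent random matrices $T$ and $W$ having $W_p(I, r)$ and $W_p(I, m-r)$, respectively [...] The corresponding confidence region is

$$R_M^* = \{\operatorname{vec}\xi \mid (\operatorname{vec}\hat\xi_M - \operatorname{vec}\xi)'\{A_MA_M' \otimes B'S_m^{-1}B\}(\operatorname{vec}\hat\xi_M - \operatorname{vec}\xi) \le f_m^*\}. \tag{3.7}$$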

By Corollaries 10.4.3 and 10.4.6 of Muirhead (1982), it is noted that $f_m^*$ is obtainable by computing the root of the following equations, one for the case $r \ge p$ and one for the case $r < p$:

[...]

Each equation sets the integral of a joint density of ordered latent roots $l_1 > l_2 > \cdots > 0$ over the region $\sum_i l_i \le f_m^*$ equal to $1 - \alpha$, where

$$\Gamma_p(x) = \pi^{p(p-1)/4}\prod_{i=1}^{p}\Gamma(x - (i-1)/2).$$

Especially for $r = 1$, the constant $f_m^*$ is simply derived from

$$P[pF_1/(m-p) \le f_m^*] = 1 - \alpha$$

for a random variable $F_1$ having the F-distribution with $(p, m-p)$ degrees of freedom.
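Where solving the integral equations is inconvenient, $f_m^*$ can be approximated by simulating $\operatorname{tr}TW^{-1}$ directly. The following is a minimal Monte Carlo sketch using NumPy; it is an illustration under assumptions (the function name, the simulation size and the seed are all hypothetical choices), not the computation proposed in the paper. For $r = 1$ and $p = 2$ the result can be compared with a closed form, since $\operatorname{tr}TW^{-1}$ is then distributed as $p/(m-p)$ times an $F(p, m-p)$ variable, whose upper point for $p = 2$ solves $(1 + f)^{-(m-2)/2} = \alpha$.

```python
import numpy as np

def critical_value_fm_star(p, r, m, alpha, n_sim=20_000, seed=0):
    """Monte Carlo approximation of f_m^* with P[tr(T W^-1) <= f_m^*] = 1 - alpha,
    where T ~ W_p(I, r) and W ~ W_p(I, m - r) are independent."""
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((n_sim, p, r))          # T = Y Y'
    Z = rng.standard_normal((n_sim, p, m - r))      # W = Z Z'
    W = Z @ Z.transpose(0, 2, 1)
    W_inv_Y = np.linalg.solve(W, Y)                 # batched solve: W^-1 Y
    # tr(T W^-1) = tr(Y' W^-1 Y), computed per simulated draw
    stat = np.einsum("nij,nij->n", Y, W_inv_Y)
    return float(np.quantile(stat, 1.0 - alpha))

# Usage: r = 1, p = 2, m = 30, alpha = 0.05.
f_star_sim = critical_value_fm_star(p=2, r=1, m=30, alpha=0.05)
f_star_exact = 0.05 ** (-2.0 / (30 - 2)) - 1.0     # closed form valid for p = 2, r = 1
```

The two values should agree up to Monte Carlo error, which shrinks as `n_sim` grows.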

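Theorem 3.3. For any $0 < \alpha < 1$ and any $d > 0$, the maximum diameter of $R_M^*$ of (3.7) does not exceed $2d$ and

$$P[\operatorname{vec}\xi \in R_M^*] \ge 1 - \alpha. \tag{3.8}$$

Proof. Similar to the proof of Theorem 3.1, it can be shown that the maximum diameter of $R_M^*$ does not exceed $2d$. For (3.8), we first note that $I_r \otimes \{S_m^{-1/2}B(B'S_m^{-1}B)^{-1}B'S_m^{-1/2}\}$ is idempotent, which implies that for any vector $u$,

$$u'u \ge u'\{I_r \otimes S_m^{-1/2}B(B'S_m^{-1}B)^{-1}B'S_m^{-1/2}\}u.$$

Especially for $u = \operatorname{vec}(S_m^{-1/2}X_MA_M'(A_MA_M')^{-1/2})$,

$$\{\operatorname{vec}(X_MA_M')\}'\{(A_MA_M')^{-1} \otimes S_m^{-1}\}\operatorname{vec}(X_MA_M') \ge \{\operatorname{vec}(X_MA_M')\}'\{(A_MA_M')^{-1} \otimes S_m^{-1}B(B'S_m^{-1}B)^{-1}B'S_m^{-1}\}\operatorname{vec}(X_MA_M').$$

Using this inequality, we have that

$$(\operatorname{vec}\hat\xi_M - \operatorname{vec}\xi)'\{A_MA_M' \otimes B'S_m^{-1}B\}(\operatorname{vec}\hat\xi_M - \operatorname{vec}\xi)$$
$$\le (\operatorname{vec}\hat\xi_M - \operatorname{vec}\xi)'\{A_MA_M' \otimes B'S_m^{-1}B\}(\operatorname{vec}\hat\xi_M - \operatorname{vec}\xi) + \{\operatorname{vec}(X_MA_M')\}'\{(A_MA_M')^{-1} \otimes S_m^{-1}\}\operatorname{vec}(X_MA_M') - \{\operatorname{vec}(X_MA_M')\}'\{(A_MA_M')^{-1} \otimes S_m^{-1}B(B'S_m^{-1}B)^{-1}B'S_m^{-1}\}\operatorname{vec}(X_MA_M')$$
$$= \{\operatorname{vec}(X_MA_M' - B\xi A_MA_M')\}'\{(A_MA_M')^{-1} \otimes S_m^{-1}\}\operatorname{vec}(X_MA_M' - B\xi A_MA_M')$$
$$= \operatorname{tr}(A_MA_M')^{-1/2}(X_MA_M' - B\xi A_MA_M')'S_m^{-1}(X_MA_M' - B\xi A_MA_M')(A_MA_M')^{-1/2}$$
$$= \operatorname{tr}TW^{-1},$$

where $W = \Sigma^{-1/2}S_m\Sigma^{-1/2}$ and

$$T = \Sigma^{-1/2}(X_MA_M' - B\xi A_MA_M')(A_MA_M')^{-1/2} \cdot \{\Sigma^{-1/2}(X_MA_M' - B\xi A_MA_M')(A_MA_M')^{-1/2}\}'.$$

The random matrices $T$ and $W$ are mutually independently distributed as $W_p(I, r)$ and $W_p(I, m-r)$, respectively. Hence, $P[\operatorname{vec}\xi \in R_M^*] \ge P[\operatorname{tr}TW^{-1} \le f_m^*] = 1 - \alpha$, giving the desired result.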

Asymptotic properties of the stopping number $M$ in (3.6) can be shown by noting that $\lim_{m \to \infty} mf_m^* = f^*$ where $P[\chi^2_{pr} \le f^*] = 1 - \alpha$.

Theorem 3.4. Under the same assumptions as in Theorem 3.2,
(i) $\lim_{d \to 0} M/n^{**} = f^*/f \ge 1$ a.s.,
(ii) $\lim_{d \to 0} E[M]/n^{**} = f^*/f \ge 1$,
where the stopping numbers $n^{**}$ and $M$ are defined by (1.6) and (3.6), respectively.

For $p = q$, the critical value $f^*$ is equal to $f$. In other words, $M$ is asymptotically efficient when $p = q$, which is the case of an ordinary multivariate regression model.

4. Examples

We now present special examples of the results given in Sections 2 and 3.

Example 4.1 (Multivariate normal mean). Set $q = p$, $r = 1$, $B = I$ and $a_i = 1$ for $i = 1, 2, \ldots$. Then the random variable $x_i$ defined in the first paragraph of Section 1 has $N_p(\xi, \Sigma)$, and $A_n = (1, \ldots, 1)$, so that $A_nA_n' = n$. For problem (I), the stopping number given by (2.1) is written as

$$N = \max\{m, [\lambda_1(S_m)/\{\varepsilon(m-p-2)\}]+1\}, \quad m \ge p+3,$$

where $[u]$ denotes the largest integer being smaller than $u$, and

$$S_m = \sum_{i=1}^{m}(x_i - \bar x_m)(x_i - \bar x_m)' \quad \text{with } \bar x_m = \sum_{i=1}^{m}x_i/m.$$

From (2.2) and Theorems 2.1 and 2.2, it follows that $\operatorname{Cov}(\bar x_N) \le \varepsilon I_p$ for $\bar x_N = \sum_{i=1}^{N}x_i/N$, and that $N$ is asymptotically efficient. For problem (II), the stopping number (3.6) is expressed as

$$M = \max\{m, [\lambda_1(S_m)f_m^*/d^2]+1\}, \quad m \ge p+2,$$

where, in this case, $f_m^*$ satisfies $P[pF_1/(m-p) \le f_m^*] = 1 - \alpha$ for $F_1$ having the F-distribution with $(p, m-p)$ degrees of freedom. The confidence region for $\xi$ is

$$R_M^* = \{\xi \mid M(\bar x_M - \xi)'S_m^{-1}(\bar x_M - \xi) \le f_m^*\},$$

and from Theorem 3.3, the region $R_M^*$ is the solution of problem (II), which was obtained by Healy (1956). Also Theorem 3.4 gives the asymptotic efficiency of $M$, which was derived by Takada (1988).

Example 4.2 (Common mean). Set $q = 1$, $r = 1$, $B = e = (1, \ldots, 1)'$, $\xi = \mu$, $a_i = 1$ for $i = 1, 2, \ldots$. Then the random variable $x_i$ in Section 1 has $N_p(e\mu, \Sigma)$. For problem (I), the stopping number (2.1) is

$$N = \max\Big\{m, \Big[\frac{m-2}{\varepsilon(m-p-2)(m-p-1)\,e'S_m^{-1}e}\Big]+1\Big\}, \quad m \ge p+3.$$

The stopping number $N$ is asymptotically efficient by Theorem 2.2. From (2.2) and Theorem 2.1, it follows that $\operatorname{Var}(\hat\mu_N) \le \varepsilon$ for $\hat\mu_N = e'S_m^{-1}\bar x_N/e'S_m^{-1}e$. To solve problem (II), Theorem 3.1 is useful and the critical value $f_m$ can be determined by $P[\chi^2_1/\lambda_p(W) \le f_m] = 1 - \alpha$. Based on this value, the stopping number of (3.1) is

$$N = \max\{m, [f_m/\{d^2e'S_m^{-1}e\}]+1\}, \quad m \ge p+2,$$

which is asymptotically efficient by Theorem 3.2. The confidence region (3.3) is represented by

$$R_N = \{\mu \mid Ne'S_m^{-1}e(\hat\mu_N - \mu)^2 \le f_m\},$$

and from Theorem 3.1, $R_N$ satisfies the requirement of problem (II).
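The common mean case of Example 4.2 for problem (I) is simple enough to sketch end to end. The code below is a hedged illustration: the function name `common_mean_two_stage` and the callback `sample(n)`, which returns `n` fresh observations as an `(n, p)` array, are assumptions made for this example, and the stopping number is obtained by specializing (2.1) to $q = r = 1$, $B = e$, $a_i = 1$.

```python
import math
import numpy as np

def common_mean_two_stage(sample, p, eps, m):
    """Two-stage common-mean estimate: returns (N, mu_hat).

    sample(n) is a hypothetical callback returning an (n, p) array of
    new observations, each with mean e*mu.
    """
    assert m >= p + 3, "first-stage size must satisfy m >= p + 3"
    X1 = sample(m)
    centered = X1 - X1.mean(axis=0)
    S = centered.T @ centered                      # S_m
    e = np.ones(p)
    w = np.linalg.solve(S, e)                      # S_m^-1 e
    eSe = float(e @ w)                             # e' S_m^-1 e
    c_m = (m - 2) / ((m - p - 2) * (m - p - 1))    # c(m) with q = r = 1
    # Smallest integer n >= m with n >= c(m) / (eps * e' S_m^-1 e)
    N = max(m, math.ceil(c_m / (eps * eSe)))
    X2 = sample(N - m) if N > m else np.empty((0, p))
    xbar_N = np.vstack([X1, X2]).mean(axis=0)
    mu_hat = float(w @ xbar_N) / eSe               # e' S_m^-1 xbar_N / e' S_m^-1 e
    return N, mu_hat

# Usage with invented data: p = 3 measurements sharing the mean mu = 2.
rng = np.random.default_rng(1)
draw = lambda n: 2.0 + rng.standard_normal((n, 3))
N_cm, mu_hat = common_mean_two_stage(draw, p=3, eps=0.01, m=12)
```

Note that the weights `w` are fixed after the first stage, so only the sample mean changes when the second-stage observations arrive.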

Acknowledgement

The author is grateful to Professor N. Sugiura for his helpful advice.

References

Cacoullos, T. and I. Olkin (1965). On the bias of functions of characteristic roots of a random matrix. Biometrika 52, 81-94.
Chow, Y.S. and H. Robbins (1965). On the asymptotic theory of fixed width sequential confidence intervals for the mean. Ann. Math. Statist. 36, 457-462.
Ghosh, M. and N. Mukhopadhyay (1976). On two fundamental problems of sequential estimation. Sankhyā A 38, 203-218.
Gleser, L.J. and I. Olkin (1972). Estimation for a regression model with an unknown covariance matrix. Proc. Sixth Berkeley Symp. Math. Statist. Probab. Vol. 1, 541-568.
Healy, W.C. Jr. (1956). Two-sample procedures in simultaneous estimation. Ann. Math. Statist. 27, 687-702.
Lee, Y.K. (1974). A note on Rao's reduction of Potthoff and Roy's generalized linear model. Biometrika 61, 349-351.
Muirhead, R.J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York.
Mukhopadhyay, N. (1980). A consistent and asymptotically efficient two-stage procedure to construct fixed width confidence intervals for the mean. Metrika 27, 281-284.
Potthoff, R.F. and S.N. Roy (1964). A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51, 313-326.
Rao, C.R. (1967). Least squares theory using an estimated dispersion matrix and its application to measurement of signals. Proc. Fifth Berkeley Symp. Math. Statist. Probab. Vol. 1, 355-372.
Rao, C.R. (1973). Linear Statistical Inference and Its Applications, 2nd ed. Wiley, New York.
Stein, C. (1945). A two-sample test for a linear hypothesis whose power is independent of the variance. Ann. Math. Statist. 16, 245-258.
Sugiura, N. and T. Kubokawa (1988). Estimating common parameters of growth curve models. Ann. Inst. Statist. Math. 40, 119-135.
Takada, Y. (1988). Two-stage procedures for a multivariate normal distribution. Kumamoto J. Math. 1, 1-8.
Williams, J.S. (1967). The variance of weighted regression estimators. J. Amer. Statist. Assoc. 62, 1290-1301.