1999, 19(5): 555-563
Acta Mathematica Scientia

ASSESSMENT OF LOCAL INFLUENCE IN A GROWTH CURVE MODEL WITH RAO'S SIMPLE COVARIANCE STRUCTURE ¹

Bai Peng
Department of Statistics, Yunnan University, Kunming 650091, China
Institute of Applied Mathematics, Yunnan Province, Kunming 650091, China

Abstract The present paper deals with the problem of assessing local influence in a growth curve model with Rao's simple covariance structure. Based on the likelihood displacement, the curvature measure is employed to evaluate the effects of minor perturbations on the statistical inference. This leads to the direction of largest curvature, which is the most critical diagnostic statistic in the context of local influence analysis. As an application, the common covariance-weighted perturbation scheme is considered in detail.
Key words Growth curve model, Rao's simple covariance structure, $\omega$-model, likelihood displacement, curvature
1991 MR Subject Classification 62H
1 Introduction
It is well known that influence analysis studies how to evaluate the effects of minor perturbations in some factors of a statistical model on the statistical inference. Hampel[1] reduced such a perturbation to a change of the relevant distribution function from $F$ to $F + \Delta F$, and thereby obtained a series of essential statistics such as the influence function. Furthermore, if the perturbation is explicit, regardless of the distribution function, we usually start from the likelihood function, the most important concept in statistics, to describe concretely the influence of all kinds of perturbation schemes on the statistical inference. Cook[2] introduced the concept of curvature measure into local influence analysis, which showed that differential geometry had permeated into some aspects of statistics. In contrast to Cook's model, the model considered in this paper is more complicated; it can be expressed as follows:
$$Y = XBZ^T + E, \tag{1.1}$$
where $B$ is a $p \times m$ unknown parameter matrix, $X$ is a $q \times p$ known design matrix, $Y$ is a $q \times n$ response matrix, $Z$ is an $n \times m$ known constant matrix, and $\mathrm{rk}(X) = p$ and $\mathrm{rk}(Z) = m$. Further, the $n$ columns of the $q \times n$ error matrix $E$ are assumed to be independent $q$-variate normal vectors with expectation vector $0$ and common unknown covariance matrix $\Sigma > 0$, i.e.
$$E \sim N_{q \times n}(0, I_n \otimes \Sigma), \tag{1.2}$$
where $A^T$, $\mathrm{rk}(A)$ and $A_1 \otimes A_2$ denote the transpose of $A$, the rank of $A$ and the Kronecker product of two matrices $A_1$ and $A_2$, respectively. The model (1.1) was first proposed by Potthoff and Roy[3] and is referred to as a growth curve model; it is extensively applied in biostatistics, medical research and other fields. Under the normal assumption, the maximum likelihood estimate (abbr. MLE) of the unknown parameter matrix $B$ is, in general, a nonlinear function of the response matrix $Y$, i.e., there are genuine differences between the growth curve model and the ordinary linear regression model. This is why it is necessary to study the growth curve model thoroughly so as to reveal its particularities. Kariya[4] proved that the following Rao's simple covariance structure (abbr. RSS):
$$\Sigma = X\Gamma X^T + W\Theta W^T \tag{1.3}$$
unifies several common estimates of the regression coefficient matrix $B$, where $\Gamma > 0$ and $\Theta > 0$ are $p \times p$ and $(q-p) \times (q-p)$ unknown parameter matrices, respectively, and $W$ is a $q \times (q-p)$ known design matrix such that $X^TW = 0$ and $\mathrm{rk}(W) = q - p$. Therefore, among all special structures of the unknown covariance matrix $\Sigma$, RSS (1.3) deserves special attention.

¹ Received Jul. 2, 1997; revised Aug. 21, 1998.
For the growth curve model (1.1), under the normal assumption (1.2) and RSS (1.3), we first recall in this paper the definition of the $\omega$-model and establish the curvature measures for assessing the influence of a general perturbation on the unknown parameter matrices $B$, $\Gamma$ and $\Theta$. Second, the above results are applied to the common covariance-weighted perturbation scheme.
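To make the setting concrete, the following is a minimal sketch that simulates one data set from model (1.1) with covariance structure (1.3). The dimensions, the random seed, the function name `simulate_growth_curve`, and the way $\Gamma$ and $\Theta$ are generated are illustrative assumptions, not part of the paper; $W$ is taken as an orthonormal basis of the orthogonal complement of the column space of $X$, which is one admissible choice satisfying $X^TW = 0$ and $\mathrm{rk}(W) = q - p$.

```python
import numpy as np

def simulate_growth_curve(q=4, p=2, n=12, m=2, seed=0):
    """Draw one data set from model (1.1) with covariance structure (1.3)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((q, p))   # q x p design matrix, rank p almost surely
    Z = rng.standard_normal((n, m))   # n x m constant matrix, rank m almost surely
    B = rng.standard_normal((p, m))   # p x m coefficient matrix
    # W spans the orthogonal complement of the column space of X, so X^T W = 0.
    Q, _ = np.linalg.qr(X, mode="complete")
    W = Q[:, p:]
    # Arbitrary positive definite Gamma (p x p) and Theta ((q-p) x (q-p)).
    A = rng.standard_normal((p, p))
    C = rng.standard_normal((q - p, q - p))
    Gamma = A @ A.T + np.eye(p)
    Theta = C @ C.T + np.eye(q - p)
    Sigma = X @ Gamma @ X.T + W @ Theta @ W.T        # RSS (1.3)
    # Columns of E are independent N_q(0, Sigma) vectors, as in (1.2).
    E = np.linalg.cholesky(Sigma) @ rng.standard_normal((q, n))
    Y = X @ B @ Z.T + E                              # model (1.1)
    return {"X": X, "Z": Z, "W": W, "B": B, "Gamma": Gamma,
            "Theta": Theta, "Sigma": Sigma, "Y": Y}

data = simulate_growth_curve()
```

The Cholesky factorization succeeds only if $\Sigma$ is positive definite, so the sketch also serves as a quick check that (1.3) with $\Gamma > 0$ and $\Theta > 0$ yields a valid covariance matrix.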
2 Definitions and Lemmas
For the sake of conciseness, we introduce some definitions and lemmas here.
Definition 2.1 (Wei Bocheng et al.[5]) A perturbation model satisfying the following conditions is referred to as an $\omega$-model.
a. Suppose that the log-likelihood function of the random matrix $Y$ corresponding to the postulated model $M$ is denoted by $L(\theta)$, where $\theta \in \Theta$ is an unknown distribution parameter vector and $\Theta \subset R^r$ is open.
b. Suppose that $\omega = (\omega_1, \omega_2, \dots, \omega_t)^T \in \Omega$ is a vector describing the perturbation factors, where $\Omega$ is an open subset of $R^t$. Let $M(\omega)$ stand for the perturbed model and $L(\theta|\omega)$ be the log-likelihood function of $Y$ corresponding to $M(\omega)$. Further, assume that $L(\theta|\omega)$ has continuous second-order partial derivatives on $\Theta \times \Omega$.
c. There is a point $\omega_0 \in \Omega$ such that $M(\omega_0) = M$, thus $L(\theta|\omega_0) = L(\theta)$ for all $\theta \in \Theta$.
d. Suppose that $\hat\theta$ and $\hat\theta(\omega)$ are the MLEs of $\theta \in \Theta$ corresponding to $M$ and $M(\omega)$, respectively, so that $\hat\theta(\omega_0) = \hat\theta$.
For the $\omega$-model in Definition 2.1, Cook[2] proposed the following definition.
Definition 2.2 If $\theta$ can be partitioned as $\theta = (\theta_1^T, \theta_2^T)^T \in \Theta \subset R^r$, where $\theta_1 \in R^{r_1}$ is the parameter of interest and $\theta_2 \in R^{r-r_1}$ is a nuisance parameter, then the likelihood displacement on the parameter vector of interest $\theta_1$ is defined as follows:
$$LD_s(\omega) = 2[L(\hat\theta) - L(\tilde\theta(\omega))], \tag{2.1}$$
where $\hat\theta(\omega) = (\hat\theta_1(\omega)^T, \hat\theta_2(\omega)^T)^T$ ($\hat\theta_1(\omega) \in R^{r_1}$, $\hat\theta_2(\omega) \in R^{r-r_1}$), $\hat\theta = \hat\theta(\omega_0) = (\hat\theta_1^T, \hat\theta_2^T)^T$, $\tilde\theta(\omega) = (\hat\theta_1(\omega)^T, \tilde\theta_2(\hat\theta_1(\omega))^T)^T$ and $\tilde\theta_2(\theta_1)$ is the MLE of $\theta_2$ with $\theta_1$ given corresponding to the postulated model $M$, i.e.
$$L((\theta_1^T, \tilde\theta_2(\theta_1)^T)^T) = \max_{\theta_2} L((\theta_1^T, \theta_2^T)^T). \tag{2.2}$$
It is clear that the graph of the likelihood displacement function $z = LD_s(\omega)$ versus $\omega$ (the so-called influence graph) contains the essential information about the influence of the minor perturbation scheme on the inference about $\theta_1$. It can be shown that $z = LD_s(\omega)$ attains its minimum value $0$ at $\omega_0$ and that its first derivatives along every direction at $\omega_0$ vanish[5]. Therefore, we choose its second derivatives along every direction evaluated at $\omega_0$, i.e., its curvatures along every direction evaluated at $\omega_0$, to measure the sensitivity to the minor perturbation scheme. According to Definition 2.2, we easily establish the following lemma.
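Lemma 2.1 The curvature along the direction $d \in R^t$ evaluated at $\omega_0$ of the influence graph $z = LD_s(\omega)$ in Definition 2.2 can be expressed as follows:
$$C_d = 2|d^T F d|, \tag{2.3}$$
where
$$F = G^T \Delta^T \ddot L \Delta G, \quad G = \left.\frac{\partial \hat\theta_1(\omega)}{\partial \omega^T}\right|_{\omega=\omega_0}, \quad \Delta = \begin{pmatrix} I_{r_1} \\ \left.\dfrac{\partial \tilde\theta_2(\theta_1)}{\partial \theta_1^T}\right|_{\theta_1=\hat\theta_1} \end{pmatrix}$$
and $\ddot L = -\left.\dfrac{\partial^2 L(\theta)}{\partial\theta\,\partial\theta^T}\right|_{\theta=\hat\theta}$.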
The above lemma shows that, as a function of the direction $d \in R^t$, $C_d$ is continuous on the closed subset $D = \{d \in R^t : d^Td = 1\}$ of $R^t$; hence $C_d$ attains its maximum value $C_{\max} = \max_{d \in D} C_d$ in $D$. The direction $d_{\max} \in D$ corresponding to $C_{\max}$ is the unit eigenvector of $F$ corresponding to its eigenvalue of largest absolute value (so that $C_{\max}$ is twice that absolute value); it indicates how to perturb the postulated model $M$ so as to attain the greatest local change in the likelihood displacement for $\theta_1$. Therefore, $d_{\max}$ is the statistic we are mainly concerned with in the context of local influence analysis. For the sake of convenience, unless stated otherwise, the matrices $F$, $G$, $\Delta$ and $\ddot L$ corresponding to the parameter vector of interest $\theta_1$ in Lemma 2.1 are abbreviated as $F_{\theta_1}$, $G_{\theta_1}$, $\Delta_{\theta_1}$ and $\ddot L_{\theta_1}$, respectively. The following lemma establishes an important property of the curvature $C_d$ in Lemma 2.1.
Lemma 2.2 Suppose that $\eta_1 = f(\theta_1)$ is a 1-1 measurable transformation from $\Theta_1$ to $H$, $\theta_1 \in \Theta_1$ and $\eta_1 \in H$. Then $F_{\eta_1} = F_{\theta_1}$.
As space is limited, the proofs of Lemmas 2.1 and 2.2 are omitted.
It is remarked that Lemma 2.2 implies that $F$ defined in Lemma 2.1 and the direction $d_{\max}$ corresponding to the largest absolute eigenvalue of $F$ are invariant under a 1-1 measurable transformation of the parameter vector of interest $\theta_1$.
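As an illustration of Lemma 2.1, the sketch below works through a deliberately simple toy model that is not taken from the paper: $y_i \sim N(\mu, 1/\omega_i)$ with case weights $\omega_i$ and $\omega_0 = (1,\dots,1)^T$, so that $\hat\mu(\omega)$ is the weighted mean and there is no nuisance parameter. The function names and the sample `y` are hypothetical, and $\ddot L$ is taken here as the negative second derivative of $L$ at $\hat\mu$, which is an assumption about the sign convention. The code forms $F = G^T\Delta^T\ddot L\Delta G$ and extracts $C_{\max}$ and $d_{\max}$ from the eigen-decomposition of $F$.

```python
import numpy as np

def max_curvature(F):
    """Return (C_max, d_max) for C_d = 2|d^T F d| over unit vectors d."""
    F = 0.5 * (F + F.T)                  # guard against round-off asymmetry
    vals, vecs = np.linalg.eigh(F)
    k = int(np.argmax(np.abs(vals)))     # eigenvalue of largest absolute value
    return 2.0 * abs(vals[k]), vecs[:, k]

def toy_F(y):
    """F = G^T Delta^T Lddot Delta G for the toy weighted-mean model."""
    n = y.size
    G = ((y - y.mean()) / n).reshape(1, n)   # d mu_hat(omega) / d omega^T at omega_0
    Delta = np.eye(1)                        # no nuisance parameter
    Lddot = np.array([[float(n)]])           # -d^2 L / d mu^2 at mu_hat (assumed sign)
    return G.T @ Delta.T @ Lddot @ Delta @ G

def likelihood_displacement(y, omega):
    """LD(omega) = 2[L(mu_hat) - L(mu_hat(omega))] with the unperturbed L."""
    def loglik(mu):
        return -0.5 * np.sum((y - mu) ** 2)
    mu_hat = y.mean()
    mu_omega = np.sum(omega * y) / np.sum(omega)
    return 2.0 * (loglik(mu_hat) - loglik(mu_omega))

y = np.array([1.0, 2.0, 4.0, 9.0])   # hypothetical sample
F = toy_F(y)
c_max, d_max = max_curvature(F)
```

In this toy case $F$ has rank one, so $d_{\max}$ is proportional to the centred observations $y - \bar y$: the most influential perturbation reweights the cases in proportion to their residuals.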
3 Application to Growth Curve Model
Under the normal assumption (1.2) and RSS (1.3), we apply the above results to the growth curve model (1.1) here.
Theorem 3.1 If $B$ is of interest and $(\Gamma^{-1}, \Theta^{-1})$ is a nuisance parameter, then the $(i, j)$th element of the $t \times t$ matrix $F_B$ is
$$f_{Bij} = n\,\mathrm{tr}[\dot B_i^T X^TX(X^TSX)^{-1}X^TX\dot B_j Z^TZ], \tag{3.1}$$
where $\dot B_i = \left.\partial\hat B(\omega)/\partial\omega_i\right|_{\omega=\omega_0}$, $\hat B(\omega) = [X(\omega)^TX(\omega)]^{-1}X(\omega)^TY(\omega)Z(\omega)[Z(\omega)^TZ(\omega)]^{-1}$ is the MLE of $B$ corresponding to the perturbed model $M(\omega)$, $X(\omega)$, $Y(\omega)$ and $Z(\omega)$ are the matrices $X$, $Y$ and $Z$ corresponding to $M(\omega)$, $S = Y(I_n - P_Z)Y^T$, and $P_Z = Z(Z^TZ)^{-1}Z^T$ is the orthogonal projection matrix onto the column space of $Z$.
Proof Under the normal assumption (1.2) and RSS (1.3), the log-likelihood function of the response matrix $Y$ in the growth curve model (1.1) can be written as
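$$\begin{aligned} L(B, \Gamma^{-1}, \Theta^{-1}) = {} & -\tfrac12 qn\ln(2\pi) - \tfrac n2\,[\ln|X^TX| + \ln|W^TW| - \ln|\Gamma^{-1}| - \ln|\Theta^{-1}|] \\ & - \tfrac12\,\mathrm{tr}\{(Y - XBZ^T)^T[X(X^TX)^{-1}\Gamma^{-1}(X^TX)^{-1}X^T \\ & \qquad + W(W^TW)^{-1}\Theta^{-1}(W^TW)^{-1}W^T](Y - XBZ^T)\}. \end{aligned} \tag{3.2}$$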
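The form (3.2) rests on two consequences of RSS (1.3) with $X^TW = 0$: $\Sigma^{-1} = X(X^TX)^{-1}\Gamma^{-1}(X^TX)^{-1}X^T + W(W^TW)^{-1}\Theta^{-1}(W^TW)^{-1}W^T$ and $\ln|\Sigma| = \ln|X^TX| + \ln|W^TW| - \ln|\Gamma^{-1}| - \ln|\Theta^{-1}|$. The following sketch checks numerically that (3.2) agrees with the matrix normal log-density evaluated directly from $\Sigma$. The function names, dimensions and random inputs are illustrative assumptions.

```python
import numpy as np

def loglik_rss(Y, X, Z, W, B, Gamma, Theta):
    """Log-likelihood (3.2), written in terms of Gamma^{-1} and Theta^{-1}."""
    q, n = Y.shape
    R = Y - X @ B @ Z.T
    XtX_inv = np.linalg.inv(X.T @ X)
    WtW_inv = np.linalg.inv(W.T @ W)
    Gamma_inv = np.linalg.inv(Gamma)
    Theta_inv = np.linalg.inv(Theta)
    Sigma_inv = (X @ XtX_inv @ Gamma_inv @ XtX_inv @ X.T
                 + W @ WtW_inv @ Theta_inv @ WtW_inv @ W.T)
    log_det_sigma = (np.linalg.slogdet(X.T @ X)[1] + np.linalg.slogdet(W.T @ W)[1]
                     - np.linalg.slogdet(Gamma_inv)[1]
                     - np.linalg.slogdet(Theta_inv)[1])
    return (-0.5 * q * n * np.log(2.0 * np.pi) - 0.5 * n * log_det_sigma
            - 0.5 * np.trace(R.T @ Sigma_inv @ R))

def loglik_direct(Y, X, Z, B, Sigma):
    """Same log-density computed from Sigma itself, for comparison."""
    q, n = Y.shape
    R = Y - X @ B @ Z.T
    return (-0.5 * q * n * np.log(2.0 * np.pi)
            - 0.5 * n * np.linalg.slogdet(Sigma)[1]
            - 0.5 * np.trace(R.T @ np.linalg.solve(Sigma, R)))

rng = np.random.default_rng(1)
q, p, n, m = 5, 2, 9, 3
X = rng.standard_normal((q, p))
Z = rng.standard_normal((n, m))
B = rng.standard_normal((p, m))
Y = rng.standard_normal((q, n))
Q, _ = np.linalg.qr(X, mode="complete")
# A non-orthonormal W with X^T W = 0, so the W^T W terms are exercised.
W = Q[:, p:] @ (np.eye(q - p) + 0.3 * np.ones((q - p, q - p)))
A = rng.standard_normal((p, p))
C = rng.standard_normal((q - p, q - p))
Gamma = A @ A.T + np.eye(p)
Theta = C @ C.T + np.eye(q - p)
Sigma = X @ Gamma @ X.T + W @ Theta @ W.T

value_rss = loglik_rss(Y, X, Z, W, B, Gamma, Theta)
value_direct = loglik_direct(Y, X, Z, B, Sigma)
```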
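It follows from (3.2) that the MLE of the parameter matrix of interest $B$ corresponding to the perturbed model $M(\omega)$ of the growth curve model (1.1) is
$$\hat B(\omega) = [X(\omega)^TX(\omega)]^{-1}X(\omega)^TY(\omega)Z(\omega)[Z(\omega)^TZ(\omega)]^{-1}. \tag{3.3}$$
For the model (1.1), the MLEs of $\Gamma$ and $\Theta$ (whose inverses are the MLEs of $\Gamma^{-1}$ and $\Theta^{-1}$) with $B$ known are given by, respectively,
$$\tilde\Gamma(B) = \frac1n (X^TX)^{-1}X^T(Y - XBZ^T)(Y - XBZ^T)^TX(X^TX)^{-1} \tag{3.4}$$
and
$$\tilde\Theta(B) = \frac1n (W^TW)^{-1}W^TYY^TW(W^TW)^{-1}. \tag{3.5}$$
From (3.3), we know that the $((i-1)m + j, k)$th element of the $(pm) \times t$ matrix $G_B$ is
$$f_i^T\dot B_k g_j, \quad i = 1,\dots,p,\ j = 1,\dots,m,\ k = 1,\dots,t, \tag{3.6}$$
where $\dot B_k = \left.\partial\hat B(\omega)/\partial\omega_k\right|_{\omega=\omega_0}$, and $f_i \in R^p$ ($g_j \in R^m$) is a vector whose $i$th ($j$th) element is one and the others are zeros. Further, from (3.4) and (3.5), we know that the $[pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)] \times (pm)$ matrix $\Delta_B$ can be partitioned as
$$\Delta_B = \begin{pmatrix} I_{pm} \\ 0 \end{pmatrix}. \tag{3.7}$$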
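It follows from (3.2) that the $[pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)] \times [pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)]$ matrix $\ddot L_B$ can be expressed as
$$\ddot L_B = \begin{pmatrix} \ddot L_{11} & 0 & 0 \\ 0 & \ddot L_{22} & 0 \\ 0 & 0 & \ddot L_{33} \end{pmatrix}, \tag{3.8}$$
where $\ddot L_{11}$ is a $(pm) \times (pm)$ matrix whose $((i-1)m + j, (k-1)m + l)$th element is
$$f_i^T\hat\Gamma^{-1}f_k\, g_j^TZ^TZg_l, \quad i, k = 1,\dots,p,\ j, l = 1,\dots,m,$$
$\ddot L_{22}$ is a $\frac12 p(p+1) \times \frac12 p(p+1)$ matrix whose $(\frac12(2p-i)(i-1) + j, \frac12(2p-k)(k-1) + l)$th element is
$$n(1 - \tfrac12\delta_{ij})(1 - \tfrac12\delta_{kl})\, f_i^T\hat\Gamma(f_kf_l^T + f_lf_k^T)\hat\Gamma f_j, \quad 1 \le i \le j \le p,\ 1 \le k \le l \le p,$$
and $\ddot L_{33}$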
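is a $\frac12(q-p)(q-p+1) \times \frac12(q-p)(q-p+1)$ matrix whose $(\frac12(2(q-p)-i)(i-1) + j, \frac12(2(q-p)-k)(k-1) + l)$th element is
$$n(1 - \tfrac12\delta_{ij})(1 - \tfrac12\delta_{kl})\, h_i^T\hat\Theta(h_kh_l^T + h_lh_k^T)\hat\Theta h_j, \quad 1 \le i \le j \le q-p,\ 1 \le k \le l \le q-p,$$
respectively. Here $\hat\Gamma = \hat\Gamma(\omega_0)$, $\hat\Theta = \hat\Theta(\omega_0)$, $\delta_{ij}$ is the Kronecker delta and $h_i \in R^{q-p}$ is a vector whose $i$th element is one and the others are zeros. According to (3.6), (3.7) and (3.8), we obtain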
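$$F_B = G_B^T\Delta_B^T\ddot L_B\Delta_BG_B = G_B^T\ddot L_{11}G_B, \tag{3.9}$$
whose $(i, j)$th element is
$$f_{Bij} = \sum_{a=1}^{p}\sum_{b=1}^{m}\sum_{c=1}^{p}\sum_{d=1}^{m} f_a^T\dot B_ig_b\; f_a^T\hat\Gamma^{-1}f_c\; g_b^TZ^TZg_d\; f_c^T\dot B_jg_d = n\,\mathrm{tr}[\dot B_i^TX^TX(X^TSX)^{-1}X^TX\dot B_jZ^TZ].$$
Therefore, Theorem 3.1 holds and the proof is complete.
Theorem 3.2 If $\Gamma^{-1}$ and $\Theta^{-1}$ are of interest, respectively, then the $(i, j)$th element of the $t \times t$ matrix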
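$F_{\Gamma^{-1}}$ is
$$f_{\Gamma^{-1}ij} = \frac n2\,\mathrm{tr}(\dot\Gamma_i\hat\Gamma^{-1}\dot\Gamma_j\hat\Gamma^{-1}) \tag{3.10}$$
and the $(i, j)$th element of the $t \times t$ matrix $F_{\Theta^{-1}}$ is
$$f_{\Theta^{-1}ij} = \frac n2\,\mathrm{tr}(\dot\Theta_i\hat\Theta^{-1}\dot\Theta_j\hat\Theta^{-1}). \tag{3.11}$$
Further,
$$F_\Gamma = F_{\Gamma^{-1}}, \quad F_\Theta = F_{\Theta^{-1}}, \tag{3.12}$$
where $\dot\Gamma_i = \left.\partial\hat\Gamma(\omega)/\partial\omega_i\right|_{\omega=\omega_0}$, $\dot\Theta_i = \left.\partial\hat\Theta(\omega)/\partial\omega_i\right|_{\omega=\omega_0}$, $\hat\Gamma(\omega) = \frac1n(X(\omega)^TX(\omega))^{-1}X(\omega)^TS(\omega)X(\omega)(X(\omega)^TX(\omega))^{-1}$ and $\hat\Theta(\omega) = \frac1n(W(\omega)^TW(\omega))^{-1}W(\omega)^TY(\omega)Y(\omega)^TW(\omega)(W(\omega)^TW(\omega))^{-1}$ are the MLEs of $\Gamma$ and $\Theta$ corresponding to $M(\omega)$, $\hat\Gamma = \hat\Gamma(\omega_0)$, $\hat\Theta = \hat\Theta(\omega_0)$, and $S(\omega) = Y(\omega)(I_n - P_{Z(\omega)})Y(\omega)^T$ is the matrix $S$ corresponding to $M(\omega)$.
Proof If $\Gamma^{-1}$ is of interest, then it follows from (3.2) that the MLE of $\Gamma^{-1}$ corresponding to the perturbed model $M(\omega)$ of the growth curve model (1.1) is given by
$$\hat\Gamma(\omega)^{-1} = n\,X(\omega)^TX(\omega)[X(\omega)^TS(\omega)X(\omega)]^{-1}X(\omega)^TX(\omega). \tag{3.13}$$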
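For the model (1.1), the MLEs of $B$ and $\Theta$ (whose inverse is the MLE of $\Theta^{-1}$) with $\Gamma^{-1}$ known can be expressed as follows, respectively,
$$\tilde B(\Gamma^{-1}) = (X^TX)^{-1}X^TYZ(Z^TZ)^{-1} \tag{3.14}$$
and
$$\tilde\Theta(\Gamma^{-1}) = \frac1n(W^TW)^{-1}W^TYY^TW(W^TW)^{-1}. \tag{3.15}$$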
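From (3.13), we know that the $(\frac12(2p-i)(i-1) + j, k)$th element of the $\frac12 p(p+1) \times t$ matrix $G_{\Gamma^{-1}}$ is
$$-f_i^T\hat\Gamma^{-1}\dot\Gamma_k\hat\Gamma^{-1}f_j, \quad 1 \le i \le j \le p,\ k = 1,\dots,t, \tag{3.16}$$
where $\dot\Gamma_k = \left.\partial\hat\Gamma(\omega)/\partial\omega_k\right|_{\omega=\omega_0}$.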
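It follows from (3.14) and (3.15) that the $[pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)] \times \frac12 p(p+1)$ matrix $\Delta_{\Gamma^{-1}}$ can be partitioned as
$$\Delta_{\Gamma^{-1}} = \begin{pmatrix} 0 \\ I_{\frac12 p(p+1)} \\ 0 \end{pmatrix} \tag{3.17}$$
and the $[pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)] \times [pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)]$ matrix $\ddot L_{\Gamma^{-1}}$ can be written as
$$\ddot L_{\Gamma^{-1}} = \begin{pmatrix} \ddot L_{11} & 0 & 0 \\ 0 & \ddot L_{22} & 0 \\ 0 & 0 & \ddot L_{33} \end{pmatrix}. \tag{3.18}$$
From (3.16), (3.17) and (3.18), we have
$$F_{\Gamma^{-1}} = G_{\Gamma^{-1}}^T\ddot L_{22}G_{\Gamma^{-1}}, \tag{3.19}$$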
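whose $(i, j)$th element is
$$\begin{aligned} f_{\Gamma^{-1}ij} &= n\sum_{1\le a\le b\le p}\ \sum_{1\le c\le d\le p}(1 - \tfrac12\delta_{ab})(1 - \tfrac12\delta_{cd})\, f_a^T\hat\Gamma^{-1}\dot\Gamma_i\hat\Gamma^{-1}f_b\; f_a^T\hat\Gamma(f_cf_d^T + f_df_c^T)\hat\Gamma f_b\; f_c^T\hat\Gamma^{-1}\dot\Gamma_j\hat\Gamma^{-1}f_d \\ &= \frac n2\,\mathrm{tr}(\dot\Gamma_i\hat\Gamma^{-1}\dot\Gamma_j\hat\Gamma^{-1}). \end{aligned}$$
Therefore, (3.10) holds. Similarly, we have
$$F_{\Theta^{-1}} = G_{\Theta^{-1}}^T\ddot L_{33}G_{\Theta^{-1}}, \tag{3.20}$$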
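whose $(i, j)$th element is given by (3.11). Here $G_{\Theta^{-1}}$ is a $\frac12(q-p)(q-p+1) \times t$ matrix whose $(\frac12(2(q-p)-i)(i-1) + j, k)$th element is
$$-h_i^T\hat\Theta^{-1}\dot\Theta_k\hat\Theta^{-1}h_j, \quad 1 \le i \le j \le q-p,\ k = 1,\dots,t,$$
where $h_i \in R^{q-p}$ is the vector whose $i$th element is one and the others are zeros, $\dot\Theta_k = \left.\partial\hat\Theta(\omega)/\partial\omega_k\right|_{\omega=\omega_0}$ and $\hat\Theta(\omega) = \frac1n(W(\omega)^TW(\omega))^{-1}W(\omega)^TY(\omega)Y(\omega)^TW(\omega)(W(\omega)^TW(\omega))^{-1}$. Further, it follows from Lemma 2.2 that (3.12) is true and the proof is complete.
Theorem 3.3 If $(\Gamma^{-1}, \Theta^{-1})$ (or $\Sigma^{-1}$) is of interest and $B$ is a nuisance parameter, then
$$F_{(\Gamma^{-1},\Theta^{-1})} = F_{\Sigma^{-1}} = F_{\Gamma^{-1}} + F_{\Theta^{-1}}, \tag{3.21}$$
where the $(i, j)$th element of the $t \times t$ matrix $F_{\Gamma^{-1}}$ ($F_{\Theta^{-1}}$) is given by (3.10) ((3.11)).
Proof If $(\Gamma^{-1}, \Theta^{-1})$ is of interest, then it follows from (3.2) that the MLE of $B$ with $(\Gamma^{-1}, \Theta^{-1})$ known is
$$\tilde B(\Gamma^{-1}, \Theta^{-1}) = (X^TX)^{-1}X^TYZ(Z^TZ)^{-1}. \tag{3.22}$$
The $\frac12[p(p+1) + (q-p)(q-p+1)] \times t$ matrix $G_{(\Gamma^{-1},\Theta^{-1})}$ can be expressed as
$$G_{(\Gamma^{-1},\Theta^{-1})} = \begin{pmatrix} G_{\Gamma^{-1}} \\ G_{\Theta^{-1}} \end{pmatrix}. \tag{3.23}$$
From (3.22), we know that the $[pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)] \times \frac12[p(p+1) + (q-p)(q-p+1)]$ matrix $\Delta_{(\Gamma^{-1},\Theta^{-1})}$ can be partitioned as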
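$$\Delta_{(\Gamma^{-1},\Theta^{-1})} = \begin{pmatrix} 0 \\ I_{\frac12[p(p+1)+(q-p)(q-p+1)]} \end{pmatrix} \tag{3.24}$$
and the $[pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)] \times [pm + \frac12 p(p+1) + \frac12(q-p)(q-p+1)]$ matrix $\ddot L_{(\Gamma^{-1},\Theta^{-1})}$ can be written as
$$\ddot L_{(\Gamma^{-1},\Theta^{-1})} = \begin{pmatrix} \ddot L_{11} & 0 & 0 \\ 0 & \ddot L_{22} & 0 \\ 0 & 0 & \ddot L_{33} \end{pmatrix}. \tag{3.25}$$
From (3.23), (3.24) and (3.25), we have
$$F_{(\Gamma^{-1},\Theta^{-1})} = G_{\Gamma^{-1}}^T\ddot L_{22}G_{\Gamma^{-1}} + G_{\Theta^{-1}}^T\ddot L_{33}G_{\Theta^{-1}}. \tag{3.26}$$
It follows from (3.19), (3.20), (3.26) and Lemma 2.2 that (3.21) holds and the proof is complete.
Theorem 3.4 When all parameter matrices $B$, $\Gamma^{-1}$ and $\Theta^{-1}$ are simultaneously of interest, we have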
$$F_{(B,\Gamma^{-1},\Theta^{-1})} = F_B + F_{\Gamma^{-1}} + F_{\Theta^{-1}}, \tag{3.27}$$
where the $(i, j)$th element of the $t \times t$ matrix $F_B$ ($F_{\Gamma^{-1}}$, $F_{\Theta^{-1}}$) is given by (3.1) ((3.10), (3.11)). The proof of Theorem 3.4 is similar to that of Theorem 3.3 and is omitted.
The above theorems indicate that, for the growth curve model (1.1), under the normal assumption (1.2) and RSS (1.3), the $t \times t$ matrices $F$ corresponding to the parameter matrices of interest defined in Lemma 2.1 depend only on the first derivatives of the MLEs of the parameter matrices of interest with respect to the perturbation factor $\omega$. Next, we focus our attention on assessing the local influence of some special perturbation schemes on the growth curve model (1.1) under the normal assumption (1.2) and RSS (1.3). Without loss of generality, we consider only the covariance-weighted perturbation scheme, which takes the following form:
$$E \sim N_{q \times n}(0, \Omega \otimes \Sigma), \tag{3.28}$$
where $\Omega = \mathrm{diag}(\omega_1, \dots, \omega_n)$ ($\omega_i > 0$, $i = 1, \dots, n$) and $\omega = (\omega_1, \dots, \omega_n)^T \in R^n$. It is equivalent to
$$Y\Omega^{-1/2} = XB(\Omega^{-1/2}Z)^T + E\Omega^{-1/2}, \quad E\Omega^{-1/2} \sim N_{q \times n}(0, I_n \otimes \Sigma), \tag{3.29}$$
where $\Omega^{-1/2} = \mathrm{diag}(\omega_1^{-1/2}, \dots, \omega_n^{-1/2})$. Obviously, $\omega_0 = (1, \dots, 1)^T \in R^n$ corresponds to no perturbation in the model (3.28) or (3.29). For the covariance-weighted perturbation model (3.29) with RSS (1.3), the MLEs of the unknown parameter matrices $B$, $\Gamma$ and $\Theta$ can be simplified as follows:
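$$\hat B(\omega) = (X^TX)^{-1}X^TY\Omega^{-1}Z(Z^T\Omega^{-1}Z)^{-1}, \tag{3.30}$$
$$\hat\Gamma(\omega) = \frac1n(X^TX)^{-1}X^TS(\omega)X(X^TX)^{-1} \tag{3.31}$$
and
$$\hat\Theta(\omega) = \frac1n(W^TW)^{-1}W^TY\Omega^{-1}Y^TW(W^TW)^{-1}, \tag{3.32}$$
where $S(\omega) = Y(\Omega^{-1} - \Omega^{-1}Z(Z^T\Omega^{-1}Z)^{-1}Z^T\Omega^{-1})Y^T$.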
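Note that
$$\dot B_i = \left.\frac{\partial\hat B(\omega)}{\partial\omega_i}\right|_{\omega=\omega_0} = -(X^TX)^{-1}X^TY(I_n - P_Z)d_id_i^TZ(Z^TZ)^{-1}, \tag{3.33}$$
$$\dot\Gamma_i = \left.\frac{\partial\hat\Gamma(\omega)}{\partial\omega_i}\right|_{\omega=\omega_0} = -\frac1n(X^TX)^{-1}X^TY(I_n - P_Z)d_id_i^T(I_n - P_Z)Y^TX(X^TX)^{-1} \tag{3.34}$$
and
$$\dot\Theta_i = \left.\frac{\partial\hat\Theta(\omega)}{\partial\omega_i}\right|_{\omega=\omega_0} = -\frac1n(W^TW)^{-1}W^TYd_id_i^TY^TW(W^TW)^{-1}, \tag{3.35}$$
where $d_i \in R^n$ is a vector whose $i$th element is one and the others are zeros. Furthermore, the main results in this paper are established in Theorem 3.5.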
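The closed-form derivative (3.33) can be checked against a central finite difference of $\hat B(\omega)$ from (3.30), and then inserted into the element formula (3.1) to assemble $F_B$ and its most sensitive direction. The sketch below does this for randomly generated inputs; the function names, dimensions, step size `h` and the data are illustrative assumptions, and here $t = n$ because there is one weight per column of $Y$.

```python
import numpy as np

def B_hat(Y, X, Z, omega):
    """Perturbed MLE of B, equation (3.30)."""
    Oinv = np.diag(1.0 / omega)
    return (np.linalg.inv(X.T @ X) @ X.T @ Y @ Oinv @ Z
            @ np.linalg.inv(Z.T @ Oinv @ Z))

def B_dot(Y, X, Z, i):
    """Closed-form derivative (3.33) of B_hat(omega) in omega_i at omega_0."""
    n = Y.shape[1]
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    P_Z = Z @ ZtZ_inv @ Z.T
    d = np.zeros(n)
    d[i] = 1.0
    return (-np.linalg.inv(X.T @ X) @ X.T @ Y @ (np.eye(n) - P_Z)
            @ np.outer(d, d) @ Z @ ZtZ_inv)

def B_dot_numeric(Y, X, Z, i, h=1e-6):
    """Central finite difference of B_hat(omega) in omega_i at omega_0 = 1."""
    n = Y.shape[1]
    e = np.zeros(n)
    e[i] = h
    one = np.ones(n)
    return (B_hat(Y, X, Z, one + e) - B_hat(Y, X, Z, one - e)) / (2.0 * h)

def F_B(Y, X, Z):
    """Matrix F_B assembled element by element from formula (3.1)."""
    n = Y.shape[1]
    P_Z = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
    S = Y @ (np.eye(n) - P_Z) @ Y.T
    XtX = X.T @ X
    M = XtX @ np.linalg.inv(X.T @ S @ X) @ XtX
    dots = [B_dot(Y, X, Z, i) for i in range(n)]
    F = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            F[i, j] = n * np.trace(dots[i].T @ M @ dots[j] @ Z.T @ Z)
    return F

rng = np.random.default_rng(5)
q, p, n, m = 4, 2, 10, 2            # n > m + q, as required in Theorem 3.5
X = rng.standard_normal((q, p))
Z = rng.standard_normal((n, m))
Y = rng.standard_normal((q, n))

max_fd_error = max(np.max(np.abs(B_dot(Y, X, Z, i) - B_dot_numeric(Y, X, Z, i)))
                   for i in range(n))
F = F_B(Y, X, Z)
vals, vecs = np.linalg.eigh(0.5 * (F + F.T))
d_max = vecs[:, int(np.argmax(np.abs(vals)))]   # most sensitive direction for B
```

Large components of `d_max` point to the columns of $Y$ (the individuals) whose covariance weights most strongly affect the estimate of $B$.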
Theorem 3.5 Under the normal assumption (1.2) and RSS (1.3), for the growth curve model (1.1) with the covariance-weighted perturbation scheme, if $n > m + q$, then the $t \times t$ matrices $F_B$, $F_{\Gamma^{-1}}$ and $F_{\Theta^{-1}}$ in Theorems 3.1 and 3.2 can be expressed as
$$F_B = n\,P_Z \circ (I_n - P_Z)Y^TX(X^TSX)^{-1}X^TY(I_n - P_Z), \tag{3.36}$$
$$F_{\Gamma^{-1}} = \frac n2\,(I_n - P_Z)Y^TX(X^TSX)^{-1}X^TY(I_n - P_Z) \circ (I_n - P_Z)Y^TX(X^TSX)^{-1}X^TY(I_n - P_Z) \tag{3.37}$$
and
$$F_{\Theta^{-1}} = \frac n2\,Y^TW(W^TYY^TW)^{-1}W^TY \circ Y^TW(W^TYY^TW)^{-1}W^TY, \tag{3.38}$$
respectively, where $A_1 \circ A_2$ denotes the Hadamard product of two matrices $A_1$ and $A_2$, i.e., if $A_1 = (a_{1ij})_{m \times n}$ and $A_2 = (a_{2ij})_{m \times n}$, then $A_1 \circ A_2 = (a_{1ij}a_{2ij})_{m \times n}$.
The proof of Theorem 3.5 follows directly from Theorems 3.1, 3.2 and (3.33), (3.34) and (3.35).
It is remarked that Theorems 3.3, 3.4 and 3.5 show that, under the normal assumption (1.2) and RSS (1.3), for the growth curve model (1.1) with the covariance-weighted perturbation scheme, the curvatures corresponding to the parameter matrices of interest, which are used to measure the local influence of the minor perturbation on the statistical inference, depend only on the orthogonal projection matrices of the matrices $Z$, $(I_n - P_Z)Y^TX$ and $Y^TW$. It is worth noting that, since $X^TW = 0$, $\mathrm{rk}(X) = p$ and $\mathrm{rk}(W) = q - p$,
$$Y^TW(W^TYY^TW)^{-1}W^TY = Y^T\{(YY^T)^{-1} - (YY^T)^{-1}X[X^T(YY^T)^{-1}X]^{-1}X^T(YY^T)^{-1}\}Y, \tag{3.39}$$
which implies that the left-hand side of the equality (3.39) is independent of the choice of the matrix $W$, and so is $F_{\Theta^{-1}}$ in (3.38).
The main results in this section (see Theorems 3.3 to 3.5) show how to calculate the matrix $F$ corresponding to the parameter matrix of interest in a covariance-weighted growth curve model with RSS. Consequently, it remains to compute the unit eigenvector corresponding to the eigenvalue of $F$ with the largest absolute value, which indicates the most sensitive direction of the growth curve model with RSS under the covariance-weighted perturbation.

References
1 Hampel F R. The influence curve and its role in robust estimation. JASA, 1974, 69: 383-394
2 Cook R D. Assessment of local influence. J R Statist Soc, 1986, B48: 133-169
3 Potthoff R F, Roy S N. A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika, 1964, 51: 313-326
4 Kariya T. Testing in the multivariate general linear model. Tokyo: Kinokuniya Comp, 1985
5 Wei Bocheng, Lu Guobing, Shi Jianqing. An Introduction to Statistical Diagnostics (in Chinese). Nanjing: Dongnan University Press, 1991