Journal of Statistical Planning and Inference 18 (1988) 69-81
North-Holland

OPTIMAL BAYESIAN ESTIMATION OF THE MEDIAN EFFECTIVE DOSE

M.K. KHAN

Department of Mathematical Sciences, Kent State University, Kent, OH 44242, USA

Received 6 June 1986; revised manuscript received 22 December 1986
Recommended by S. Zacks

Abstract: This paper provides some properties of the Fisher information function (arising in quantal response bioassay, attribute life testing and dilution assay models) as a function of the unknown parameter as well as a function of the design level. It is shown that the Fisher information function (arising in Probit, Logit and extreme value models) is totally positive of order 2 and consequently unimodal as a function of the design level. A method is provided for obtaining estimates of the shift parameter (the median effective dose when the scale parameter is known, for symmetric tolerance densities) in these models from Bayesian and adaptive Bayesian points of view. We assume that the prior distribution belongs to a class containing Polya densities of order 2 defined over the real line besides the rectangular densities. Optimality is defined in terms of maximizing the expected Fisher information function. It is shown that the optimal Bayesian estimates exist and are unique under certain conditions on the model and the prior distributions. Optimal Bayesian estimates of the median effective dose for a two stage design set up for the Logit model are provided for some prior distributions. It is shown that the tables of optimal Bayesian estimates obtained for the uniform prior can be used for the logistic prior as well.

AMS Subject Classification: Primary 62K99; Secondary 62F15.

Key words and phrases: Attribute life testing; dilution assays; quantal response bioassays; Bayesian estimates; Fisher information; total positivity.
0378-3758/88/$3.50 © 1988, Elsevier Science Publishers B.V. (North-Holland)

1. Introduction

In quantal response bioassay (QRB), dilution assay (DA) and attribute life testing (ALT) models the only observable random variable is the number of responses, J(x), having a binomial distribution with parameters N, F(x, θ) = p, where F is assumed to be a known cumulative distribution function (called the tolerance distribution) and θ is an unknown parameter which needs to be estimated. The variable x, called a design level, is under the control of the experimenter and usually represents the log of the dose administered in QRB, the log of the amount of dilution in DA and the log of the time of inspection in ALT. In dilution assays x represents the natural log of the amount of dilution and θ is related to the parameter of interest, namely, the concentration of the virus (or other organism under investigation) in the solution or sample. N represents the total number of identical units over which the experiment is observed and is also specified before the experiment is conducted.

Essentially, QRB differs from both ALT and DA due to a different choice of F. In QRB, F is usually assumed to be the normal, logistic or some other symmetric cumulative distribution function (CDF) (Finney (1978), Cox (1970)). In ALT and DA, F could be taken as the extreme value distribution (Fraser (1979), Fisher (1922)). The parameter θ is a pair (β₀, β₁) such that F(x, θ) = F(β₀ + β₁x). In DA, β₁ is taken to be 1. Marks (1962) and Freeman (1970) studied QRB models and assumed that β₁ is a known constant. We will assume this as well throughout this paper. The aim is to estimate the median effective dose LD₅₀, i.e., F(LD₅₀, θ) = 1/2, where θ is β₀ and β₁ = 1. Marks (1962) considered the Probit model and assumed that β₀ had a normal prior distribution, while Freeman (1970) used the Logit model and assumed that β₀ had a logistic prior distribution. We will keep the class of prior distributions and the choice of F wide enough to contain the above mentioned distributions.

In the next section, some properties of the Fisher information function are obtained when the parameter of interest is the shift parameter. A necessary and sufficient condition (for symmetric tolerance distributions) is provided under which the Fisher information function has a finite global point of maximum when considered as a function of the design level. The use of the totally positive functions of order 2 (TP2) [and their subclass called Polya functions of order 2 (PF2)] in reliability theory (Barlow and Proschan (1975), Barlow and Marshall (1964)), economics (Karlin (1959)), approximation theory and many other branches of mathematics (Karlin (1968)) is well known. In Section 2 it is shown that the Fisher information functions associated with QRB, DA and ALT models are TP2 as a function of the parameter and the design level.
This, besides other aspects, shows that the Fisher information function (as a function of the design level) is unimodal. In Section 3 the optimality criterion is defined. In Section 4 the problem of design of experiments is discussed. It is shown that when prior knowledge of the parameter is available, a one point Bayesian design provides more expected information than a sequence of ad hoc designs for the same number of experimental units. Furthermore, we provide existence and uniqueness results for the optimal Bayesian estimates. In Section 5, a two stage adaptive design is obtained whose first stage coincides with the optimal Bayesian experimental design. Some examples are provided for the Logit model when the prior distribution is the uniform and the logistic density, respectively. It is shown that one can use the table of optimal estimates corresponding to the uniform density for the logistic prior as well. Finally, a short discussion is provided in the last section.
2. Properties of the Fisher information function
Maximization of the Fisher information function is often used as the criterion of optimality for nonlinear models (Chernoff (1953) and Zacks (1977)). In the following we define the optimality criterion. Since we assume that β₁ is a known constant, we can take it to be 1, and define F(x, θ) = F(x − θ), −∞ < θ < ∞. The Fisher information about θ contained in J(x) is then

    I(θ, x) = N f²(x − θ) / [F(x − θ)(1 − F(x − θ))].    (1)

One example of the maximization of I(θ, x) in ALT models (for the case of the negative exponential distribution) was considered by Zacks (1977) in relation to applications in reliability theory. It was assumed that Θ (θ being the scale parameter) had a gamma prior distribution. Khan (1984) considered the same problem of maximization of I(θ, x) from a non-Bayesian point of view. Note that due to our assumptions that the density f(u) is continuously differentiable and unimodal, I(u) → 0 as |u| → ∞. The following lemma characterizes the unimodality of I(x − θ) as a function of x (for fixed θ) for symmetric tolerance densities.

Lemma 1. Let f(u) be a symmetric (about 0) and unimodal density. The Fisher information function I(u) as given in (1) has a global point of maximum at u = 0 if and only if

    (F(u) − 1/2)² ≤ (1/4)(1 − f²(u)/f²(0))    ∀u > 0.    (2)

Proof. Note that if F(u) is the CDF then by symmetry F(u)(1 − F(u)) = 1/4 − (F(u) − 1/2)². Now, I(u) ≤ I(0) for all u > 0 if and only if (1/4) f²(u) ≤ f²(0) F(u)(1 − F(u)), which proves the lemma.

Example 1. Logit and Probit models are two of the most widely used models in QRB (Finney (1978)). The Probit model takes

    F(u) = (2π)^(−1/2) ∫_{−∞}^{u} exp(−t²/2) dt,

while the Logit model assumes that

    F(u) = (1 + exp(−u))^(−1),    −∞ < u < ∞.
For a comparison of these and some other models see Finney (1978). For the Logit model it is trivial to show that the Fisher information function is unimodal. For the Probit model, the inequality in Lemma 1 can be used to prove the unimodality of the Fisher information function. Indeed (Johnson and Kotz (1970)), Φ(x) ≤ (1/2)[1 + (1 − exp(−x²))^(1/2)] for all x, and this implies the condition of Lemma 1.
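Lemma 1 can be checked numerically for the two models of Example 1. The following sketch (Python, standard library only; all function names are ours, not the paper's) evaluates I(u) of equation (1) on a grid, locates its maximum for the Logit and Probit models, and tests inequality (2) at a few points for the Probit model.

```python
import math

def logit_F(u):  # logistic CDF
    return 1.0 / (1.0 + math.exp(-u))

def logit_f(u):  # logistic density, f = F(1 - F)
    p = logit_F(u)
    return p * (1.0 - p)

def probit_F(u):  # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def probit_f(u):  # standard normal density
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def fisher_info(u, f, F):
    # Equation (1) with N = 1 and theta = 0: I(u) = f(u)^2 / [F(u)(1 - F(u))]
    return f(u) ** 2 / (F(u) * (1.0 - F(u)))

grid = [k * 0.01 for k in range(-600, 601)]
argmax_logit = max(grid, key=lambda u: fisher_info(u, logit_f, logit_F))
argmax_probit = max(grid, key=lambda u: fisher_info(u, probit_f, probit_F))
print(argmax_logit, argmax_probit)
```

Both maxima land at u = 0, in agreement with Lemma 1; the Johnson and Kotz bound quoted above is exactly condition (2) for the Probit model.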
To study the unimodality and other properties of the Fisher information function given in (1) for tolerance densities that are not necessarily symmetric, we use the concept of TP2 functions.

Definition 1 (Karlin (1968) p. 11). A real nonnegative function I(x, y) of two variables ranging over linearly ordered sets X and Y, respectively, is said to be TP2 if for all x₁ < x₂, y₁ < y₂,

    I(x₁, y₁) I(x₂, y₂) − I(x₁, y₂) I(x₂, y₁) ≥ 0.

An important specialization occurs if I(x, y) can be written as I(x − y), where X and Y are both the real line. In this case I(u) is called a PF2 function. Another characterization of PF2 functions, which is sometimes easier to verify for differentiable functions, is that log(I(u)) be a concave function (see Barlow and Proschan (1975)). One of the properties of PF2 functions is their unimodality. The following lemma will be needed in the sequel. Its proof follows by showing that log(I(u)) is concave.

Lemma 2. For I(θ, x) as given in (1), where f(u) is the tolerance density, the following two statements are equivalent:
(i) I(θ, x) is TP2 as a function of x and θ;
(ii) 2f′(x)/f(x) − f(x)/F(x) + r(x) is decreasing ∀x, where r(x) is the failure rate associated with the tolerance density f(x), and f′(x) is the first derivative of f(x).

Example 2. The Fisher information functions associated with the Logit and Probit models are TP2. It is simple to show that the Fisher information function for the Logit model is TP2. In order to show that the same is true for the Probit model, let r(x) = φ(x)/(1 − Φ(x)) be the failure rate of the standard normal density. It is proved by Sampford (1953) that this failure rate is convex and 0 < r′(x) < 1. By symmetry φ(x)/Φ(x) = r(−x), so the derivative of the expression in (ii) is −2 + r′(−x) + r′(x) < 0, which verifies condition (ii).

One of the important results related to TP2 functions is their variation diminishing property (VDP) (Karlin (1968) pp. 20-22). We use this property in Section 4 to show that the optimal Bayesian estimates exist and are unique.
Lemma 3 (Barlow and Proschan (1975) p. 93). Let K(x, y) be a TP2 function defined on the Euclidean plane. Let h(x) be a bounded and Borel measurable function on the real line. Let the transformation

    H(x) = ∫_{−∞}^{∞} K(x, y) h(y) dy

be finite for each x in (−∞, ∞). Then the number of sign changes of H is less than or equal to the number of sign changes of h, provided the number of sign changes of h is at most one. Moreover, if the number of sign changes of h is equal to the number of sign changes of H, then h and H exhibit the same sequence of signs when their respective arguments traverse the domain of definition from left to right.
In the following the nonexistence of the MLE is discussed. In classical statistical analysis, if we observe Jᵢ responses at the design level xᵢ from Nᵢ units, i = 1, 2, ..., n, respectively, then the MLE of θ is the solution of the likelihood equation

    Σ_{i=1}^{n} f(xᵢ − θ)(Jᵢ − Nᵢ F(xᵢ − θ)) / [F(xᵢ − θ)(1 − F(xᵢ − θ))] = 0,

where f is the tolerance density. Let B be the union of the events {Σ Jᵢ = 0} and {Σ Jᵢ = M}, where M = Σ Nᵢ is fixed. Clearly, the probability of the event B is

    ∏_{i=1}^{n} F^{Nᵢ}(xᵢ − θ) + ∏_{i=1}^{n} (1 − F(xᵢ − θ))^{Nᵢ}.
Now it is simple to see that the likelihood equation has no finite solution over the event B. We should add that the nonexistence of the MLE may be of minor practical importance since P(B) is rather small for any reasonably chosen design levels when some prior knowledge of θ is available.

3. Optimality criterion and experimental designs
An experimenter faces two problems in estimating θ or any function of it. The first problem is the design of the experiment when some prior knowledge is available. The second problem is how to estimate the parameter once an experiment has been conducted over some design levels. In the following we study how the amount of Fisher information is affected by the choices of the number of design levels n, the design vector x = (x₁, ..., xₙ) and the allocation vector N = (N₁, ..., Nₙ), when M = N₁ + ··· + Nₙ is fixed. Clearly, the Fisher information function varies with the parameter θ. Therefore, we overcome this problem either by considering locally optimal vectors (i.e., design vectors which are optimal in a neighborhood of θ) or from a Bayesian point of view. Let x₁, ..., xₙ be the design levels with N₁, ..., Nₙ experimental units to be used to obtain J₁, ..., Jₙ responses respectively. Let I(θ | n, x, N) be the total Fisher information associated with the experiment. By independence, I(θ | n, x, N) = Σ_{i=1}^{n} I(θ, xᵢ), where I(θ, xᵢ) is given in equation (1) in which N is replaced by Nᵢ. Theorem 1 in the next section indicates that, when prior knowledge of the parameter is available, it is better to use one 'good' Bayesian design level than to use many ad hoc design levels which might be far away from the optimal design. However, experimenters prefer using more than one design level due to incomplete prior knowledge. Section 5 shows how one could use adaptive designs in such situations.
Now we consider estimation from a fixed sample point of view. After performing n experiments at a set of predetermined design levels x₁, ..., xₙ (n ≥ 1), we would like to estimate θ. By Lemma 1, for symmetric densities, the LD₅₀ is the point of maximum of the Fisher information function as well. The point of maximum of the Fisher information function for nonsymmetric tolerance densities is different from the LD₅₀. However, due to the simple relationship between the percentiles of a distribution, one can use the estimate of one to estimate the other. Indeed, let c be the point of maximum of I(u) and let d be a real number such that F(d) = p. Then the most informative point for estimating θ is x* = c + θ and LD_p = d + θ. Our aim is to use the most informative design levels to estimate LD_p, where p is not close to 0 or 1.

Definition 2. Let J₁, J₂, ..., Jₙ be the observed numbers of responses at design levels x₁, x₂, ..., xₙ respectively. An estimate of x* (denoted by x̂) is defined to be optimal Bayesian (OB) if

    E{I(Θ, x̂)} ≥ E{I(Θ, x)}    for −∞ < x < ∞,  n = 0, 1, ...,    (3)

where the expectations are taken with respect to the posterior distribution (prior distribution if n = 0) of Θ given J₁, J₂, ..., Jₙ and x₁, x₂, ..., xₙ. Once x* is estimated by x̂, the OB estimate of LD_p is given by d − c + x̂.
4. Main results

The following theorem describes the Bayesian design which maximizes the total expected Fisher information.

Theorem 1. Let I(u) be a continuously differentiable unimodal function representing the Fisher information function (1), and let g be a PF2 density of Θ. Then the point of maximum of E(I(Θ | n, x, N)) is obtained for n = 1, N₁ = M, x = x̂, where x̂ is the unique solution of the equation

    E((∂/∂x) I(Θ, x)) = 0    (4)
and the expectation is with respect to the distribution of Θ.

Proof. Since E(I(x − Θ)) is a continuous and bounded function of x converging to 0 as |x| → ∞, it has at least one point of maximum. Clearly,

    E(I(Θ | n, x, N)) = Σ_{i=1}^{n} Nᵢ ∫_{−∞}^{∞} G(xᵢ − θ) g(θ) dθ,

where G(xᵢ − θ) represents I(xᵢ − θ)/Nᵢ. Now, note that due to the fact that I(u) is a continuously differentiable and unimodal function, |I′(u)| → 0 as |u| → ∞. This implies that I′(u) is bounded. Hence, by the M-test for uniform convergence of the integral we can interchange the derivative and the integral, i.e.,

    (∂/∂x) E(G(x − Θ)) = ∫_{−∞}^{∞} (∂/∂x) G(x − θ) g(θ) dθ.

By the fact that I(u) is unimodal, G′ has exactly one sign change and it is from positive to negative. Since g is PF2, by Lemma 3, the number of sign changes of

    ∫_{−∞}^{∞} G′(u) g(x − u) du = ∫_{−∞}^{∞} G′(x − θ) g(θ) dθ

is one and it is from positive to negative. Therefore, E(I(Θ, x)) has exactly one point of maximum and it is achieved at the unique point x̂ given by (4). Define a random variable U taking values 1, ..., n with probabilities Nᵢ/M, i = 1, ..., n, respectively. Now,

    E(I(Θ | n, x, N))/M = E{∫_{−∞}^{∞} G(x_U − θ) g(θ) dθ} ≤ ∫_{−∞}^{∞} G(x̂ − θ) g(θ) dθ,

where the outer expectation is with respect to the discrete random variable U. Multiplying by M proves the theorem.
Remark. Note that for the Logit model I(u) = f(u) and therefore I(u) itself is a PF2 density function. For such models, when I(u) is an integrable PF2 function, E(I(Θ, x)) itself becomes a PF2 function. This follows by a convolution property of PF2 densities (Karlin (1968) pp. 332-333). Indeed, if X, Y are two independent random variables with PF2 densities f and g defined over the real line respectively, then Z = X + Y has a PF2 density as well. Hence, for integrable unimodal PF2 I(u) (such as in the Probit and Logit models), E(I(Θ, x)) itself is a PF2 function when g is a PF2 density. This would give a much simpler proof of why (4) has a unique root.

In order to characterize the fixed sample OB estimates we proceed as follows. Let j = (j₁, j₂, ..., jₙ) be the observed numbers of responses at the predetermined design levels x = (x₁, x₂, ..., xₙ) out of N₁, N₂, ..., Nₙ units respectively. The posterior density of Θ given j, x is

    gₙ(θ) = K(x, j) ∏_{i=1}^{n} F^{jᵢ}(xᵢ − θ)(1 − F(xᵢ − θ))^{Nᵢ−jᵢ} g(θ),    (5)

where K(x, j) does not depend on θ, and g is the prior density. For n = 0, the posterior density (5) reduces to the prior density g.
Theorem 2. Let I(θ, x) be as given in (1) such that I(u) is unimodal. If the prior density g(θ), the tolerance CDF F(u) and 1 − F(u) are PF2, then the OB estimate of x* is the unique solution of the equation

    E((∂/∂x) I(Θ, x)) = 0,    (6)

where the expectation is taken with respect to the posterior density given in (5).
Proof. E(I(Θ, x)) = c ∫_{−∞}^{∞} I(x − θ) gₙ(θ) dθ, where c does not depend on x. Since I(u) is bounded, by the same argument as given in the proof of Theorem 1, there exists at least one point of maximum of E(I(Θ, x)). Now by differentiating under the integral sign (which is justified by the same argument as provided in Theorem 1) and a change of variable we have

    (∂/∂x) E(I(Θ, x)) = c ∫_{−∞}^{∞} I′(y) gₙ(x − y) dy.    (7)

Since the product of PF2 functions is PF2, gₙ is PF2 and consequently gₙ(x − y) is TP2 as a function of x, y. By Lemma 3, the number of sign changes of the left hand side in (7) is less than or equal to the number of sign changes of I′(u). Due to the unimodality of I(u), the number of sign changes of I′(u) is one and it is from positive to negative. Therefore, by Lemma 3, the sign of the left side in (7) changes from positive to negative, which proves the theorem.

Corollary. As special cases, we see that the OB estimates are characterized by (6) for the Probit, Logit and extreme value models.
5. Adaptive designs

In the following we outline a two stage design of experiment.

First stage design

Although Theorem 2 characterizes the first stage design (for n = 0), the characterization requires fewer assumptions. In fact, for symmetric densities one does not need the PF2 properties, as shown in Theorem 3 below. Let Θ represent the random variable having the prior density g(θ). The first stage optimal Bayesian estimate (OB1) of θ is a real number x which maximizes

    E(I(Θ, x)) = ∫_{−∞}^{∞} I(θ, x) g(θ) dθ.    (8)

To motivate the concepts, let us assume that Θ has a uniform prior distribution over the interval [a, b]. If I(θ, x) follows the conditions of Lemma 1 (so that I(u) is symmetric and decreasing for u > 0), then (8) reduces to

    E(I(Θ, x)) = N/(b − a) ∫_{x−b}^{x−a} f²(y)/[F(y)(1 − F(y))] dy
               ≤ N/(b − a) ∫_{−(b−a)/2}^{(b−a)/2} f²(y)/[F(y)(1 − F(y))] dy.

That is, the OB1 estimate of θ is the median of the prior distribution. For general prior distributions we have the following two theorems to characterize the OB1 estimates of θ.
Theorem 3. Let I(θ, x) be the Fisher information function as given by (1) such that I(θ, x) is decreasing for x > θ, and f(x) is symmetric about 0. If Θ has a prior density g(θ − λ) decreasing for θ > λ, where g(u) is symmetric about 0, then the OB1 estimate of θ is θ̂₁ = λ.

Proof. Without loss of generality we can assume λ = 0. For a fixed x > 0, define h(θ, x) = I(θ, 0) − I(θ, x). For any y > 0, h(x/2 − y, x) = −h(x/2 + y, x). Therefore,

    ∫_{−∞}^{∞} h(θ, x) g(θ) dθ = ∫_{0}^{∞} h(x/2 − y, x)[g(x/2 − y) − g(x/2 + y)] dy.    (9)

We will show that the integrand on the right in (9) is a product of two nonnegative functions. Since x > 0 and y > 0, we have 0 ≤ |x/2 − y| < x/2 + y. By the symmetry of g and the fact that g(u) is decreasing for u > 0, g(x/2 − y) − g(x/2 + y) ≥ 0 ∀y > 0, where x > 0 is fixed. Similarly, h(x/2 − y, x) = I(x/2 − y) − I(x/2 + y), and by similar reasons h(x/2 − y, x) ≥ 0 for all y > 0, where x > 0 is fixed. Therefore E(h(Θ, x)) ≥ 0 ∀x > 0. Hence, by symmetry, E(I(Θ, 0)) ≥ E(I(Θ, x)) ∀x, which proves the theorem.

Note that in Theorem 3, as one would expect, x̂₁ = λ is a solution of the equation E((∂/∂x)I(Θ, x)) = 0. For the cases when the tolerance density or the prior density is not necessarily symmetric we can use the following theorem.

Theorem 4. The OB1 estimate of x* is the unique solution of the equation

    E((∂/∂x) I(Θ, x)) = 0,    (10)

where the expectation is taken with respect to the prior density of Θ, provided either (i) or (ii) holds, where
(i) I(x − θ) = I(u), where x − θ = u, is such that I(u) is unimodal and Θ has a PF2 prior density g;
(ii) I(x − θ) = I(u), where x − θ = u, is such that I(u) is an integrable PF2 function and Θ has a continuously differentiable unimodal prior density g(θ − λ) = g(t), where θ − λ = t, such that g(t) attains its mode at t = 0.

Proof. The proof of (i) follows from Theorem 2, since the posterior density reduces to the prior density. Now, to prove (ii), without loss of generality we can assume that λ = 0. Let g′(θ) be the derivative of g(θ). Since

    (∂/∂x) E(I(Θ, x)) = ∫_{−∞}^{∞} I(x − θ) g′(θ) dθ,

the theorem follows by a similar argument as given after (7), interchanging the roles of I and g′.
Second stage design

The most informative design level to use is x* itself. Since it is unknown, we use its estimate obtained from the first stage, namely x̂₁, as the design level for the next experiment. After performing the experiment at the first design level we observe J(x̂₁) = j. Clearly, given θ, J(x̂₁) is a binomial random variable with parameters N₁, F(x̂₁ − θ). The posterior density of Θ given x̂₁ and J(x̂₁) = j is

    g₁(θ | j, x̂₁) = K(x̂₁, j)(F(x̂₁ − θ))^{j} (1 − F(x̂₁ − θ))^{N₁−j} g(θ),    (11)

where K(x̂₁, j) does not depend on θ. The second stage optimal Bayesian (OB2) estimate of θ is defined to be a real number which maximizes the posterior expected Fisher information function given x̂₁ and J(x̂₁) = j. Theorem 2 provides the OB2 estimate of x* as well, since the posterior density is of the same form as (5) with xᵢ replaced by x̂₁. We will use the notation x̂₂(j, N) to represent the OB2 estimate of x*. Clearly, when f and g are symmetric and N is even in Theorem 2, we have x̂₂(N/2, N) = x̂₁ = λ, where λ is the median of g. Furthermore, intuitively one expects that the following should hold in general:

    −∞ < x̂₂(N, N) ≤ x̂₂(N − 1, N) ≤ ··· ≤ x̂₂(0, N) < ∞.    (12)
The finiteness of all the x̂₂ is a consequence of Theorem 2. Numerical calculations for the examples given below do satisfy (12); in these examples we consider the uniform and logistic prior densities for the Logit model. However, we do not have a proof (or counterexample) of (12).

Example 3 (Logit model, uniform prior). For the Logit model, let Θ have a uniform density over (−b, b), so that x̂₁ = 0. Then x̂₂(j, N) is the point of maximum of

    ∫_{−b}^{b} I(x − θ)(F(−θ))^{j} (1 − F(−θ))^{N−j} dθ.    (13)

By numerical integration (Shampine and Allen (1973)) we can obtain the values of the OB2 estimates. Table 1 and Table 2 provide the values of x̂₂(j, N) for N = 1, 2, ..., 8 and j = 0, 1, ..., [N/2] when b = 10, 20.

Example 4 (Logit model, logistic prior). Due to a simple relationship between the uniform and logistic prior models, one does not need to provide new tables to obtain x̂₂(j, N) for the logistic prior. Indeed, since the logistic prior density is g(θ) = F(−θ)(1 − F(−θ)), for the logistic prior we have

    E(I(Θ, x)) = lim_{b→∞} K ∫_{−b}^{b} I(x − θ)(F(−θ))^{j+1} (1 − F(−θ))^{N−j+1} dθ.    (14)
Table 1
Logit model and uniform (−10, 10) prior

  N \ j     0       1       2       3       4
  1       5.68
  2       6.00    0.00
  3       6.20    0.84
  4       6.30    1.30    0.00
  5       6.40    1.60    0.46
  6       6.50    1.86    0.78    0.00
  7       6.56    2.06    1.02    0.32
  8       6.62    2.36    1.38    0.76    0.00

Therefore, from (13) and (14) we conclude that if x̂₂,u(j, N) represents the OB2 estimate of θ for the uniform (−b, b) prior and x̂₂,log(j, N) represents the OB2 estimate of θ for the logistic prior, then for large b,

    x̂₂,log(j, N) = x̂₂,u(j + 1, N + 2).    (15)

As Tables 1 and 2 indicate, the equality in (15) holds for b ≥ 10.
Table 2
Logit model and uniform (−20, 20) prior

  N \ j     0       1       2       3       4
  1      11.10
  2      11.48    0.00
  3      11.64    0.84
  4      12.05    1.30    0.00
  5      10.66    1.60    0.46
  6       9.14    1.86    0.78    0.00
  7       8.06    2.06    1.02    0.32
  8       7.24    2.36    1.38    0.76    0.00

6. Discussion

One can use classical estimation techniques, such as the maximum likelihood method, sequential estimation methods, e.g., Robbins-Monro (Robbins and Monro (1951), Wetherill (1963), Wetherill (1975)) or the Up-and-Down method (Dixon and Mood (1948)), or Spearman-Karber type (Spearman (1908), Karber (1931), Brown (1959)) nonparametric estimation techniques. As is commonly the case in nonlinear models, the maximum likelihood estimates can be biased and the bias can become serious if the
design levels are chosen 'far' away from some central percentile such as the LD₅₀ for symmetric tolerance densities. The sequential techniques have asymptotically optimal properties; however, for a very small number of design levels their convergence is rather slow if the initial (starting value) design is not close to the 'optimal' design. In QRB, DA and ALT it is common that the experimenter has some prior knowledge about θ due to past experience. The amount of prior information can range from some specific probability density for Θ (this phenomenon seems unlikely) to just the knowledge that θ falls in some interval [a, b] uniformly (this information is quite often available). One must add that when a design is planned for estimation purposes, either for maximum likelihood or for other methods of estimation, an intrinsic assumption is made about the location of the parameter to be estimated. It is well known that the MLEs and other estimators perform rather badly if the design span misses the LD₅₀. In fact, as was shown in Section 2, the MLE does not exist with positive probability for any design. From the Bayesian point of view relatively few articles are available in the literature. Both Marks (1962) and Freeman (1970) consider minimization of an appropriate cost function as the criterion of optimality from the Bayesian point of view. One of the problems (as far as their applications are concerned) has been the analytical intractability of the expressions involved after a few stages of the experiment. When the cost of the experiment or units is not of paramount concern one can use other functions for defining optimality. In this article we showed that the usual maximization of the Fisher information function as a criterion of optimality can also be used in QRB, DA and ALT. The technique, however, is general and could be modified for minimization of some risk functions as well.
It was shown that with the use of the Fisher information function the problem is analytically relatively simple. The Fisher information function has some properties which should be of independent interest. We studied a general class of distributions (including the symmetric tolerance distributions such as the Probit and Logit models). Our aim was to design an experiment for n design levels (n ≥ 1) to obtain an estimate of LD_p, where 0 < p < 1.
Acknowledgement

I would like to thank Professors Phil Boland, Shelemyahu Zacks and H. Rolletschek and the referee for their contributions to this paper.
References

Barlow, R.E. and F. Proschan (1975). Statistical Theory of Reliability and Life Testing. Holt, Rinehart and Winston, New York.
Brown, B.W. (1959). Some Properties of the Spearman Estimator in Bioassay. Ph.D. Thesis, University of Minnesota, Minneapolis.
Chernoff, H. (1953). Locally optimal designs for estimating parameters. Ann. Math. Statist. 24, 586-602.
Cox, D.R. (1970). The Analysis of Binary Data. Chapman and Hall, London.
Dixon, W.J. and A.M. Mood (1948). A method for obtaining and analyzing sensitivity data. J. Amer. Statist. Assoc. 43, 109-126.
Finney, D.J. (1978). Statistical Methods in Biological Assays, 3rd ed. Charles Griffin, London.
Fisher, R.A. (1922). On the mathematical foundations of theoretical statistics. Philos. Trans. Roy. Soc. 222, 309-368.
Fraser, D.A.S. (1979). Inference and Linear Models. McGraw-Hill, New York.
Freeman, P.R. (1970). Optimal Bayesian sequential estimation of the median effective dose. Biometrika 57(1), 79-89.
Johnson, N.L. and S. Kotz (1970). Distributions in Statistics: Continuous Univariate Distributions I. Houghton Mifflin, Boston, MA.
Karber, G. (1931). Beitrag zur kollektiven Behandlung pharmakologischer Reihenversuche. Arch. Exp. Path. Pharmak. 162, 480-487.
Karlin, S. (1959). Mathematical Methods and Theory in Games, Programming and Economics. Addison-Wesley, Cambridge, MA.
Karlin, S. (1968). Total Positivity, Vol. 1. Stanford University Press, Stanford, CA.
Khan, M.K. (1984). Discrete adaptive design in attribute life testing. Commun. Statist. - Theory Meth. 13(12), 1423-1433.
Marks, B.L. (1962). Some optimal sequential schemes for estimating the mean of a cumulative normal quantile response curve. J. Roy. Statist. Soc. 24(2), 393-400.
Robbins, H. and S. Monro (1951). A stochastic approximation method. Ann. Math. Statist. 22, 400-407.
Sampford, M.R. (1953). Some inequalities on Mill's ratio and related functions. Ann. Math. Statist. 24, 130-132.
Shampine, L.F. and R. Allen (1973). Numerical Computing: An Introduction. Saunders, Philadelphia, PA.
Spearman, C. (1908). The method of 'right and wrong cases' (constant stimuli) without Gauss's formulae. Brit. J. Psychol. 2, 227-242.
Wetherill, G.B. (1963). Sequential estimation of quantal response curves. J. Roy. Statist. Soc. Ser. B 25, 1-48.
Wetherill, G.B. (1975). Sequential Methods in Statistics. Chapman and Hall, London.
Zacks, S. (1977). Problems and approaches in design of experiments for estimation and testing in nonlinear models. In: P.R. Krishnaiah, Ed., Multivariate Analysis IV. North-Holland, Amsterdam, 209-223.