Performance Evaluation 30 (1997) 87l 10
ELSEVIER
Approximate product form solutions for Markov chains * Johann Christoph Strelen ’ Rheinische FriedrichWilhelmsVniversitiit,
Bonn, Germuny
Abstract An approximation technique for finite state Markov chains is proposed. The solutions are in product from and can be calculated iteratively. This iteration is equivalent to disaggregating macro probabilities, multiplying with transition probabilities, and aggregating the obtained state probabilitieswe call it DA iteration. Large Markov chains with many states can be solved, so large that the state probabilities could neither be stored nor calculated in any computer. Accurate approximations are obtained, and moreover, the accuracy can be influenced: The more aggregates are considered, the better the approximations are, if the aggregates are chosen suitably. Practical experience is encouraging: The obtained accuracy compares favourably with other approximation methods and with exact solutions. In this paper, we discuss steady state probabilities, but timedependent solutions are also obtainable. 01997 Elsevier Science B.V. Keywords: Approximate solution of discrete time and continuous time Markov chains; Product form; Aggregation; Aggregation/disaggrega.tion algorithms
0. Introduction
Markov chains are widely used in stochastic modeling. We are interested in models of computer, communication, and manufacturing systems. There are discrete time and continuous time Markov chains. The latter can be transformed into discrete time Markov chains which behave equivalently. This is why our technique, which is designed for discrete time Markov chains, is useful for both types. The goal of considering Markov chain models is to obtain performance measures like throughput of jobs, utilization and availability of system components, and response times. Usually, throughput, utilization, availability, and in some cases response times can be derived directly from state probabilities, but sometimes this is difficult for response times [2,6]. Steady state probabilities are solutions of typically large and sparse linear systems of equations. The most important, wellknown, and widely applied modeling paradigms in this field, namely queueing systems, queueing networks, and generalized stochastic Petri nets (GSPN), can be considered as abstract * Supported in part by the Deutsche Forschungsgemeinschaft. ’ Corresponding author: Roemerstrasse 164,53 117 Bonn, Germany. 01665316/97/$17.00 Q) 1997 Elsevier Science B.V. All rights reserved PII SO1665316(96)000533
88
J.C. Strelen/Pet$ormance
Evaluation 30 (1997) 87110
representations of Markov chains if the memoryless condition is satisfied  this can be assumed if all probability distributions are of Coxian type. Unfortunately, the number of Markov chain states grows quickly as the complexity of the model increases. Therefore, much effort has been made on the development of algorithms for large linear systems of equations, like power, GaussSeidel, and SOR algorithms; an overview is given in [ 131. Nowadays, systems with some hundreds of thousands of states can be solved, but the number of states is subject to a combinatorial explosion law. This is why efficient accurate approximate methods are needed. Another problem is different time scales which stem from rare events in the modeled systems (stiff systems). This may lead to very long iteration times when iterative algorithms are applied. Simultaneous iterative techniques [8,16181 cope with this. Our &aggregation aggregation (DA) iteration is a new general approximate aggregation technique for finite Markov chains with discrete time. The calculated state probabilities are in product form; we call them disaggregation aggregation product form (DAH) solutions. Here, neither the complete state space, nor all state probabilities, nor the transition probability matrix are handled explicitly, marginal probabilities are considered instead. Therefore, Markov chains with really large state spaces can be treated, for example billions of states. Time and space demands are higher than for many known, special purpose heuristics, but are by orders of magnitude lower than for exact solutions of Markov chains with many states. On the other hand, the results of DA iteration are usually more accurate than the heuristic solutions. Moreover, the accuracy can be influenced: The more aggregates are considered, the more accuracy can be obtained, if the aggregation is chosen suitably. If many aggregates are used, the aggregation is called$fine, otherwise coarse. It is interesting to note that the steady state probabilities of separable queueing networks are equal to DAH solutions calculated according to suitably chosen aggregates. We analyzed quite a lot of Markov chain models applying DA iteration, and we compared the obtained results with the exact solutions and with results which were calculated using heuristic methods known from the literature. The accuracy of DAH solutions compares favourably. A special class of algorithms for the solutions of large systems of linear equations is based on the principle of iterative aggregation and disaggregation (a/d), a wellestablished solution technique for Markov chains, see for example the survey of Schweitzer [ 141. This technique has some similarities with our DA iteration. Recently Horton and Leutenegger [7] proposed such an a/d method, the MLalgorithm for the solution of discrete time Markov chains, which is interesting to compare with our DA iteration. Like ours, their algorithm is iterative and based on alternate aggregation and disaggregation. The MLalgorithm aggregates over many levels between coarse and fine aggregation. Thus it can be considered as a multigrid like method. It is experimentally shown that it is very fast compared to currently used algorithms. The MLalgorithm is designed for the calculation of exact steady state probabilities whereas the DA method is suited for approximate solutions, here only aggregate probabilities are calculated, not all state probabilities. Hence the MLalgorithm can be applied to Markov chains with some hundreds of thousands of states while this number is not restricted in DA iteration, we tried models with billions or more states. In contrast to the MLalgorithm, DA iteration allows for the calculation of timedependent solutions. In the MLalgorithm, the manner of aggregation influences the speed of convergence; in DA iteration, this is crucial with respect to the accuracy of the obtained results. In this paper, we consider steady state solutions. In Section 1, DAH solutions and DA iteration are discussed. In Section 2, we show that there always exists a product form approximation for any discrete probability distribution, which is as accurate as desired, with only a small number of factors. Practical
J.C. StreledPerformance
Evaluation 30 (1997) 87110
89
aspects are treated in Section 3. As an illustrative example, the techniques are applied to queueing networks with blocking.
1. Disaggregation
aggregation iteration
Irreducible aperiodic Markov chains (Z(t), t E N), discrete and homogeneous in time, with finite state space 2 = [ 1 : n], II E N, are considered. They stem form stochastic models of computer, communication, or manufacturing systems. These systems consist of interacting subsystems, their components. At every moment, each component is in one of its possible component states. This is why we denote the whole system’s states also by state tuples z = (~1, . . . , ZK), where Zk indicates the state of component k. The states of component k are numbered Zk = 0, . . . , Nk for all components k = 1, . . . , K. Aggregates are sets of system states. An aggregate can be defined by the state of a certain component, for example the aggregate 2k, j is the set of all system states where the component k is in state j, irrespective of the states of the other components. All aggregates 2k.0, . . . , &Jk form a partition of the state space, for k fixed. In the sequel, we will use the notions “partition” and “aggregate” slightly more generally. Therefore, we subsets of introduce them more formally: A class {2k,u, . . . , _&&}, Nk E N, k E N, of disjoint nonempty 2 whose union is the full set 2 itself is called partition 2k of 2. A class {Zt , . . . , Z,}, C E N, of partitions is called an aggrega,tion AGG of 2, and each of the subsets 2k, j of system states is an aggregate. x [0 : NC] indicate the aggregates 2k,z,, of each partition Stutetuplesz=(:z~,...,zc)~[O:N~]x~~~ 21, . _. ,ZC to which a system state belongs. If for all state tuples the intersections 21 ,Z1 n . .  fl ZC,~~ contain at most one element, each state tuple defines at most one system state z E 2, and the aggregation is said to completely identify the states. If the intersection is empty, there is no according state, and the state tuple z is termed in.usibZe, otherwise z describes a state and isfeasible. In the sequel we consider mostly aggregations which completely identify the states. Thus, for each state z E 2 there is one and only one such state tuple z. The set 2r of all these feasible state tuples is another representation of the state space 2. There is a onetoone mapping Z : 2r onto 2. A special aggregation is defined by the structure of the model as follows. Consider a system which consists of K subsystems, for example a queueing network with K nodes. If each aggregate is defined by the state Zk of a single subsystem, 2k.j = {Z(z) I z E 2r, zk = j}, j = 0, . . . , Nk, k = 1, . . . , K, where [0 : Nk] is the set of all states of subsystem k, then we call the aggregation canonical. If the whole system’s state is uniquely defined by the states of all subsystems, this cannonical aggregation completely identifies the states. Consider for example a closed queueing network with N jobs circulating in it. Zk denotes the number of jobs in node k. We assume that these numbers zk describe the states of the nodes completely. Under these circumstances, any state tuple (zt , . . . , Zk) where ZI + e . . + ZK = N is feasible and indicates one and only one system state. If zt + .  . + ZK # N, the tUpk! iS infeasible  the State (~1, . . . , ZK) CannOt occur. p(‘)(z)L P(Z(t) = z), z E 2, denotes the state probabilities of the Markov chain at time t. For convenience, we also use p(‘)(z), z E 2r, where p(‘)(z) = p(‘)(z) holds if z and z denote the same state, i.e. z = Z(z) holds. In the sequel, if z and z occur in the same context, they denote the same state. The probability vector p(‘) = [
[email protected])(l), . . . , p(‘)(n)] contains all state probabilities at time t, and PA [p(x, y)]: +
denotes the matrix of the transition probabilities. For convenience, we use also p&y),
%I
J.C. StreledPerjotmance
Evaluation 30 (1997) 871 IO
x,y E ZT, where p(x,y) = p(x, y) for x = Z(X), y = Z(Y) holds. Sometimes, if we are not interested in a particular time, or if steady state probabilities are considered, we omit the specification (t). Each aggregate 2k, j defines a macro probability or aggregate probability which is a marginal probability as follows: E &,j} = c
rk(j)AP{Z
P(Z),
j = 0,. . . , Nk, k = 1,.
. . , c.
(1)
ZEZk,j
we Collect the marginal probabilities
intO tUpkS
rk A (rk (0))
. . . , rk
(&)) , k = 1, . . . , c, and, in tWU,
theSe
intoRs(rt,...,rc). By (l), the aggregationfunction A : [0, 11” + [0, llN’+’ x ... x [0, llNc+‘,
p H R,
is defined. It calculates the macro probabilities for given state probabilities. Sometimes joint marginal probabilities P{Z E &,j,
r&#(j, j’)i
Z E zkl,jl) =
c
P(Z),
zezk,jnzk!,jl
j =O,...,
Nk, j’ = 0, . . . , Nk!,
{k, k’} C [l : C],
are used. Again we collect them into tuples rk,k’. Each partition 2k, k E [l : C], defines a random variable Zk : ZT + [0 : Nk], z H zk. The marginal distribution rk is the probability distribution of Zk. Furthermore, sometimes it is more convenient to consider the random vector 2 = (Zt , . . . , Zc) for which P (2 = z} = p(z), z E ZT, holds, instead of the random variable Z. Our proposed solution technique needs &aggregation: Given positive macro probabilities R, a distribution 7r = [n(l), . . . , n(n)] over 2 is defined as follows. First, it satisfies the macro probabilites, R = d(?r), and secondly, 7r maximizes the entropy. Thus, according to the productform theorem in [ 10, Theorem 13.2.1, p. 4091, this maximum entropy distribution (MED) ?r is uniquely determined and has a product form 1 c T(Z)
=
c
n
(2)
Pk,Zkr
k=l
where the &,zk > 0 are some real numbers, rk(j)
= $
c
zezr
fI
Pi,zi,
j = 0, . . . ,Nk,
k=l,...,
C,
(3)
i=l
Zk’j
holds, and G =
c fi zE2T
k=l
Pk,Zk
(4)
J.C. StreledPerfonance
Evaluation 30 (1997) 87110
91
is a normalizing constant. In fact, the MED is completely determined by all MED parameters short. By (2)(4), the dimggregationfunction is defined: V : [o, l]N’+l
x .a. x [o, llNc+’
+ [O, lln,
pk. j, p
for
R H x.
It calculates the MED over the state space for given macro probabilities. In general, the nonlinear equations (3) and (4) must be solved in order to calculate all MED parameters Pk,j and the normalizing constant G of the MED X, given the marginal probabilities R. This is easily accomplished in some special cases. For example, if a canonical aggregation completely identifies the states, and the state space is rectangular, i.e. if ZT is a Cartesian product, ZT = [O : Nt ] x . . . x [0 : NC], the MED is given by p(z) =rt(zl)...rc(zc),
z E ZT,
(5)
which is easily verified (see [18]). Here the MED parameters are Pk,j = tk(j), j = 0, . . . , Nk, k = l,..., C, and G = 1 holds. This can be generalized for the case where ZT consists of the union of Cartesian products, see Section 3. Another simple case where the state space is not necessarily rectangular are the state probabilities in product form queueing networks. Instead of calculating the steady state probabilities q of a Markov chain we determine an approximation p = D(R) where R I= I(R) holds. Here the function 7 is defined as follows: 7 : [o, llN1+’
x + x [o, @+’
+ [o, l]N1+l
x . . . x [O, l]Nc+l,
R t+ A(D(R)P).
This approximation,p and its marginal probabilities R are referred to as disuggregutionuggregutionproduct form (DAl7) soZutio,rz.In fact, only the marginal probabilities R and the MED parameters p and G must be calculated. This is practicable even if the state space is large, so large that neither all state probabilities nor the transition probabilities could be handled. We calculate the IDAIl solutions using disuggregution aggregation (DA) iteration, R(“+‘) = I(R(‘)),
t = 1,2, . . . ,
(6)
until IIlaXk,j Irk“+“~:j)  r:‘(j) 1is sufficiently small, beginning with feasible marginal probabilities R(l). This DA iteration resembles vector iteration,
[email protected]+‘) = p(‘)P, t = 1, 2, . . . , beginning with any distributionp(‘).p(‘) are the timedependent state probabilities and d(
[email protected])) the timedependent macro probabilities. p(‘) converges to the stationary probability vector and d(p(‘)) to the stationary macro probabilities. Accordingly, when DA. iteration R(“+‘) = I(R(‘)), t = 1,2, . . . , is performed, R(‘) are approximations of the timedependent macro probabilities and D(R(‘)) approximations of the timedependent state probabilities. If DA iteration is done for a sufficient number of iterations, R(‘) is an approximation of the stationary macro probabilities, and D(R(‘)) an approximation of the stationary state probabilites. It is important to note that the function 7 must be be evaluated without handling either all transition probabilities or all probabilities ?r (‘) . Whether this is possible, depends on the transition probabilities and on the aggregation. According to our experience, such conditions are fulfilled very often in Markov chains originating from performance or reliability models. We have not yet found any model which could not be treated with DA iteration although we tried many different models. In Section 3 conditions are given under which this is possible.
J.C. StreledPerformance
92
Evaluation 30 (1997) 87110
More precisely, the conditons are fulfilled if the following expressions for the joint probabilities qk(j, 1) of the random vectors (ZF’ , Zf+t)), Wc(j,l)=
c
c
j,Z=O
P(.Y,z)P(%‘),
,...,
Nk, k=l,...,
c,
(7)
yEZT LEZT yk=j Zk=l
can be simplified substantiallyby Observing
collecting many summands into only a few expressions.
z=o,...,Nk,
(8)
j=O Nk
fPkok(j, i> =
rf’(j)+&(j,z),
j=o ,...,
Nk,
k=l,...,
C,
(9)
I=0
follows. Eq. (9) is already simple in the sense mentioned above. The conditional probabilities Xk((j,
i)AP(zf+‘) = zlz;’ = j} = E,
j,z= 0, . . . , Nk,
k = 1, . . . , c,
(10)
are aggregate transition probabilities as the following aggregate transition equations: ‘
[email protected]+‘)(I)
= 2
nk(j,
z)rf)(j),
1 =o,...,
Nk,
k=l,...,
C,
(11)
j=O
hold. The aggregate transition probabilities are not homogeneous aggregate nontransition probabilities are
in time, in general. According to (9), the
Nk nk(j,j)=lxnk(j,Z),
j=o
,...,
Nk,
k=l,...,
C.
(12)
l=O 1Z.i
Eqs. (8) and (9) or (11) and (12) are used to efficiently calculate the new macro probabilities R(‘+‘) from the old ones, R(‘), thus evaluating the function 1. If no closed form solution is available, the nonlinear equations (3) and (4) can be solved iteratively as follows [9] for the MED parameters p given the macro probabilities R:
[i+ll pk,j
=
rk(.i)
1  Oi + Wirfl( j)
[il
pk,jI
j =O,...,
Nk.
k=l,...,
K, i=l,2
,...,
(13)
where the damping factor is chosen as wi = max{ 1, i, $, . . .) so that pl”+‘l fulfills Eqs. (3) and (4) more accurately than $1. Here ~1’1 are initial approximations of the unknown parameters p, and r/‘(j) are the macro probabilities according to the MED parameters ~1’1, see (3). But in fact, iteration (13) can be omitted when steady state solutions are desired. It is sufficient to apply just one iteration step,
J.C. StreledPetiormance
Evaluation 30 (1997) 87110
u+l)
93
(14)
pk,j
within each DA iteration step (6). This iteration did never diverge, using a damping factor wi = 1 or wi = i. We summarize, st,ating the following algorithm.
DA iteration algorithm Input: Reasonable MED parameters p(l)  for example all pi’$ = 1  and a small bound E > 0
(0) Calculate the normalization constant G(l) and the marginal probabilities R(l). Let t = 1. (1) Calculate the joint distribution functions (pk(j, 2) for all j, 1, k, or the aggregate transition probabilities Xk(j, I) for all j, I, k. (2) From these, calculate the new macro probabilities R(‘+‘). (3) Calculate the MED parameters p (‘+l) either directly, for example using (5), or approximately using (14). In the latter case, calculate again the macro probabilities R cr+‘) from the new MED parameters p(‘+‘). Calculak the normalizing constant G(‘+‘). c E holds, the iteration is completed. Otherwise let t := t + 1 and go to  r;‘(j)1 (4) If IIlaXj,k Irf+‘)(j) step (1)
Results: The macro probabilities
[email protected]+‘), the MED parameters p(‘+‘), and the normalizing constant G(‘+‘). Now we are going to prove the existence of DAD solutions. This is accomplished applying the following theorem, see e.g. [12, Section 6.3.21.
Theorem 1 (Brouwer fixedpoint theorem).
compact, convex set Pn, and suppose that I’(p,,)
Theorem 2 (DA product form existence).
be a continuous operator on the c Pn. Then 7’ has a&ed point in p,.
Let 7' : ?,, c
W
+
R"
The equation
R = 7(R) has a fixed point.
Proof. Instead of ‘T’,we consider the operator 7’ : Pn + Pn:
p H 2>(d(p))
*P.
(15)
Fn = Ip’ E [O, l]“]pi + ’ . ’ + ph = l} c R" is the set of all probability distributions over [l : n]. Obviously, pn is bounded and closed, therefore compact, and convex. Now we prove that 7’ is defined and continuous. In the sequel, we only consider marginal probabilities R, for which a probability distributionp exists such that R = A(@) holds. Therefore we may assume that always a MED exists with the constraints that the marginal probabilities R are assumed. Case I: ZT is rectangular. Firstly, letp E Pn, Ri d(p). Clearly, R depends continuously onp. Secondly, let p’(z) = r1(z1)*..rc(zc),
z E ZT.
(16)
p’ is a probability distribution, p’ E p,, which depends continuously on R. In the next paragraph, we show that ( 16) is equivalent top’ = D(R). Thirdly, p’P is a continuous mapping. Consequently, 7’ (15) is defined in this case, and it is continuous.
94
J.C. Strelen/Pe$ormance Evaluation 30 (1997) 87110
If all marginal probabilities rk (zk) are positive, p’ is a MED, according to the product form theorem. If there are marginal probabilities rk (Zk)which are zero, p’ is also a MED, as can be seen as follows: Let 2 = zT\(z. E ZT13k, j
:
z
E
2k.j
where rk(j) = O}.
According to the product form theorem, C(z) = p’(z),
z E 2,
(17)
is the MED, given the positive marginal probabilities as constraints. For each state2 E 2k, where rk (j) = 0, p’(z) = 0 must hold if the constraints R = d(p’) are given. This is fulfilled by p’ in (16). Therefore p’ maximizes the entropy and is in accordance with the constraints. j
Case 2: ZT is not rectangulal; but a proper subset of [0 : Nl] x . . . x [0 : NC]. Therefore [O: Nl] x *** x [0 : Nc]\Zr is not empty. Here the operator A is defined by q(j)=
cp*(z), z&2* Zk’j
j=o
,...,
Nk,
k=l,...,
zRb
C,
where 2* = ZT ifp* is a probability distribution over ZT, and 2* = ZR or 2* = [0 : N1] x . . . x [0 : NC] accordingly. Again, letp E p,, R 6 d(p). We apply the subset independence feature of maximum entropy distributions which is stated in the Shore and Johnson paper [ 151, in order to prove that 7’(p) is defined and continuous. For our purposes, this feature can be formulated as follows. Let3 denote any probability distribution over ZR where all e(z) > 0, z E ZR, R”‘L d(j),
w any probability, 0 < w < 1, and
R” = wR + (1  w)R”‘.
Clearly, R” = d(wp + (1  w)j) holds. Therefore there is a MED p” of form (16) obeying the constraints R” =
[email protected]“). According, to the subset independence feature, p” = wp’ + (1  w)p”’
holds, wherep’ is the unique MED over ZT with the constraints R = d(p’), andp”’ the unique MED over ZR with the constraints R”’ = d(p”‘). Keeping R”’ and w fixed, R” depends continuously on R. According to (16), applied to R” and p”, p” depends continuously on R”. Observing p’ = (p”  (1  w)p”‘)/W, p’ is defined, and depends continuously
on p”. Together, p’ depends continuously on R and thus on p. In each of the two cases, we saw that 7’ is defined. Moreover, R = d(p),p’ = D(R), and p’P are continuous in each case. Therefore, 7’ is continuous. Furthermore, for eachp E pn, I’(p) is a probability distribution, i.e. I’(p) E p,,. So far we have proved that 7’ and’:, fulfill all assumptions of the Brouwer fixedpoint theorem. As a consequence, 7’ has a fixed point in Pn, sayp.
Let Ri d(p). In turn, I(R) =
[email protected](R) . P) = d(D(d(p)) R is a fixed point of 7. This concludes the Proof. 0
. P) = d(l’(p))
= d(p)
= R holds. Clearly,
J.C. Strelen/Pe&mnance
Evaluation 30 (1991) 87110
95
Unfortunately, we have not yet found error estimates or a proof of convergence of the iteration. But in Section 2 it is prove:n that there is always an aggregation with only a small number of aggregates for which an approximating product form (2) exists which is as accurate as desired. Also we solved many models using DA iteration, among others polling, forkjoin, stochastic Petri net, and blocking models, and we found very accurate results; the iteration did never diverge. Moreover, the accuracy could be substantially ameliorated in all examples if rnore and suitably chosen aggregates are considered. These matters are discussed in Section 3. 2. Motivation In this section, we justify why it is worth applying DA iteration to Markov models. Firstly, we point to our just mentioned favourable experience with DAII approximations for steady state solutions of Markov models: We calculated DAD approximations for many models using DA iteration. Usually, we tried different aggregations for the same model, coarse grain with a small number of aggregates, which use a small amount of computing time, and fine grain with more aggregates, which need more computing time and storage space, but produce on the other hand even more accurate approximations. In Section 3, these topics are pointed out. In [ 18,191, DA iterative solutions for polling systems are treated, in [ 181 for queueing networks with a forkjoin subnet, in [4,1 I] for some generalized stochastic Petri nets, and in [20] for queueing networks with repetitive service blocking. The obtained accuracy compares favourably with results calculated using other methods from the literature, and with exact results. We describe the blocking networks in some detail: Example. We consider queueing networks with repetitive service blocking. A network consists of K E N nodes. In node k, k = 1, . . . , K, there is a single exponential server with service rate pk. Buffer space is available for no more than Ukjobs. If the model is open jobs arrive according to a Poisson process with parameter hk. When the service of a job in node k is finished, the job tries to go to node 1 with probability tk,J, 1 E [ 1 : K]. If there is no buffer space available, the job remains in node k and instantaneously receives another service (repetitiveservice with random destination policy). In an open network, a job which is completely served in node k leaves the network with probability tk,o. In a closed network, there are N E N jobs, and we assume without loss of generality Vk 5 N, k = 1, . . . , K. In Section 3, we outline the analysis of these blocking networks using DA iteration with coarse and fine aggregation, and w’e sketch some numerical results. Secondly, we look on general product form approximations n(z) = pt ,Z,  .  pc,zc po, z E ZT, for probability distributions q. We will show that there is always such an approximation with a small number of factors (the MED parameters Pk,Zk), given any degree of accuracy, namely an E’ such that )IR  qll1 5 E’ holds. The proof is constructive: The aggregation of the product form approximation 7r and the MED parameters are stated. This is accomplished directly, not using marginal probabilities as restrictions when maximizing the entropy. For the treatment of general product form approximations 7r we do not require that the aggregation identifies the states completely. Thus the mapping Zt : 2 + ZT is not always onetoone: Given a state z E 2, and letting z: = (~1, . . . , ZC) = Z‘(z) denote the state tuple belonging to z, z E 2k,Zk holds for all
96
J.C. StreledPe~ormance
Evaluation 30 (1997) 87110
k= l,... , C, but there may exist another state z’ E 2 with the same tuple, z = Zl (z’). Here we restrict ourselves to partitions consisting of two aggregates each, 2k = {2k,O, 2k,J}, k = 1,**a, c. The following theorem states the existence of a general product form approximation 7~, which approximates a given probability distribution q over 2 accurately. In the first part of the proof, the aggregation is provided, and the MED parameters pk,j, k = 1, . . . , C, j = 0, 1, are given. Theorem 3 (Accurate product form). Given a probability distribution q = (q(l), . . . , q(n)) and a degree E of accuracy, 0 C E 5 0.01, there exists an approximating product form distribution ?r = (n(l), . . . , n(n)) with CiCi
In I
( _:;;/‘:,,
+ 1) /ln21
(18)
partitions, each of them with two aggregates. The accuracy is llq  7~11~< 4.26
(1%
0~ more precisely,
Vq(z) 5 P,
Iq(z)  Jr(z)1 < ;C
ICI(z)  n(z)1
< 3.1E q(z) 14(z)  n(z)1 < 2.16 z= l,...,n,
ifp I q(z) 5 1,
(20)
if1  6 i q(z) 5 1,
where &
(1  2#1.
(21)
Proof. The proof is constructive: We give the product form distribution and the aggregation belonging to it, and afterwards we will prove the quoted statements about it. The aggregation consists of C partitions 2k = {2k,u, 2k, 1 ), each with two aggregates. Let C= C
(this is preliminary, perhaps C will decrease later on),
b=l2E, f’k,O=
,ok,l =
1,
b2k‘,
k = 1, . . . , t,
1, = Lln q (z>/ In b + 0.5 J ,
ii_=
El.
ifqtz)
5 P,
bLZ ifp
z E 2.
(22) (23)
(24)
The aggregation is obtained as follows. From p < q(z) 5 1  26, 1 5 I, 5 2c  1 can easily be shown. Therefore we may represent the integers lz as binary numbers,
J.C. Strelen/Pe$onnance
‘e
Evaluation 30 (1997) 87110
97
1
4 = 1,,,2° + 1,,22l + . * * + lz e.2  (
which defines the figures &k E (0, l), k = 1, . . . , t?, uniquely. Thus we have by (23) if&)
P=Pl,l...Pe1 5,
=
&“20&,2’2’
. . . &,,~.2c’
=
p, 1 . . . pe I 3 z,I
1z,c
if p < q(z) 5 1  E, ifl6
1 = Pl,o***P&
5 P,
(25)
Clearly,
where &A 1 if q(z) < p, and &&A 0 if 1  E < q(z), k = 1, . . . , C?,is in accordance with (25). From this representation, a preliminary aggregation is obtained: A state z E 2 belongs to the aggregate zk, j iff in (25) Pk, j is a factor of ii,, i.e. &,j=(ZE2:(l,,k=j},
j=O,l,
kzl,...,
C.
If an aggregate i!k, j is empty, then all it, have the factor ,0k,1_j . Hence po is replaced by PO&, 1j, the partition _& is discarded, the numbering of the remaining partitions is adjusted to 1, . . . , C  1, and C is reduced by 1. If there are two partitions 2k and zk’, for which there are numbers j, j’ E {0, 1) such that 2k, j = zkf,j' holds, then all it, haVe the factorspk',j'pk,jOr &f,ljf&lj. Hence We EpklCe pk,j by pk',j'pk,jad pk,1j by Pk’, I_ j'&'k, 1_j, and discard the partition 2k’ as jUSt explained. This was the more interesting part of the proof, namely the definition of the approximating prodcut form distribution and of the aggregation belonging to it. The second part is to prove the accuracy (20), from which (19) follows immediately. The arguments are straightforward, lengthy, and consist in applying some Taylor series with remainders and simple estimates. This second part of the proof is given in Appendix A. 0 Remark. The aggregation has C 5 C partitions, each of them with two aggregates. This C is also the number of factors in the approximating product form. There are C degrees of freedom, namely the MED parameterSpk,l,k = 1, . . . . C (the parameters pk.0 are unity, and the normalization constant is determined by the MED parameters). The number C is O(log log n) for E fixed, and O(log l/c) for Iz fixed. For example, if the number of states IZ varies between 1000 and 1027, and the degree of accuracy E between 0.01 and 10F5, e is not greater than 22. As a consequence of Theorem 3 for any probability distribution q with n  1 degrees of freedom n might be large  there exists an accurate product form approximation ?r with a small number C of degrees of freedom.. This is the important feature for our motivation: It is worth searching for product form approximations.
98
J.C. Strelen/[email protected]
Evaluation 30 (1997) 87410
3. Practical aspects If a Markov model is to be analyzed with DA iteration, the aggregation, i.e. the partitions of the state space, must be designed. The aggregation has crucial influence on the accuracy of the results, on the required computing time and storage space, and on the tractability of DA iteration: In the DA iteration algorithm, given the MED parameters pk,j, the normalization constant G, the marginal probabilities rk(j), and the aggregate transition probabilities I’Ck (j, 2) must be calculated in an efficient manner, The defining equations of these quantities, (4), (3) and (7), (10) are composed of too many summands. It depends on the manner of aggregation if this is possible. For the definition of the aggregation, we consider the modeled system as being composed of componentsk,k = l,..., K. Each System component k may assume a state Zk of its own state space Sk, for example Sk = [0 : Nk], Nk E IV. Consequently, the state of the whole system can be described by Ktuples 2 = (~1, . . . , ZK) E S1 x . . . x SK. Clearly, the state space is a subset of this Cartesian product, ZT59, x***xSK. Example. In the queueing network with blocking, the system’s components are the nodes. Their states are the numbers Zk of jobs in them, k = 1, . . . , K. Therefore, a system’s state is given by the Ktuple ZK). In an open queueing network, z=(zt,..., 2T=S1
x***xsK,
where Sk = [0 : vk] is the state space, and in a closed queueing network
(26) where Sk = [Tk : vk],
Tk =
max
A state rk > 0 in node k occurs if in no other node space is left. In this section, we consider only aggregations which identify the states completely. In Section 1, canonical aggregation is defined as follows: There are C = K partitions 2k, k = 1, . . . , K, and the aggregates of each partition 2k are defined by the states of tbe system component k: &,j
= {Z E 2IZk = j},
jESk,
k=l,...,
K.
(If z and z occur in the same formula, both denote the same state: z = Z’ (z).) Canonical aggregation is the coarse grain aggregation. Example. In the example blocking network, each canonical aggregate 2k, j consists of all system states where j jobs are in node k. If an aggregation is searched which promises accurate approximate results, it cannot be determined as we did in the proof of the accurate product form theorem, as  the exact solution is not known and
J.C. Strelen/Perfomance
Evaluation 30 (1997) 87110
99
 the obtained aggregation could be of such a kind that the normalization constant G, the marginal probabilities, or the aggregate transition probabilities nk(j, 1) cannot be calculated efficiently. But how should the aggregates be chosen? Two observations are important for this question. If a system consists of subsyste!ms which behave stochastically independent, and if a Markov model for this system is solved using DA iteration with canonical aggregation, canonical with respect to these components, the DA solution is exact. In this case, if the state space is rectangular, the transition probabilitity matrix of the Markov chain is a Kronecker product, see for example [3]. On the other hand, if a system consists of highly dependent components, the canonical approximation is expected to be inaccurate. Consequently, accurate approximations can be expected if the components are nearly independent. This observation serves as a guide for aggregation: Look for independent components, and design the aggregation accordingly, or combine the clusters of dependent components, each cluster into a single new component, in order to get nearly independent (components. In general, this produces more aggregates and requires more computing time. Bark [l] does some research on these topics. Now we propose some ideas on how fine aggregations can be obtained which take into account such dependencies and the other mentioned requirements. A partition can be defined by two or more dependent system components, for example the partition Z(k,kt),(j,j’) = {Z E Z(Z~ = j,zk'= jr}, j E Sk, j’ E Sk’, k # k’, k, k’ E [l : K],
(27)
combines the components k and k’.
Example. In our blocking networks, we may define aggregates according to the state of two nodes. Thus the dependencies between two such nodes are completely taken into account. In general, not all partitions 2(k) &), (kl, k2) E [l : K12, are considered, but only for a subset K2 c [l : ICI2 of node pairs which are expected to behave quite dependent. Sometimes the dependencies between a certain node k* E [ 1 : K] and the other nodes are of particular importance, for example in a star topology. Then we try center aggregation with the partitions .+*,k),
k = 1, . . . , K,
k # k*,
lc2 = ((k*, k)(k = 1,. . . , K, k # k*},
and the partition zk* . Node k* is called the center. In an open network, the number of aggregates is
Sometimes the slates of distinct pairs (kl , k2) of nodes depend strongly. Under such circumstances one can put these pairs into K;! and consider the according partitions z(k) ,k2), (kl , k2) E IC2. If K is even, and if all nodes of the network are combined into such pairs, Kc2= {(per(l), per(V),
(per(3), per(A)), . . . , (perk
 11, per(K)))
holds, where (per(l.), . . . , per(K)) is a permutation of (1, . . . , K). We call this paired aggregation. open network, the number of aggregates is >:
6’kl
+
l)(Uk2
+
1).
In an
J.C. Strelen/Pe$wmance Evaluation 30 (1997) 871 IO
100
Clearly, the states of two nodes k and k’ are dependent if the nodes are connected, i.e. if jobs travel from node k to node k’ or vice versa. Therefore it seems reasonable to consider all these pairs: lc2 = ((k, k’)lk < k’, tk,k/ > 0 or tk’,k > O}. We call this the all connected pairs (ACP) aggregation. If we restrict ourselves to tandem queues where jobs arrive from outside at node 1 with rate hl , travel from K  1, and leave the network at node K, then Ic2 = (( 1,2), (2,3), . . ., nodektonodek+l,k= l,..., (K  1, K)} is the set of considered pairs for ACP aggregation. Now we discuss an aggregation which is even finer than the ACP of nodes aggregation, the all connected triplets (ACT) aggregation. Again we restrict ourselves to open tandem queues as above. The triplets of nodesare(k,k+l,k+2),k=l,..., K2.ThereareC= K  2 partitions 2k of the state space with the aggregates zk,(‘, jl,j")= (z(Z)lZE zT,Zk = j, Zk+l = j',Zk+2 = j'},
(j,j', j')E
&
X &+I
X &+2,
k = l,...,C.
In order to obtain a fine aggregation, a virtual system component can also be used. Its states s E S are also a function of the system’s states, D : 2 + S, like the states of real components. Useful for the definition of fine grain aggregations are, for example, the number of jobs in an open queueing network, i.e. system components, i.e. s =zt +*a * + ZK,Z E ZT, or the set of empty/nonempty S = (s E (0,
l}KISk =
sign(Zk),
k = 1, . . . , K, z E zT}.
In general, s = {a(z)lz E 21 holds. Applying virtual system components, the aggregates may be defined by zk,(j,s) =
(Z E 2l.Q
=
j,o(Z)
= S),
j E Sk,
S E
S,
k = 1, . . . . K.
Another partition ZC, C = K + 1, is defined by the virtual component. Its aggregates are 2c.s = (z f 210(z) = s},
s E s.
In this case, or similarly if the system’s component C represented by the component state space S, is a real one, the disaggregation is sometimes quite simple. We mean the case where the state space is the union of some disjoint Cartesian products as follows: ZT = u
2;,,
with 2:,,
= Si (s) x
a e. x Sh_1
(s)
x (s},
SES
where the sets S;(s) 2 Sk may depend on s E S. Assume the marginal probabilities ?k(j$ S) = P(Z E zk,(j,s)}, j E Sk, k = 1,. . . , C  1, and rc(s)
= PI2
E ~c,~I, s E S,
J.C. Strelen/Performance
101
Evaluation 30 (1997) 87110
are given. Consider a given value s E S. Because of the rectangular form of the subspace ZC,~, P(Z
= ZlZ E ZC,S} = P(Z1 = z1 IZ E ZC,S}. . * P(Zc1 v(z1,
=
Tc(s)
s)
rcI(ZCI),
=
zc1IZ
E
Zc,,]
s z E &,s7
.**
w(s)
’
holds. From this, fo!rall z E ZC,~ and all s E S, in turn P(Z =
z}= P{Z
= z,
z E 2c,sl
= P{Z = ZlZ E 2c,s] * PIZ r1 (z19 s) =
. . .
rc1
w(s)
(zc19
E 2c,sl s) s E s,
. W(S), w(s)
hold. Thus r&i, s) j~$(s), Pk,(j,S) = 9 V(S) PC,S = V(S), s E s,
k=l,...,
Cl,
673)
are the MED parameters.
Example. In our blocking network, if center aggregation is applied, let
k* =
K = C, i.e. s E S are the
states of the center. Now we turn to the problem of efficiently calculating the normalization constant, the marginal probabilities, and the aggregate transition probabilities. If the state space is rectangular, ZT = [0 : Nt] x . . . x [0 : NC], for example in an open queueing network, the normalization constant G = c zE2T
fi
Pk,zk
k=l
is efficiently calculated by
(29) This technique can easily be generalized if the state space is the union of some disjoint Cartesian products. We call a state space ZT = (2 E [O : Nl] x . . . x [O : NC]lZl +. . . + zc = NJ
(30)
a closedform state space, because closed queueing networks have a state space of this kind. In this case, the normalization constant is efficiently calculated using convolution. Let &, k = 1, . . . , C, denote the vectors b’k.0, . . . 7 pk,Nk) of MED parameters. The normalization constant G is given by the Nth component of the convolution vector Iof all vectors &, k = 1, . . . , C: G = (pl @ . . . @ P~)N.
(31)
102
J.C. StreledPerfonnance
Evaluation 30 (1997) 87110
There are more general state spaces, for which both simplification techniques are applied. The efficient calculation of a marginal probability rk (j) (3) can be done accordingly: In formula (4) for the normalization constant G, pk.1 is replaced for all E # j by zero, and the sum is divided by G. More general, probabilities P{Z E X} = &X ~1,~~. .a ~c,~c/G where 2 = (21, . . . , ZC), X = Xl x **a X xc, & C [o : Nk], k = 1, . . . , C, can efficiently be calculated as follows: If the state space is rectangular, 1 c P{Z E x) = G n P(Zk E &}
(32)
k=l
holds. If the state space is closed form (30), P{Z E x
n 27
= (Xl @I. *. @x~)~/G
(33)
holds, WhtX'exk= (Xk,O,. . ..xk.,Q), if j E &, otherwise
jE[O:Nk],
k=l,...,
C.
The proof of the convolution formula (33) is straightforward. Now we turn to the problem of efficiently calculating the aggregate transition probabilities. In general, this task needs O(n2) operations. This is why we restrict the considered class of state transition matrices. We assume that the Markov chain transitions from state y into state z, y # z, y, z E ZT, can be classified as follows: Each state transition belongs to one and only one of the transition clusses, which are numbered 17 *a, v. A transition y into z of transition class u may occur only if y E U,, where U, denotes the Source states set of this class. The destination state function uy : ZT + ZT defines the new states, i.e. a transition of class u leads from state y E U, into state u,(y). uV consists of component functions u:&
uv = &JY)~ ***, &(Y>).
(34)
Transitions of class u occur with probability c+. In order to reduce the complexity of the calculation of the aggregate transition probabilities, we restrict the sets of source states to U” = L&t x *. * x U”,C ll ZT
(35)
where a?&,kc
[o : Nk],
k = 1, .
. . , c,
V =
1,
. . . , t,
and we assume that the destination state functions U: k depend only on yk. In this case, there are functions U,,k : &,k + [0 : Nk], for which Uy,k(y),y E L4,, holds. Therefore we may consider U”O) = (%J,l(Yl).
.9
U”,C(YC))
(36)
instead of (34). The reader may note that this description of the transition classes depends on the chosen aggregation.
J.C. StreledPerformance
Evaluation 30 (1997) 87l 10
103
The introduction of transition classes not only helps to obtain the lower complexity, but also serves to structure the design of the Markov models.
Example. We consider open queueing networks, and canonical aggregation. The system may change its state due to an arriv,alat node k from outside. Then the source state set is u, =
{Z E 9IZk
=Cvk}.
That means, there is space left in node k for an arriving job. The destination state function is given by uv,k(Zk)
=Zk
uv,i(Zi)
+I,
= Zi
if i # k.
The transition probability is
A is the randomization constant: Initially, the Markov chain is continuous in time. Let Q denote its generator matrix. We consider the discrete time Markov chain with the transition probability matrix P = QA + Z instead, where A == 0.99/ maxx 1qX,X I. It has the same steady state probabilities as the continuous time Markov chain. By the factor 0.99, numerical problems are avoided, which may occur due to rounding errors. Other state transitions take place when a job of node k leaves the network. The according transition class u is defined as follows. The source state set is & =
{Z E ZTIZk
> 01,
the destination state function &,k(Zk)
=Zk

Uv,i(Zi)
1,
= Zi
if i # k,
and the transition probability oy, = ,Uk(l  tk,t  *** tk,K)d. The third kind of state transitions occur when a job leaves node k for node i, i # k. In node k, there must be at least one job, and in node i buffer space must be available. Therefore, the source state set is Uv =
{Z E 2I’IZk
>
0,
Zi < Vi}.
The states of the nodes k and i change, h,k(Zk)
= Zk 
1,
Uv,i(Zj)
and the transition probability is
= Zi +
19
u”,~(z~)
= Zm
ifm
# i, k,
104
J.C. Strelen/Perfommnce
Evaluation 30 (1997) 87110
Applying the transition classes, the state transition probabilities of the Markov chain can be expressed as follows:
v y # z, y, z E ZT, P(YVZ) = c 10 E U”)l(U”(Y) = Z)% lJ=l v y E ZT, P(Y,Y) = l CZ(y EU”)&J,
(37)
v=l
where I : {true,false} + (0, 1) denotes the indicator function. From definitions (7) and (10) of the aggregation transition probabilities, one obtains for j # 1, using expressions (37) for the transition probabilities, the sets UV of source states (35), and the destination state functions uV (36), in turn
nk(j, hk(j) =
c c
P(Y*Z)Pcy)
YEZT ZEZT
Y&=j Z&=1
+2T
re_T
u=l
Z(Uv,k(Yk) YEW Yk'j
v=l
=
a,P{Z
E
=
OP(Y)
Uv,l x ***
x &kl
x 1.d x %,k+l x *.  x uv,C)
0
if j
(5 Uv,k, U&k(j)
= 1,
otherwise. (38)
If one of the formulae (32) or (33) applies, the probability in formula (38) can be evaluated efficiently. The aggregate transition probabilities Irk (j, j) are efficiently calculated using (12). Example. Here we consider only closed queueing networks with blocking, and canonical aggregation again. The only transitions which occur are of the third kind: a job changes from one node to another. Observing the actual source state sets, destination state functions, and transition probabilities, one obtains the aggregate transition probabilities, using (38),
c
nk(j, &k(j) =
I
pktk,iAP{Zk
kZi z/Liti,kdP{Zk
= j, Zi <
Vi}
= j, zi > ri)
if j > rk,
1 = j  1,
if j < vk, l = j + 1.
k#i
Clearly, P(Zk = j, Zi < Vi} equals P{Zk = j)  P(Zk = j, Zi = vi) (observe that the last probability iS 0 if (j l Vi) 4 [tk k ti : N]). If 1j  II > 1, the aggregate transition probability is 0.
J.C. StreledPe~ormance
Evaluation 30 (1997) 87110
105
At last we mentio:n some numerical results for open queueing networks with repetitive service blocking. The reader may note the gain in accuracy when finer aggregation is used. Example. A tandem network: K = 4 nodes, capacity u = 3 in all nodes, arrivals at node 1, h 1 = 1, service 20, transfer probabilities tl.2 = t2.3 = t3,4 = t4,o = 1. rates~r=~2=~4.=1,~3=1,2,..., The maximum absolute error occurs when ~3 = 20. It is 0.13 using canonical aggregation, 0.02 using center aggregation, 10.022 using ACP aggregation, and if ACT aggregation is applied. which is even finer, it is 0.0023. Example. A split topology: K = 4 nodes, ur = 5, u2 = 3, u3 = 2, u4 = 3, arrivals at node 1, hr = 0.1,0.2 )..., 1.9, transfer probabilities t1,2 = 0.2, t1.3 = 0.3, t1,4 = 0.5, t2,o = t3,o = t4,o = 1, service rates ~1 = 1, ~2 = 0.6, ~3 = 0.4, ~4 = 0.3. The maximum absolute error of the macro probabilities is 0.022 when canonical aggregation is used. It occurs for Al = 0.5. When center aggregation is used, it is 0.002 and occurs for hl = 0.6. Example. A merge topology: K = 4 nodes, ur = 5, u2 = 3, u3 = 2, u4 = 3, arrivals at nodes 2,3, and 4: h2 = 1, A3 = 0.5, h4 = 0.5, transfer probabilities tl,o = t2,1 = tg,l = 41 = 1, service rates /..L~= /_L~= 0.5, /_L~= 1, p1 = 1.1, 1.2, . . . 12.5. The maximum absolute error of the macro probabilities is 0.016 when canonical aggregation is used. It occurs for ~1 = 2.0. When center aggregation is used, it is 0.0015 and occurs for ~1 = 2.2. Example. A symmetrical network: K = 4 nodes, capacities ut = ug = 3, u2 = u4 = 2, arrivals at nodes 1 and 3, hl = A3 =: 0.01,0.02,. . . ,0.2, transfer probabilities t1.2 = t3,4 = 0.8, t1,3 = t3.1 = 0.2, t3,u = t4,o = 1, service rates ~1 = 13 = 0.5, ~2 = ~4 = 0.1. The maximum absolute error is 0.11 when canonical aggregation is used. It occurs for hr = A3 = 0.08. When paired aggregation is used, Ic2 = ((1,2), (3,4)], it is 0.011 and also occurs for hr = h3 = 0.08. Example. The tandem topology was analyzed, but here with 9 nodes, p3 = 1, and the capacity u = 9 in all nodes, which has lo9 system states. On an Atari 104OST, the computing time for DA iteration with canonical aggregation was 238 s. This demonstrates that Markov models with a really large state space can be treated with the DA iteration technique.
4. Conclusion Product form approximations for the steady state probabilities of Markov chains and a technique for calculating them are proposed, which are suitable for huge state spaces. Unfortunately, we have not yet found error estima.tes or a proof of convergence of the iteration. But it was shown that product form approximations of any degree of accuracy exist, and we solved many models using DA iteration, among others polling, fork:join, stochastic Petri net, and blocking models with very accurate results; the iteration did never diverge. Moreover, by choosing aggregations with more aggregates  fine grain aggregation  the accuracy can be substantially ameliorated. The just mentioned open questions and different time scales, the development of a tool, and timedependent probabilities are topics of current research. The latter can be obtained via DA iteration: If the Markov chain is discrete in time, the DA solutions#) , t = 1,2, . . . , are approximate timedependent state
106
J.C. Strelen/Perjhnance
Evaluation 30 (1997) 87110
probabilities. If the Markov chain is continuous in time, randomization [5] can be applied onto the DA solutions in order to obtain approximate timedependent solutions. Different time scales are tackled with a simultaneous vector iteration method [ 17,181 which can be adapted to DA iteration.
Acknowledgements The author would like to thank B. Bark who has been engaged successfully in the DA project for years, W. Everling and M. Heckler for valuable discussions, and the students who worked at the DA idea.
Appendix A. Proof of Theorem 3, second part We begin estimating the number p = b*‘’ : From
according to (18), we obtain b2C1 5
e/n,
and using the definition of CL,
(A.11
P I c/n. Case: q(z) 5 p: From it, = /J, Iit,  q(z) ( 5 p follows, and using (A.l), one gets
(A.2)
IS,  q(z)1 i c/n. Case: p < q(z) 5 1  E : Let r,=
A
Cl(Z)5z
= 1  +/q(z)
64.3)
= 1  [email protected]
4 (z)
where cZk In q(z)/ In b, i.e. q(z) = bcz.Z, = [cZ + OSJ follows using (22), and in turn bc.5 > &Cz > bo.5 0.5 < 1,  CZ5 0.5, cz  0.5 < I, 5 cz + 0.5, and using (A.3) 1 _ be.5 < ,.Z< 1 _ $5
(A4)
Considering the Taylor series bc.5 = (1  2E)o.5 = 1 + d + $(l  8)s/%* , and observing 0 5 E ( 0.01, bo.5 5 1 + 1.0157.. . E
holds, and accordingly
where 0 5 0 _( 26,
J.C. Strelen/Performance
Evaluation 30 (1997) 87110
where 0 5 0 5
b”.5 = (1  2~)‘~~= 1  E  i(l  I~)~/~c~, bO.5 >_
107
2~,
1  1.0051 . . . 6.
Applying this to (A.4), 1.0157..
l.O051...E,
.E .
b,1 < ;d 1.0166,
(A.3
and using (A.3), 14(z)  %I < iq(z)
(A.61
follows. Case: 1  E c q (z): Here is, is unity, and therefore
I%  d4 <
6
(A.7)
holds. For calculating the approximation errors, we need bounds for the normalization constant po = l/(iil + * *. + i&).
Case: There is no state z E 2 with q(z) s 1  E:
I=
c
PO
3, +
ZEZ
q(z)U rZ)
Cq(z)+ 1 e%q(z))

ZE2
ZEZ
>l
(using(A.3))
ZEZ dzbll
q(zEc1
=
c
1:
qe=
I%,q(z)li
4(2)5/l
c
(A.8)
q(z)r,
ZE2 swr
I(Z)5cL
c
q(z)
ZE2 q(2h.l
L 1  E  S (using (A.2) and (A.5)).
(~4.9)
Observing the Taylor series 1 lE? PoFl+
=l+c+i+
(
If
where E+ilelo,
r+i (1 
l
(A.lO)
 ;)3
follows. Vice versa, from (A.8) one gets Pit I 1+
15
15, 
Z’E2 4(Z)%
q(z)1 + i
4(z) c ZEZ
I 1+ E + i.
ficq(z)
Observing the Taylor series 1 l+E+;
=1E?+
whereO(8sE+i,
(A.ll)
108
J.C. StreledPerfonnance
Evaluation 30 (1997) 87410
using (A. 1 l), one obtains po 1 1  E  i = 1  2.016~. Case: There is a state z’ E 2 with qzr > 1  6: Here, Zz, = 1. po”
c it,+ 2x2 9(r)5p
c q(z)(l ZEZ @<9(Z)5lE
q(z) ZEZ ~~qW~~c
51+c+(l+i)
Il+2E
using VW
(A.12)
 rz) + 1
c
+&
(A.13)
(23)andClr
5
1  qzt 5
E.
Observing the Taylor series 1 1+2r+&
=12E&+
where 0 5 8 I 2r + E?,
one obtains from (A. 13) po > 1  2.010..
. E.
(A. 14)
Vice versa, from (A. 12) one gets
p(ybl+
c it,+u+3 ZEZ
c ZEZ
q(z)9
q(z)>l.
(A.15)
PwzEIE
In each of the two cases, l2.0166
i pu ( 1+2.0.59...~
(A.16)
holds, and if there is a state z’ E 2 with qZ’ > 1  E, 1  2.010..
. E < po 5 1
(A.17)
holds. Now we are prepared to give the approximation Case: q(z) 5 p: n(z) = eZpo = ppo 5 ;(l
+ 2.059..
errors:
. E) 5 1.020. . . c/n,
using (25), (A.l) and (A.16). Clearly Iii, q(z)1
< l.k/n
holds. Case: P < q(z) 1. 1  c: In turn, using (24) and (A.3)
J.C. Strelen/Pelformance
4(z)  n(z) = q(z)  &PO = 4 (z) =rz
q(z)  ii, 
Evaluation 30 (1997) 87110
109
*z(po  1)
4(z)
4 (z)
 S(PO
 1) = rz  (1  rz)(Po  1)
holds, and with (AS) and (A.16) one gets ‘q(z)  n(z)’ q(z)
15 i + (1 + q2.059..
. E < 3.16.
Case: q(z) > 1  E: From rt(z) = %,po = po, and using (A.17), 14(z)  n(z)]
:: 2.16
follows. This completes the proof of (20). From (20), (19) follows directly. References [l] B. Bark, Aggregationsmethoden fur Markovketten, 1996. [2] O.J. Boxma, Workloads and waiting times in singleserver systems with multiple customer classes, Queueing Systems 5 (1989) 185214. [3] G. Ciardo and KS. Trivedi, A decomposition approach for stochastic reward net models, Perhormance Evaluation 18 (1) (1993) 3759. [4] J.F. Gierse, ZustandsraumAggregation fur fastunabhlngige stochastische Petrinetze, Master’s Thesis, Universitlt Bonn, 1995. [5] W.K. Grassmann, Finding transient solutions in Markovian event systems through randomization, in: W.J. Stewart (Ed.), Numerical Solution ofMarkov Chains, Marcel Dekker, New York (1991) 357372. [6] T. Heinrichs, B. B%rkand J.C. Strelen, Wartezeiten fur Pollingsysteme mittels numerischer Modelle, in: B. Walke and 0. Spaniol (Eds.), Messung, Modellierung und Bewertung von Rechen und Kommunikationssystemen, Springer, Berlin (1993) 173185. [7] G. Horton and S.T. Leutenegger, A multilevel solution algorithm for steadystate Markov chains, ACM Pegormance Evaluation Rev. 2Z!. [S] A. Jennings and W.J. Stewart, Simultaneous iteration for partial eigensolution of real matrices, J. Inst. Math. Appl. 15 (1975) 351361. [9] V. Jonas, Iterative Analyse von Blockiemetzen mit altemierender Aggregation und Disaggregation, Master’s Thesis, Universitlt Bonn, 1992. [lo] A.M. Kagan, J.V. Linnik and CR. Rao, Characterization Problems in Mathematical Statistics, Wiley, New York (1973). [ 1l] R. Nitzgen, AggtegationsDisaggregationsverfahren fur verallgemeinerte stochastische Petrinetze, Master’s Thesis, Universitat Bonn, 1993. [ 121 J.M. Ortega and W.C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York (1970). [ 131 B. Philippe, Y. Saad and W.J. Stewart, Numerical methods in Markov chain modeling, Opel: Res. 40 (6) (1992). [14] PJ. Schweitzer, A survey of aggregationdisaggregation in large Markov chains, in: W.J. Stewart (Ed.), Numerical Solution ofMarkov Chains, Marcel Dekker, New York (1991) 6388. [ 151 J.E. Shore and R.W. Johnson, Axiomatic derivation of the principle of maximum entropy and the principle of minimum crossentropy, IEEE Trans. Inform. Theory 26 (1) (1980) 2637. [16] W.J. Stewart, A comparison of numerical techniques in Markov modeling, Comm. ACM 21(2) (1978) 144152. [ 171 J.C. Strelen, A fast simultaneous iteration technique for the analysis of Markov chains, Intemer Bericht B/89/1, Institut fur Informatik, Universitlt Bonn, 1989.
110
J.C. Strelen/Performance Evaluation 30 (1997) 87110
[181 J.C. Strelen, Iterative Analyse von MarkovModellen mit altemierenderAggregation und Disaggregation, in: A. Lehmann and F. Lehmann (Eds.), Messung, Modellierung und Bewertung von Rechensystemen, IFB 286, Springer, Berlin (1991) 320336. [ 191 J.C. Strelen and B. B&k, An approach to the numerical analysis of multiplequeue cyclic service systems, in: G. Stiege and J.S. Lie (Eds.), Messung, Modellierung und Bewertung von Rechensystemen und Netzen, IFB 218, Springer, Berlin (1989) 7588. [20] J.C. Strelen, B. Bark, J. Becker and V. Jonas, Analysis of queueing networks with blocking using a new aggregation technique, submitted for publication.
Johann Christoph Strelen was born in Wiesbaden, Germany, in 1941. He received the Dipl.Mat. and Dr. rer. nat. degrees in Mathematics and the Habilitation degree in Computer Science from the Technische Hochschule Darmstadt, Germany, in 1968, 1973, and 1981, respectively. There he was affiliated to the Computing Center (19681973), and assistant at the Computer Science Department (19741982). In 1973 he was a post doctoral fellow at the IBM Scientific Center, Grenoble, France. Since 1982 he has been a Professor of Computer Science at the Rheinische FriedrichWilhelmsUniversitit Bonn, Germany. His research in&rests include performance evaluation, distributed systems, and simulation. Dr. Strelen is a member of the Gesellschaft fur Informatik (GI) and the Gesellschaft fur angewandte Mathematik und Mechanik (GAMM).