Identification of trends in process measurements using the wavelet transform

Identification of trends in process measurements using the wavelet transform

Pergamon PII: Compurrrs them. Engng Vol. 22, Suppl., pp. S491-S496, 1998 0 1998 Elsevier ScienceLtd. All rights reserved Printed in Great Britain 009...

752KB Sizes 0 Downloads 100 Views

Pergamon PII:

Compurrrs them. Engng Vol. 22, Suppl., pp. S491-S496, 1998 0 1998 Elsevier ScienceLtd. All rights reserved Printed in Great Britain 009%1354198 $19.00 + 0.00 SOO98-1354(98)00092-l

Identification of Trends in Process Measurements using the Wavelet Tbansform F. Flehmig, R. v. Watzdorf, Lehrstuhl fiir Proze&chnik,

W. Marquardt

RWTH Aachen, Turmstra6e 46, D-52064 Aachen, Germany

Email: {flehmig,watzdorf,marquardt}@lfpt.rwth-aachen.de

- In conjunction with model based techniques for plant operation, advanced concepts are required to derive higher value information from plant measurements. frequently, the qualitative behavior of the plant is of interest rather than sequences of measured data of high sampling rate. In thii paper a new, wavelet based approach is suggested to identify and localize polynomial trends in noisy measurements. Due to a hierarchical search in the time-frequency plane, the method is highly computational efficient. It yields both the least-squares polynomials for the identified intervals and a quantitative measure for their goodness of fit. 0 1998 Published by Elsevier Science Ltd. All rights reserved. Abstract

INTRODUCTION With the increasing importance of model based online techniques such as data reconciliation, gross-error detection or real-time optimization the identification of trends in process measurements such as steadystate has become an important task. As for continuous processes the underlying model usually is a stationary model, it is of intrinsic importance for the application of model based techniques that the process data fed to the model are indeed stationary. For other purposes, such as visualization or detection of sensor or process drift, one may be interested in the identification and localization of linear trends or trends of higher polynomial order. The distinction of slow process drift and steady-state is of particular interest. Wavelet based multiscale techniques have recently become very popular for a variety of signal processing tasks such as denoising, compression or detection of signal features. Besides superior performance for certain demanding applications, it is the outstanding advantage of wavelet methods that a variety of signal processing tasks can be embedded in one single conceptual framework such that synergy increases the performance of the overall solution compared to the stand-alone solution of each single task. Bakshi and Stephanopoulos [l] developed a geometric language for the description of process trends in one part of a series of articles. The translation of process measurements into the geometric trend language was performed by segmenting the measurements between discontinuities in higher derivatives (edges) which were detected with multiscale wavelet methods. In addition, measurement noise was removed by reconstructing the measurement between its edges, as presented in Mallat and Zhong [8]. The method suggested by Bakshi and Stephanopoulos requires biorthogonal wavelet bases and a redundant wavelet transformation which are incompatible to

most other wavelet based signal processing applications. It therefore lacks flexibility in combination with other wavelet based signal processing task that may have other requirements on the wavelet basis. Instead of detecting measurement edges the method presented here focuses on searching hierarchically intervals in which the measurement is well approximated by a polynomial of limited degree. For detected trends, the proposed algorithm yields the lea&-squares polynomial approximation to the measurement. It is based on the classical dyadic wavelet transform and can be applied with either orthogonal or biorthogonal wavelet bases. Multiscale

Methods

and Wavelets

In general, multiscale methods are approximation processes, where a sequence of nested spaces V = {Vj}j>o of increasing resolution is used. The spaces Vj are constructed such that their union Uj>,, Vj is dense in Lz, hence any signal of finite energy can be arbitrarily closely approximated on a sufficiently fine scale j. For each approximation space Vj a corresponding detail space Wj exists which constitutes the orthogonal complement of Vj in Vj+l, i. e. Vj+l = Vj $ Wj.

(1)

Recursive application of equation (1) yields an alternative expression for Vb: L-l

(2)

vL=vjoce$wi.

i=jo

Hence, the high resolution approximation space VL can be decomposed into a coarse, low resolution ap proximation space Vj, and a sequence of mutually orthogonal detail spaces of increasing scales. For practical computations, suitable bases @j := {vj,k 1k E Ij} and \kj := {$j,k 1k E Jj} are required,

s49 1

EuropeanSymposiumon ComputerAided Process Engineering--S

S492

such that Vj = SPXI 9j and Wj = span \kj holds. The approximation properties of wavelets are charHere, 1j and Jj refer to a set of integer indices. These acterized in terms of the number of vanishing mobases are constructed by dyadic dilations (2j.) and ments. Formally, this is the maximum order of the integer translations (. - k) of two generic functions, monomial for which the scaling function (or generator) ‘p and the wavelet t’$j,i(t) dt = 0 Vj,i E Z (12) (t’l *j,i) = (or mother wavelet) $: s

2jjzP(2jt - k),

holds. Due to the completeness of the approximation spaces in Lz, polynomial exactness of the scaling func2j”*(2jt - k), (4) tions is an immediate consequence, i.e. polynomial Both the scaling functions and the wavelets can be structures in the signal up to the order of r are solely constructed to have local support, i.e. the functions reproduced by the scaling functions and give rise to take the value zero except for a certain closed inter- vanishing wavelet coefficients in the multiscale repval. Under certain conditions on the generator ‘p the resentation. sequence of spaces V is called a multiresolution anaOccasionally, some interpolation properties (e. g. lysis [9]. These conditions and the orthogonality of the fact that the wavelets and the scaling functions the spaces Wj and Vj give rise to the following or- are them self piecewise polynomials) are requested thonormality relations among the scaling functions in addition to orthonormality and local support. It and wavelets: can be formally proofed that as a consequence of the interpolating properties, smoothness is incompatible Vj,l,k E Z (‘Pj,k, pj,l) = 61,k (5) with orthonormality and local support [4]. However, vi, j, kl Ez (‘$i,k, *j,l) = 61,k dj,i (6) the concept of multiresolution analysis can be suitVi,j,l,kEZ, j
‘Pj,k(t) :=

$j,k(t)

(3)

:=

(u, zl) := /u(t)V(t)

cit.

Using these basis functions and making use of equation (2), an approximation UL E VL can be expressed either in the so-called single scale representation as

(8) or as

dual wavelets qj,i. In this case, the orthonormality relations (5) to (7) hold among the primal and the dual scaling functions and wavelets. A comprehensive treatment of both orthogonal and biorthogonal wavelet basis can be found in [2, 41. One usually assumes that a particular function f is well approximated on a sufficiently fine scale L such that: f(t) = fL(t) = c

(f?

‘PL,k)

‘PL,k(t)‘

(13)

kEIr. UL

=

c

Cj,,,k

‘Pjo,k

+

c

c

j=j,

kEIj,

dj,k$j,k,

(9)

kE.l,

Having in practice a discrete measurement z at n = 2L equally spaced time points, the relation between x and f L is given through

where the latter is referred to as multiscale representation. The coefficients of both representations are k=1,...,2L, (14) f L(tk) := zk, related by an an efficient, recursive algorithm of comxk = c k=1,...,2L, (15) CL,k’PL,k(tk)r plexity 0(n), the JTast Havelet Transformation. The kEIL fast wavelet transform algorithm is baaed on recursive applications of two equations referred to as refine- where the latter expression is referred to as prefilment equations: tering equation. The coefficients c&k of the scaling functions on scale L axe calculated from the discrete p(t) = hkV’@ - k), measurement x by means of equations (13) to (15).

c

kEZ

$(t) =

xgkV’(2t

-

Ic)

where hk and gk are short filters for scaling function and wavelet, respectively. Similar to the harmonic basis functions used for Fourier transformation, the wavelet basis functions are localized in frequency. In addition, they are also localized in time due to their local support. Hence, each multiscale coefficient captures the contributions to the signal’s energy in a certain time-frequency window. It is this time-frequency decomposition of the signal, which facilitates multiple operations in signal processing.

TREND

DETECTION

Given a window of a measured process quantity, the idea of trend detection is to localize intervals in which the measured process quantity is well approximated by a polynomial. Therefore the polynomial approximation error for nearly arbitrary approximation intervals is investigated, and for small errors a trend is assumed to be present. Due to the vanishing moments of the wavelet basis (c.f. equation (12)), polynomial trends are solely rep resented by scaling functions. On the contrary, rough measurement features are represented by wavelets.

s493

European Symposium on ComputerAided Process Engineering-8 Further, due to the local support of the wavelet basis, every wavelet can be associated with a corresponding subinterval of the measurement window. These particular properties are exploited for the new algorithm. First, the measurement is transformed into the multiscale representation. Then, loosely spoken, the vector of multiscale coefficients d is searched for coefficients of small modulus. The corresponding measurement subintervals which show a distinct trend are easily determined from the support of the related wavelets. Subsequently, the neighborhood of these trend intervals is examined more thoroughly in order to localize the trend up to a predefined precision. Since polynomial approximation is sensitive to measurement noise, standard wavelet denoising techniques are applied prior to the investigation of the approximation error.

maximum degree m can at any arbitrary scale jo be described solely through scaling functions:

Q(t) := gqi t”= C CjP,,kqj,,k(t). (18) i=o

Cg,k=

gpik’0,5 i

tj

=

Uj

J21oszn

06)

a near optimal removal of process noise can be achieved under the assumption of Gaussian noise distribution. Uj denotes the noise standard deviation on scale j which is estimated from the measurement multiscale coefficients d. Since some multiscale coefficients are likely to be due to signal features, a robust estimation of the noise standard deviation is suggested: ffj

=

median(Wj,kI Ik E Jj>) 0.6745

(17)

Even if the assumption of Gaussian noise distribution does not hold, a significant reduction of the noise level can be achieved by neglecting rough and less persistent function features represented by the multiscale basis. This justifies thresholclmg even for non-Gaussian noise, although the threshold value from equation (16) may not be near optimal at all. Least-squares problem Subsequent to denoising, the least-squares polynomial and error is calculated from the wavelet representation and the approximation error is evaluated. Assume the wavelet basis as given through the scaling function p(t) and the wavelet $(t) has m vanishing moments. Then, a polynomial function Q(t) of

5 m.

(19)

i=O

Hence, the problem of fitting a polynomial in the measurement subinterval Z can be stated in wavelet notation as

Denoising Due to the fact that wavelets are well localized both in time and frequency, the distinction signal and measurement noise is well possible in the multiscale basis. Smooth measurement features are represented through the scaling functions whereas measurement noise and rough measurement features are described through the multiscale coefficients d. Moreover, the modulus of multiscale coefficients due to noise is generally smaller than the modulus of coefficients due to rough measurement features. Hence, removal of measurement noise can be performed by thresholding the multiscale coefficients. The mathematical theory for multiscale thresholding was developed by Donoho and Johnstone [5, 61 showing that for a threshold value of

k

Further, it can be shown that the coefficients cik of the scaling functions can be factored into a polynomial of the translation index Ic of degree m:

j=jo

kEJ,

(20)

/

which leads to the m + 1 normal equations for coefficients p = {pi ] 0 5 i 5 m} of wavelet representation of the least-squares polynomial Q: KA KT p = KA Pz cj, + KBjo dj,. (21) In the multiscale sense, the polynomial approximation is now investigated in dyadic measurement subintervals Z( jo, kr) where the scale js and the translation index kI determine the interval position such that: Z(jo,k,)

=[2-j0k1,2-jo(kI

+ l)]~

[tl,tz].

(22)

Depending on the wavelet basis there are Y scaling functions having their support at least partially inside a dyadic interval Z(js, k,). Let these scaling functions be indexed with k = ko, . . . , ICI.Table 1 lists v for several wavelet bases. It can be seen, that for biorthogonal wavelets v equals the number of vanishing moments whereas for orthogonal wavelets v is slightly higher than the number of vanishing moments. It is with identity for biorthogonal wavelets. v>m+ Table 1: Number of scaling functions pj,k having support inside the dyadic interval Z(js, kr) and number of vanishing moments m for different wavelet bases. 1 Basis lvlml Haar Daubechies-4

]l] 1 13 1 2

The truncation matrix Pz extracts the v coefficients of these scaling functions from the overall coefficient vector Cjo: Pz: Cj, = (Cjo,ka,. . . 1 Cjo,kl)T. (23) K is a m + 1 x v Vandermonde matrix translated to the origin of the minimization interval kr:

(24)

s494

European Symposium on Computer Aided Process Engineering-X

The matrix A comprises the integrals of scaling with the i-th moments of scaling functions k& defined function cross-products and is symmetric of size vx V: as 5r ‘P.m>ko(Pjo,ko A=

:

Jr ‘Pjo,b i0~0b1 ;

‘.,

A4, = (ti, 4) = (25)

(Pjoh . . ./, pjo,k, ‘pi,&, I i s, %i, Note that orthogonality does not apply since the integration is performed over a finite interval instead of over the whole line. However, A neither depends on the scale js nor on the position Icr of the considered interval Z(j,, k~) and can easily be calculated in advance. Integrals as appearing in A or involving other combinations of wavelets and scaling functions can be calculated following the idea of Dahmen and Micchelli [3] and the realization of Kunoth [7]. The sparse matrix B consists of the integrals of products of wavelets and scaling functions. It is of size I/ x 2L-j0: Bj,

=

s r (Pja,ko d’jo,.

.

ST ‘Pjo.ko ‘d)L,.

:

..

1

Table 2: Number of non-zero entries in Bj, depending on the relative scale L - js for different wavelet bases. L - j0 l(2131 4 15 16 Basis

(29)

Superposition of equation (28) for i = 0,. . . , m, reordering into powers of k and comparison with equation (19) yields the linear relation Q = zjo P.

where Zj, is triangular of size m + 1 x m + 1. For the detection of process trends one is interested in both, the error through polynomial approximation and in the polynomial itself. The least-squares error to the polynomial approximation in Z is given through 2 ‘%A~

=

(KTP-P~Cj,)TA(KTP-P~Cj,)

-

2 (KTP - Pz Cjo) TBjo dj, + dz Cj, dj,,

(26)

szpj,,kl ‘$L,. i ... ( Sr Pj0.1 &0,. Although orthogonality does not fully apply, vanishing entries occur due to integrals of wavelets and scaling functions with non-overlapping support. Bj, basically depends on the relative scale L - je of the interval 1. ‘Translations of Z causes permutations of the entries in Bj,, only. Typical examples for the number of non-zero entries in Bj, are given in Table 2.

7o t” r$(t) dt. J’--z

(31)

where C is a sparse 2L-jo x 2L-jo matrix comprising the integrals of wavelet-wavelet products. The complexity of equation (31) is dominated by evaluating dz Cj, dj, for which a recurrence relation can be derived. Introducing the quantity s for the aforementioned expression: S(jo,k,)

:= dECjo dj, =

5C E

j=j, kEJ, j’=j,

C

l7i,j,k $jl,k’dt,

dj,k dj’,k’

(32)

k’E.l,

the following recurrence equation holds for s: s(jo, kr) = .$jo + 1,2kr) + s(js + 1,2kl+

L-l 2

C j=jo+l

C C djo,k’dj,k kcJ, k’EJ,

J1

$jo,k’ ‘d’j,k dt.

1) +

(33)

Hence, the approximation error can be propagated from fine to coarser scales. It turns out that the calMultiplication of equation (21) with (KAKT)-l culation of s for all dyadic intervals is of complexyields the following equation for p: ity 0(n) for orthogonal wavelets and of complexity P = (KAK~)-~( KAPI cj, + KBjo dj, ) 0(nlog n) for biorthogonal wavelets. The quantity s (27) Since A and K are small matrices, the inversion of serves for a first screening of the true approximation KA KT or the multiplication with (KA KT)-l is of error c2. Consequently, in order to avoid unnecessary low complexity. Moreover, the inverse can be calcu- computations, equations (27) and (31) are evaluated lated in advance as it is independent of je, k,. Hence, for those intervals only, where s takes small values. For steady-state detection, the Haar wavelet is the complexity of equation (27) is dominated by the the appropriate choice since it has m = 0 vanmultiplication with Bj, . The relation between the polynomial coefficients p ishing moments. For this special case the deteccalculated on scale jo and the corresponding poly- tion equations greatly simplify. Since the Haar wavenomial coefficients q in the time domain can be de- let is fully supported in dyadic intervals, the integrived from straightforward manipulations of the inner rals of wavelet-wavelet and wavelet-scaling function products of the i-th monomial of Q(t) with the scaling products vanish. Hence, the matrices Bjo and Cj, are zero and s( jo, k1) exactly equals the least-squares erfunctions on scale js. ror s2 (js, kr). Moreover, as A and Zj, are scalar the 4i (t”> ‘pj,,,k) = calculation of the least-squares steady-state value (equation 27) simplifies to: 2-j0~~+l/2~qi

2 :

r=o

(‘1

Ii& (2jvp,

(28)

Q=QO=Zj,P=Zj,P~Cj,.

(34)

European Symposium on Computer Aided Process Engineering-8 Hierarchical search and interval merging In the previous section, a technique was derived to calculate the least-squares polynomial and the approximation error in dyadic subintervals Z(js, kr) directly from the multiscale representation d and c. The quantity s provides an estimate of the approximation error and allows a preliminary screening of such intervals in order to reduce the computational burden. Hence, an efficient hierarchical search for intervals with small s is performed, starting from a coarse scale js = jstart, sweeping towards finder scales until an interval comprising a trend is located. Subsequently, the neighborhood of such an interval is investigated in order to sharpen the interval boundaries, again starting from coarse scales and moving towards finer scales. For this purpose, a second indicator for merging adjacent dyadic intervals is required. This indicator can be derived from a total least-squares approximation through two least-squares approximations. Consider to polynomials q(l) and [email protected]) which are leas-squares approximations in the adjacent intervals 2, and 12, respectively. Standard polynomial algebra in the time domain gives rise to the joint polynomial q(lu2) which is least-squares in the joint interval Xi U 12: PI

(1u2) = Tl q(‘) + T2 q(2),

+Tz)q

(35)

where I(Z) denotes the length of the trend interval. For testing possible interval merging e2 is replaced by i2 from equation (37). The complete algorithm is summarized in the following: 1.

Perform wavelet transformation measurement z.

2.

Remove measurement noise via thresholclmg.

3.

Calculate the approximate least-squares error s from equation (33) for all dyadic s&intervals.

4.

Start search on coarsest scale jc = jmin: For all search scales je to terminal search scale &d:

(4

(b)

(cl

(4

with TX=

j,, t”dt

...

:

...

( s,, tm dt

...

z& Pdt ,A = {1,2}.

; & t2m dt 1

(36) An upper bound for the joint approximation error is given through Z2(Xi U 12): E^2(Z~U22)=(E(Z~)+~~,1)2+

$,,

(4~2)+432)2,

(37)

(A) _ q(iU2))TTX (q(A) _ q(‘u2)). := (Q (38)

It indicates whether adjacent trend intervals may be merged.

(e)

for a given

Find on current search scale new trend interval candidates by checking s(je,lc) 5 2-j, Tol. s(jo,k) belonging to a trend already detected on coarser scales are not tested in order to reduce the complexity on fine scales. Calculate true approximation error c2 and approximation polynomial (equations (31) and (27)) for all new trend interval candidates. Discard trend interval candidates with c2 > 2+ Tol. For all remaining trend interval candidates: Calculate joint approximation polynomial (equation (35)) and error (equation (37)) for all neighboring trend candidate and previously detected trend intervals. If merge is possible on both sides of the trend candidate interval, merge according to minimum joint approximation error. Detect new trend intervals for all remaining interval candidates if current search scale j0

<

hew.

Examples

The application of the outlined trend detection method is illustrated using measurements derived from a industrial MSF desalination plant. Two examples are presented, one for steady-state detection in Fig. 1 and one for detection of a linear trend in Fig. 2. For both examples, the measurement in conjunction with the detected trend intervals is displayed in the upper plot. In the lower plot, the multiscale representation of the approximation error indicator s(jo,kr) as calculated from equation (33) is presented in order to provide further insight in the detection method. Bright entries in the multiscale plot correspond to high values of s( js, Icr) indicating a high polynomial approximation error in the dyadic interval Z(jo,kI) (equation (22)). Scale je and position ICI of the interval Z are given through the vertical and horizontal position of the entries, respectively. All dyadic intervals belonging to the same trend in(39) terval are enclosed by a white line in the multiscale

Overall Algorithm Initially, a trend is detected in a dyadic interval on a coarse scale. The trend interval boundaries are sharpened in subsequent steps by merging adjacent dyadic intervals on finer scales. The parameter j,,, sets the finest scale on which the hierarchical search terminates. Obviously, j,,, determines the minimum length of a trend interval that can be detected at all. The merging of adjacent dyadic intervals is controlled by a second parameter, je,& that determines the finest scale to be considered. The introduction of this second parameter enhances the accuracy of the trend localization while suppressing the isolated trends of small width. Since the influence of residual noise increases with increasing trend interval length, the condition for detecting a trend in the interval Z is: ~~(2) < 1(Z) Tel,

s495

S496

EuropeanSymposiumon ComputerAided Process Engineering--X

CONCLUSION

Figure 1: Detection of steady-state trends in the Haar basis. Plot of measurement with detection intervals and multiscale plot of s(j,, IEI).

A new technique for the localization and quantification of polynomial process trends has been proposed. Subsequent to a wavelet transformation and denoising of the measurement, candidate intervals are identified by a hierachical search in t,he time-frequency plane. Correspondingly, the coefficients of the trend polynominals are determined by least -squares approximation in the time-frequency representation. In conjunction with the trend polynominals, the new techniques provides a measure of the deviation of the trend polynominal in terms of the energy norm. The new technique is highly computational efficient,. Hence. even though the wavelet transformation is in principle ~II operation executed on a stored vector of measurement,s rather than a moving data window, the high efficiency of the algorithm allows for on line application. In case of the Haar basis the technique greatly simplifies. The method has merely three tuning parameters, which have a direct physical significance. This is the termination scale j,,, which corresponds to the minimum width of trend int,ervalls t,o be detected. The detection tolerate To1 is a measure of the admissible Lz deviation between the measurement and the trend. Finally, the paramet,er jend determines the maximum resolution for boundary sharpening and is of minor importance.

REFERENCES PI B.R. Bakshi and G. Stephanopoulos.

Representation of process trends - III. Multiscale extraction of trends from process data. Computers them.. Engng., 18(4):267 - 302. 1994.

PI C.K. Chui. An Introduction to Wavelets. Academic Press, New York, 1992.

Figure 2: Detection of linear trends in the Daubechies-4 basis. Plot of measurement with detection intervals and multiscale plot of s(j,, Icr).

plot giving insight in how the hierarchical search algorithm refines the boundaries of trend intervals with finer scales. Fig. 1 shows an example of detecting steady-state trends using the Haar wavelet. Three steady-state trends are identified. This example demonstrates that the hierarchical search algorithm succeeds in sharpening the boundaries of each of the left and right detection interval towards the limits drawn by high deviations from steady-state. In the second example as depicted in Fig. 2, detection of linear trends was performed for the same process measurement. For this purpose, the Daubechies4 basis with m = 1 vanishing moment was selected. The two steady-state intervals on the left and on the right side of the measurement window are detected as accurately as in the first example. In addition, the larger linear trend is detected in the middle of the measurement window. Due to a measurement outlier, the right boundary of the trend interval is not further sharpened.

131 W. Dahmen and C. A. Micchelli.

Using the refinement equation for evaluating integrals of wavelets. Siam J. ,Vumer Anal., 30:507-537. 1993.

141 I. Daubechies. Ten Lectures on Wavedets. SIAM, Philadelphia. Penn., 1992.

151 D. L. Donoho and I. M. Johnstone. Ideal spatial adaptation via wavelet shrinkage. 81:425-455, 1994.

BiometriLa.

PI D. L. Donoho, I. M. Johnstone, G. Kerky-

acharian, and D. Picard. Wavelet shrinkage: Asymptopia? Journal of the Royal Statistical Society, Series B, 57:301-369, 1995. Computing refinable Integrals Documentation of the Program. Texas A&M Preprint , 1995.

171 A. Kunoth.

181 S. Mallat and S. Zhong. Characterization of signals from multiscale edges. IEEE Transa&ons on Pattern Analysis and Machine Intelligence, 14(7):710 - 732, 1992.

A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Pattern Anal. Machine Intel., 11(7):674-693, 1989.

191 S.G. Mallat.