Some simulations and applications of forecasting long-memory time-series models

Journal of Statistical Planning and Inference 80 (1999) 269–287

www.elsevier.com/locate/jspi

Valdério Anselmo Reisen^a,*, Sílvia Lopes^b

^a Departamento de Estatística, CCE, Universidade Federal do Espírito Santo, Goiabeiras, Av. Fernando Ferrari, S/N, Vitória, E.S., 29070-900, Brazil
^b Departamento de Estatística, Instituto de Matemática, UFRGS, Av. Bento Gonçalves, 9500, 91540-000, Porto Alegre, RS, Brazil

Received 13 September 1997; received in revised form 16 June 1998; accepted 16 June 1998

Abstract

In this paper, we show some results of forecasting based on the ARFIMA(p, d, q) and ARIMA(p, d, q) models. We show, by simulation, that the forecasting technique for the ARIMA(p, d, q) model can also be used when d is fractional, i.e., for the ARFIMA(p, d, q) model. We also conduct a simulation study to compare two estimators of d obtained through regression methods. They are used in a hypothesis test to decide whether or not the series has the long-memory property, and are compared on the basis of their k-step ahead forecast errors. The properties of long-memory models are also investigated using an actual data set. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Long memory; Fractional; Forecasting; Smoothed and periodogram regressions

1. Introduction

Fractionally integrated processes have recently proved to be useful tools in the analysis of time series with long-range dependence. The ARFIMA(p, d, q) process shows the characteristic of long memory when the parameter d (the degree of differencing) takes values in (0.0, 0.5). The process is short memory when d ∈ (−0.5, 0.0]; see for instance McLeod and Hipel (1978) and Hosking (1981). The characteristics of long or short memory for the ARFIMA(p, d, q) process can be seen in the shapes of the spectral density and the autocorrelation functions. For d ∈ (0.0, 0.5), $\sum_j |\rho_j|$ diverges, with $\rho_j$ being the autocorrelation function of the process, and the spectral density becomes unbounded as the frequency approaches zero. Hence the type of dependence between observations is determined essentially by the fractional

* Corresponding author. Tel.: +55-27-3256974; fax: +55-27-3256974. E-mail addresses: [email protected] (V.A. Reisen); [email protected] (S. Lopes)

0378-3758/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0378-3758(98)00254-7


parameter d. There are many empirical studies related to the estimation of d, e.g., Geweke and Porter-Hudak (1983), Hassler (1993), Reisen (1993, 1994), Chen et al. (1994), Anh and Kavalieris (1994) and Taqqu et al. (1995). A good review of long-memory processes may be found in Beran (1994).

This paper is concerned with the problem of forecasting a time series that possibly exhibits long-memory features. We show the suitability of the standard ARIMA forecasting technique for the case where the underlying process has long-memory properties, i.e., for the ARFIMA process. By simulation, we investigate the use of two estimates of d in the forecast procedure. The estimators are used to decide whether or not the series has long-memory properties and are also applied to obtain k-step ahead forecasts. This work also includes an empirical application using both a long-memory model and a conventional ARMA model.

The outline of this paper is as follows: in Section 2, we summarise results related to the ARFIMA(p, d, q) model and the estimation of d. Section 3 describes the forecasting technique for long-memory models and presents simulation results. In Section 4, the wind speed data is analysed using both long- and short-memory models. Finally, in Section 5, we present the conclusions of this work.

2. The ARFIMA(p, d, q) model

We now summarise some results for the ARFIMA(p, d, q) model and for estimating the fractional parameter d. A more comprehensive account can be found in Hosking (1981) and Reisen (1994). Let $\{\epsilon_t\}$ be a white-noise process with $E(\epsilon_t) = 0$, $0 < \sigma^2 < \infty$, and let B be the back-shift operator, i.e., $BX_t = X_{t-1}$. Let $\phi(B)$ and $\theta(B)$ be polynomials of orders p and q, respectively, where $\phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$ have all their roots outside the unit circle. If $\{X_t\}$ is a linear process satisfying

$$\phi(B)(1-B)^d X_t = \theta(B)\epsilon_t, \quad d \in (-0.5, 0.5), \qquad (2.1)$$

then $\{X_t\}$ is called an ARFIMA(p, d, q) process, with d being the degree of integration. In this work $\{X_t\}$ is a linear process without a deterministic term. The term $(1-B)^d$, for $d \in \mathbb{R}$, is defined by the binomial expansion

$$(1-B)^d = \sum_{k=0}^{\infty} \binom{d}{k}(-B)^k = 1 - dB - \frac{d(1-d)}{2!}B^2 - \cdots.$$

According to Hosking (1981), the ARFIMA(p, d, q) process defined in Eq. (2.1) is stationary and invertible, and its spectral density $f(w)$ is given by

$$f(w) = f_u(w)\,(2\sin(w/2))^{-2d}, \quad w \in [-\pi, \pi], \qquad (2.2)$$

where the function $f_u(w)$ is the spectral density of an ARMA(p, q) process.

Hassler (1993) shows that the coefficients $\psi_k$ of the infinite MA representation of the ARFIMA(p, d, q) process decay hyperbolically, in the sense that for some real constant b,

$$\psi_k \cong \frac{b\,k^{d-1}}{(d-1)!} \quad \text{as } k \to \infty. \qquad (2.3)$$

The sum of the squares of the coefficients in (2.3) is finite, hence $\{X_t\}$ is a general linear process.

2.1. Estimates of d

Consider the set of harmonic frequencies $w_j = 2\pi j/n$, $j = 0, 1, \ldots, [n/2]$, where n is the sample size. By taking the logarithm of the spectral density $f(w)$ and adding $\ln f_u(0)$ to both sides we have

$$\ln f(w_j) = \ln f_u(0) - d \ln(2\sin(w_j/2))^2 + \ln\{f_u(w_j)/f_u(0)\}. \qquad (2.4)$$
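As an illustration of how Eq. (2.4) yields a least-squares estimate of d, the sketch below (our own illustration, not the authors' code; the function names and the periodogram normalisation are assumptions) regresses the log-periodogram on $x_j = \ln(2\sin(w_j/2))^2$ over the first $g(n) = n^{0.5}$ harmonic frequencies:

```python
import numpy as np

def regress_d(x, y):
    """OLS slope of y on x; Eq. (2.4) implies slope = -d, so return its negative."""
    xbar = x.mean()
    slope = np.sum((x - xbar) * (y - y.mean())) / np.sum((x - xbar) ** 2)
    return -slope

def periodogram_regression_d(series, alpha=0.5):
    """Regression estimate of d over the first g(n) = n**alpha frequencies."""
    n = len(series)
    g = int(n ** alpha)
    j = np.arange(1, g + 1)
    w = 2.0 * np.pi * j / n
    # raw periodogram I(w_j) via the FFT of the mean-corrected series
    fft = np.fft.fft(series - np.mean(series))
    I = (np.abs(fft[1:g + 1]) ** 2) / (2.0 * np.pi * n)
    x = np.log((2.0 * np.sin(w / 2.0)) ** 2)
    return regress_d(x, np.log(I))
```

On exact synthetic data of the form $y = c - d\,x$ the OLS step recovers d exactly; on a simulated series the estimate is random, with the asymptotic variance quoted below.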

We consider two estimators of the parameter d, obtained from regression equations constructed from Eq. (2.4): one uses the periodogram and the other the smoothed periodogram. The former, hereafter denoted by $\hat d_p$, was suggested by Geweke and Porter-Hudak (1983), who use the periodogram as an estimate of the spectral density in Eq. (2.4). The authors show that $\hat d_p$ is asymptotically normally distributed with $E(\hat d_p) = d$ and variance given by

$$\mathrm{var}(\hat d_p) = \frac{\pi^2}{6\sum_{i=1}^{g(n)}(x_i - \bar x)^2},$$

where g(n) is a function of n and $x_j = \ln(2\sin(w_j/2))^2$. The latter, referred to in this paper as $\hat d_{sp}$, was suggested by Reisen (1994). This regression estimator is obtained by replacing the spectral density in Eq. (2.4) by the smoothed periodogram with the Parzen lag window. Reisen (1994) shows that $\hat d_{sp}$ is asymptotically normally distributed with $E(\hat d_{sp}) = d$ and variance given by

$$\mathrm{var}(\hat d_{sp}) \approx 0.539285\,\frac{m}{n\sum_{i=1}^{g(n)}(x_i - \bar x)^2},$$

where m is a function of n, usually referred to as the truncation point of the Parzen lag window ($m = n^{\beta}$, $0 < \beta < 1$).

Because the autocorrelation function of the ARFIMA(p, d, q) process is not summable for d in (0.0, 0.5), the theoretical results for both regression estimates hold only in the case where d is negative. However, many researchers have shown by simulation that these estimators can also be applied in the case d > 0 (see, for instance, Reisen, 1994).

3. Forecasting the process

In this section, we show that the technique of minimum mean-square-error forecasting for the ARIMA(p, d, q) model when d is an integer can also be used when d is

272

V.A. Reisen, S. Lopes / Journal of Statistical Planning and Inference 80 (1999) 269–287

fractional. As noted in the theoretical results described in Section 2, the ARFIMA(p, d, q) model is invertible for d > −0.5. This enables us to obtain the k-step ahead forecasts for the ARFIMA(p, d, q) model even when d is fractional. Since the sum of the squares of the coefficients of the infinite MA representation is finite, the variance of the forecast error is also finite.

Suppose that we have the observations $X_n, X_{n-1}, \ldots$, and we would like to forecast the value $X_{n+k}$, k steps ahead (k time units into the future, k > 0). The minimum mean-square-error forecast, denoted by $\hat X_n(k)$, is then given by

$$\hat X_n(k) = E(X_{n+k} \mid X_n, X_{n-1}, \ldots). \qquad (3.1)$$

It can be shown (see, e.g., Wei, 1990) that this minimises the mean-square error $E(X_{n+k} - \hat X_n(k))^2$ of the forecast. The forecast error $e_n(k)$ is given by

$$e_n(k) = X_{n+k} - \hat X_n(k). \qquad (3.2)$$

3.1. Forecasting the ARFIMA(p, d, q) model

Let $\{X_t\}$ be an ARFIMA(p, d, q) model as defined in Section 2. Using Eq. (3.1) and the infinite MA and AR representations of the process, we may derive the forecast $\hat X_n(k)$, the forecast error $e_n(k)$ and the variance of the forecast error, $\mathrm{var}(e_n(k))$, for all lead times k, as follows. The infinite AR and MA representations at time t + k are

$$X_{t+k} = -\sum_{j=1}^{\infty} \pi_j X_{t+k-j} + \epsilon_{t+k}, \qquad (3.3)$$

$$X_{t+k} = \sum_{j=0}^{\infty} \psi_j \epsilon_{t+k-j}, \qquad (3.4)$$

where $\psi_k$ and $\pi_k$ are the coefficients of $B^k$ in the expansions of

$$\psi(B) = \frac{\theta(B)}{\phi(B)}(1-B)^{-d} \quad \text{and} \quad \pi(B) = \frac{\phi(B)}{\theta(B)}(1-B)^{d},$$

respectively. Let t = n; applying the result of Eq. (3.1) to Eq. (3.3), and using properties of the conditional expectation, we may obtain $\hat X_n(k)$ as follows:

$$\hat X_n(k) = E(X_{n+k} \mid X_n, X_{n-1}, \ldots) = -\sum_{j=1}^{\infty} \pi_j \hat X_n(k-j) \quad \text{for } k \geq 1. \qquad (3.5)$$

In the above, $\hat\epsilon_n(j) = E(\epsilon_{n+j} \mid X_n, X_{n-1}, \ldots) = 0$ and $\hat X_n(j) = E(X_{n+j} \mid X_n, X_{n-1}, \ldots)$ for $j \geq 1$, while $\hat X_n(j) = X_{n+j}$ for $j \leq 0$.

Now we may obtain the forecast error $e_n(k)$ using the infinite MA representation given by Eq. (3.4). We have

$$\hat X_n(k) = \sum_{j=k}^{\infty} \psi_j \epsilon_{n+k-j}. \qquad (3.6)$$

In the above, $\hat\epsilon_n(j) = 0$ for $j \geq 1$ and $\hat\epsilon_n(j) = \epsilon_{n+j}$ for $j \leq 0$. The forecast error is

$$e_n(k) = X_{n+k} - \hat X_n(k) = \sum_{j=0}^{k-1} \psi_j \epsilon_{n+k-j}. \qquad (3.7)$$

By Eq. (2.3), $\sum_{j=0}^{\infty} \psi_j^2 < \infty$, so $\mathrm{var}(e_n(k)) = \sigma^2 \sum_{j=0}^{k-1} \psi_j^2$ is finite for all k.
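For the pure fractional ARFIMA(0, d, 0) case, the π and ψ weights reduce to simple ratio recursions, and Eq. (3.5) becomes a truncated autoregression on the observed values. A minimal sketch (our own illustration; truncating the recursion at the available history is an assumption, cf. Section 3.2):

```python
import numpy as np

def pi_weights(d, m):
    """Coefficients pi_1..pi_m of (1 - B)^d, so that (3.3) reads
    X_t = -sum_j pi_j X_{t-j} + eps_t for an ARFIMA(0, d, 0) process."""
    c = np.empty(m + 1)
    c[0] = 1.0
    for k in range(1, m + 1):
        c[k] = c[k - 1] * (k - 1 - d) / k   # binomial-coefficient recursion
    return c[1:]

def psi_weights(d, m):
    """MA(inf) coefficients psi_0..psi_m of (1 - B)^(-d)."""
    c = np.empty(m + 1)
    c[0] = 1.0
    for k in range(1, m + 1):
        c[k] = c[k - 1] * (k - 1 + d) / k
    return c

def forecast(x, d, k):
    """k-step ahead forecasts via the recursion (3.5), truncated to the history."""
    hist = list(x)
    pis = pi_weights(d, len(x) + k)
    preds = []
    for _ in range(k):
        # pi_1 multiplies the most recent value, pi_2 the one before, ...
        xhat = -sum(p * v for p, v in zip(pis, reversed(hist)))
        preds.append(xhat)
        hist.append(xhat)   # forecasts feed back into longer horizons
    return preds
```

The forecast-error variance $\sigma^2\sum_{j=0}^{k-1}\psi_j^2$ of (3.7) then follows by summing squares of `psi_weights`.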

3.2. Simulation study

We conduct two simulation studies in this section. The first shows graphically the properties of forecasting, including the forecasts, the confidence limits of the forecasts, and the estimated variances of the forecast error for one- and two-step ahead forecasts. The second investigates the estimates of the fractional parameter d, the hypothesis test, the estimates of the coefficients of the ARMA model that results after removing the fractional parameter from the original series, and the forecast results comparing ARMA and ARFIMA models. In both studies many series were simulated for different values of the orders p and q and of the fractional parameter d. Since all simulations have shown that the forecasting theory of the ARIMA(p, d, q) model can also be used for the ARFIMA(p, d, q) process, and due to limitations of space, we present here only the analysis related to the ARFIMA(0, d, 0), ARFIMA(1, d, 0) and ARFIMA(0, d, 1) processes. The series were generated using the procedure suggested by Hosking (1984) with $\epsilon_t \sim N(0, 1)$.

The coefficients of the infinite MA and AR representations of the ARFIMA(0, d, 0) process are given by

$$\psi_k = \frac{(k+d-1)!}{k!\,(d-1)!} \quad \text{and} \quad \pi_k = \frac{(k-d-1)!}{k!\,(-d-1)!}.$$

The coefficients of the infinite MA and AR representations of the ARFIMA(1, d, 0) process are given by

$$\psi_k = \frac{(k+d-1)!}{k!\,(d-1)!}\,F(1, -k; 1-d-k; \phi)$$

and

$$\pi_k = \frac{(k-d-2)!}{(k-1)!\,(-d-1)!}\left(1 - \phi - \frac{1+d}{k}\right).$$

In the above, the function F(a, b; c; x) is the hypergeometric function given in Hosking (1981) and $\phi$ is the coefficient of the AR(1) model.

3.2.1. Graphical results

The properties of forecasts for simulated samples of the ARFIMA(0, d, 0), d = 0.33, and ARFIMA(1, d, 0), d = 0.3 and $\phi$ = 0.4, models are shown graphically in Figs. 1 and 2, respectively. The sample size is 300 and the results refer to the mean value over 20 replications. The infinite sum in Eq. (3.6) is truncated at j = 300. The forecasts are


Fig. 1. ARFIMA(0, d, 0), d = 0.33: (a) forecasting; (b) confidence intervals; (c) estimates of the forecast variances (real variances: one-step ahead $\sigma^2$ = 1.0, two-step ahead $\sigma^2$ = 1.1).


Fig. 1. (continued).

made for observations $X_{241}$ to $X_{300}$. The 95% limits are given by $\hat X_n(k) \pm 1.96\,(\sigma/\sqrt{20})$. The figures show the forecasts, the confidence limits of the forecasts, and the estimated variances of the forecast error for one- and two-step ahead forecasts. A comprehensive study of the estimates of the variance of k-step ahead forecast errors is presented by Reisen and Abraham (1998).

The residual analysis of the one-step ahead forecast was checked as follows: (a) plotting the normal scores versus the residuals (with normally distributed data, the points should fall approximately on a straight line); (b) computing the sample correlation coefficient between the residuals and the corresponding normal scores; (c) using the run test to check whether the residuals are independent.

The plots of the normal scores versus the residuals for the observations $X_{241}$ to $X_{300}$ of both simulated examples, the ARFIMA(0, d, 0) and the ARFIMA(1, d, 0) models, are reasonably straight. The sample correlation coefficient between the residuals and the corresponding normal scores was calculated to be 0.988 in both cases; we would not reject normality at any of the usual significance levels. In both examples, the test of randomness (the run test) gave an observed significance level of 0.6234 (this probability corresponds to the ordinate value 0.31 of the standard normal distribution). Consequently, the residuals pass this test of independence. The residual analysis of the one-step ahead forecast thus shows that the residuals preserve the characteristics of a white-noise process.
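The run test used in (c) can be sketched as a Wald-Wolfowitz runs test about the residual mean (our own minimal implementation; the paper does not specify its exact variant):

```python
import math

def runs_test_z(res):
    """Z-score of the number of sign runs of the residuals about their mean,
    compared with the normal approximation under independence."""
    mean = sum(res) / len(res)
    signs = [r >= mean for r in res]
    runs = 1 + sum(s != t for s, t in zip(signs, signs[1:]))
    n1 = sum(signs)                 # residuals at or above the mean
    n2 = len(signs) - n1            # residuals below the mean
    mu = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (2.0 * n1 * n2 * (2.0 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (runs - mu) / math.sqrt(var)
```

A |z| well beyond 1.96 rejects independence; e.g. a strictly alternating sequence has far too many runs and a large positive z.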


Fig. 2. ARFIMA(1, d, 0), d = 0.3, $\phi$ = 0.4: (a) forecasting; (b) confidence intervals; (c) estimates of the forecast variances (real variances: one-step ahead $\sigma^2$ = 1.0, two-step ahead $\sigma^2$ = 1.49).


Fig. 2. (continued).

3.2.2. Modelling and forecasting simulated results

We generate time series of n = 300 observations from ARFIMA(0, d, 0), ARFIMA(1, d, 0) and ARFIMA(0, d, 1) processes with 0 < d < 0.5. In our simulations we investigate the following sequence: (a) the hypothesis test to decide whether or not the process has long memory; (b) the bias of the estimates of the coefficients of the ARMA model after removing the fractional parameter from the series; (c) the forecast analysis comparing ARMA and ARFIMA models.

In (a) we conduct the test of $H_0$: d = 0 against $H_1$: d ≠ 0, where the series has fractional parameter d ∈ (0.0, 0.5). To test the hypothesis we use the asymptotic normal distribution of both regression estimators, as described in Section 2, with significance level $\alpha$ = 5%. The results are summarised in Table 1. We consider 200 replications; the number of regression observations is $g(n) = n^{0.5}$ for both estimators, and the truncation point of the Parzen lag window is fixed at $m = n^{0.9}$ (as suggested in Reisen, 1994). Table 1 presents the mean value of the estimates (denoted by $d^*$), their standard deviations s, the frequencies of rejecting d = 0 using the asymptotic and the least-squares sample variances (denoted by $f_1$ and $f_2$, respectively), and the estimate of the mean-square error (mse).

Both estimators have comparable bias, and it depends on the coefficients of the model. In the ARFIMA(0, d, 1) model the bias of both estimators increases substantially for large $\theta$, and the frequency of rejecting d = 0 also decreases (an extended study of the bias of these and other estimators can be


Table 1
Results of estimating d and testing d = 0 (0 < d < 0.5) for the ARFIMA(1, d, 0) and ARFIMA(0, d, 1) models (SP = smoothed periodogram estimator, P = periodogram estimator)

                  d = 0.2            d = 0.3            d = 0.4
                  SP       P         SP       P         SP       P

φ = θ = 0
  d*              0.1486   0.1902    0.2420   0.3027    0.3562   0.4185
  s               0.1549   0.2063    0.1513   0.1913    0.1609   0.1997
  f1              46.75%   16.50%    70.50%   53.00%    87.00%   56.00%
  f2              59.00%   22.00%    79.25%   41.25%    90.00%   61.75%
  mse             0.0266   0.0426    0.0262   0.0365    0.0278   0.0401

φ = 0.2
  d*              0.1645   0.2135    0.2736   0.3269    0.3655   0.4231
  s               0.1529   0.1973    0.1545   0.1931    0.1630   0.2128
  f1              52.25%   18.50%    77.50%   35.75%    89.00%   60.50%
  f2              62.75%   22.00%    83.50%   46.00%    91.50%   64.75%
  mse             0.0246   0.0392    0.0245   0.0379    0.0279   0.0457

φ = 0.3
  d*              0.1600   0.2023    0.2567   0.3118    0.3830   0.4439
  s               0.1619   0.2128    0.1609   0.2017    0.1637   0.2090
  f1              53.75%   18.75%    73.25%   34.00%    91.25%   62.25%
  f2              63.25%   25.50%    79.50%   41.50%    92.05%   64.50%
  mse             0.0277   0.0452    0.0277   0.0407    0.0270   0.0455

φ = 0.5
  d*              0.1889   0.2421    0.2941   0.3609    0.3767   0.4354
  s               0.1500   0.1942    0.1616   0.1960    0.1703   0.2154
  f1              58.50%   21.00%    77.25%   42.75%    89.50%   58.00%
  f2              67.00%   31.00%    81.75%   48.75%    91.25%   62.25%
  mse             0.0226   0.0394    0.0261   0.0420    0.0294   0.0475

φ = 0.7
  d*              0.2693   0.3166    0.3739   0.4278    0.4792   0.5338
  s               0.1507   0.1949    0.1584   0.2052    0.1565   0.2030
  f1              75.25%   35.00%    91.75%   58.25%    97.50%   77.50%
  f2              81.00%   42.75%    94.25%   64.75%    97.25%   79.50%
  mse             0.0275   0.0515    0.0305   0.0584    0.0307   0.0590

θ = 0.2
  d*              0.1618   0.2112    0.2275   0.2769    0.3415   0.4012
  s               0.1572   0.2081    0.1587   0.2022    0.1568   0.1987
  f1              54.00%   17.50%    66.75%   29.00%    89.00%   52.25%
  f2              64.50%   24.25%    74.75%   34.35%    91.75%   61.75%
  mse             0.0261   0.0433    0.0303   0.0413    0.0279   0.0395

θ = 0.3
  d*              0.1519   0.1986    0.2286   0.2751    0.3532   0.4171
  s               0.1565   0.2005    0.1612   0.2140    0.1625   0.2069
  f1              51.25%   16.00%    68.50%   31.50%    88.50%   59.85%
  f2              63.75%   23.75%    77.00%   38.00%    91.75%   64.50%
  mse             0.0268   0.0401    0.0311   0.0463    0.0285   0.0430

θ = 0.5
  d*              0.1028   0.1411    0.1966   0.2398    0.3135   0.3718
  s               0.1459   0.1918    0.1520   0.1977    0.1694   0.2202
  f1              36.75%   7.75%     58.75%   23.25%    82.50%   50.25%
  f2              52.00%   14.75%    67.50%   29.25%    89.25%   57.75%
  mse             0.0307   0.0402    0.0337   0.0427    0.0361   0.0401

θ = 0.7
  d*              0.0149   0.0650    0.1450   0.2026    0.2355   0.2826
  s               0.1685   0.2175    0.1634   0.2145    0.1565   0.2020
  f1              27.25%   8.50%     50.00%   16.50%    68.25%   31.00%
  f2              42.25%   10.00%    59.50%   23.00%    77.00%   37.00%
  mse             0.0626   0.0654    0.0507   0.0554    0.0515   0.0589

Asymptotic
standard
deviation         0.0867   0.2020    0.0867   0.2020    0.0867   0.2020
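The rejection frequencies $f_1$ in Table 1 correspond to a two-sided asymptotic z-test of $H_0$: d = 0; with the asymptotic standard deviations from the last row of the table, the decision rule is a one-line check (illustrative sketch, ours):

```python
def reject_d_zero(d_hat, std_err, z_crit=1.96):
    """Two-sided z-test of H0: d = 0 at the 5% level."""
    return abs(d_hat / std_err) > z_crit
```

For example, an estimate of 0.2 rejects $H_0$ under the smoothed-periodogram asymptotic standard deviation 0.0867, but not under the periodogram's larger value 0.2020, which is consistent with $f_1$ being much higher for $\hat d_{sp}$.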

V.A. Reisen, S. Lopes / Journal of Statistical Planning and Inference 80 (1999) 269–287

279

found in Reisen (1994), Chen et al. (1995) and Hassler (1993)). The least-squares sample variances underestimate the asymptotic variances, and this is the reason $f_2$ is always bigger than $f_1$. The estimator $\hat d_{sp}$ always rejects $H_0$: d = 0 more frequently than $\hat d_p$. The frequencies increase as d and $\phi$ get bigger, and the bias of both estimators also increases. The results show that $\hat d_{sp}$ is more powerful than $\hat d_p$ for testing $H_0$: d = 0 (0.0 < d < 0.5).

The results related to (b) and (c) are presented in Tables 2 and 3. These results refer to the analysis of modelling and forecasting the ARFIMA(1, d, 0) process. In the modelling step, we estimate the fractional parameter d, perform the test of d = 0 against d ≠ 0, and identify and estimate the parameters of the resulting ARMA model. To identify the appropriate number of parameters the AIC criterion (Akaike, 1973) is used, and to estimate their values we use the MINITAB package and NAG Fortran subroutines. We consider simulated series where the use of $\hat d_p$ does not reject d = 0 while $\hat d_{sp}$ gives d ≠ 0 with $\alpha$ = 5%. Hence, for the estimated value given by $\hat d_{sp}$, the original series is differenced. These results are shown in Table 2, columns 1 to 4 (z refers to the test statistic and $\hat\sigma^2$ is the estimate of $\sigma^2$). We investigated several ARFIMA(1, d, 0) processes; however, we present only the cases d = 0.2, 0.3, 0.4 and $\phi$ = 0.2, 0.5, 0.7, since the pattern is the same in the other cases.

By considering d = 0 (as suggested by $\hat d_p$) an ARMA(p, q) model is fitted to the original series. We notice that the order of the model varies and that the estimates of the coefficients tend to overestimate the true values. This is expected, since the original series is a long-memory process. When the original series is differenced by $\hat d_{sp}$, i.e., when $\hat d_{sp}$ suggests d ≠ 0, the AIC criterion is used to identify the order of the model. Crato and Ray (1996) and Smith et al. (1997) present studies of selection criteria for identifying the order of the ARFIMA model, and they point out that the bias of the estimators of d can cause model selection criteria to select an incorrect ARFIMA specification. However, in our simulations the AIC criterion indicates, in all considered cases, the same number of parameters as the original series, i.e., the resulting model is the AR(1) model. The bias of the coefficients of the resulting AR(1) model depends on the values of d and $\phi$: the bias increases when the parameter values increase. A comprehensive study of the bias of the coefficients after the series have been differenced is presented in Reisen et al. (1998); the authors compare both regression estimators and show that the bias of the coefficients of the resulting ARMA(p, q) model is smaller when $\hat d_{sp}$ is used.

The forecast analysis comparing ARMA and ARFIMA models is presented in Table 2 (columns 5–8). The forecasts k = 1, 3 and 5 steps ahead are made from origins $X_{260}$ to $X_{300}$ by using an approximation of Eq. (3.5), which is given by

$$\hat X_n(k) \cong -\sum_{j=1}^{n+k-1} \hat\pi_j \hat X_n(k-j). \qquad (3.8)$$

Then we obtain the following quantities: the mean forecast error (mfe)

$$\bar e(k) = \frac{\sum_{j=260}^{300-k} e_j(k)}{41-k},$$


Table 2
Results of modelling and forecasting the ARFIMA(1, d, 0) model. Parameters are listed as (φ, d, θ) and standard errors are in parentheses after each coefficient. For each forecast horizon, entries are mfe with s in parentheses and mse in brackets.

(φ, d, θ) = (0.2, 0.2, 0.0)
  i = p:  d̂ = 0.199 (z = 0.889); fitted AR(2): 0.399 (0.057), 0.185 (0.057); σ̂² = 1.087
          1-step −0.1788 (1.089) [1.209];  3-step −0.2766 (1.136) [1.368];  5-step −0.3137 (1.182) [1.496]
  i = sp: d̂ = 0.199 (z = 2.924); fitted AR(1): 0.239 (0.057); σ̂² = 1.100
          1-step −0.1641 (1.082) [1.197];  3-step −0.2431 (1.159) [1.404];  5-step −0.2504 (1.203) [1.510]

(φ, d, θ) = (0.5, 0.2, 0.0)
  i = p:  d̂ = 0.219 (z = 1.760); fitted AR(1): 0.717 (0.040); σ̂² = 0.868
          1-step −0.0643 (0.926) [0.862];  3-step −0.1759 (1.612) [1.381];  5-step −0.1777 (1.164) [1.386]
  i = sp: d̂ = 0.181 (z = 4.480); fitted AR(1): 0.528 (0.049); σ̂² = 0.887
          1-step −0.0469 (0.925) [0.858];  3-step −0.1353 (1.157) [1.358];  5-step −0.1246 (1.170) [1.385]

(φ, d, θ) = (0.7, 0.2, 0.0)
  i = p:  d̂ = 0.183 (z = 0.960); fitted AR(1): 0.818 (0.033); σ̂² = 0.822
          1-step −0.0685 (1.093) [1.199];  3-step −0.1096 (1.557) [2.437];  5-step −0.0396 (1.596) [2.550]
  i = sp: d̂ = 0.194 (z = 1.966); fitted AR(1): 0.670 (0.043); σ̂² = 0.829
          1-step −0.0549 (1.094) [1.200];  3-step −0.0731 (1.576) [2.471];  5-step 0.0190 (1.611) [2.596]

(φ, d, θ) = (0.2, 0.3, 0.0)
  i = p:  d̂ = 0.330 (z = 1.570); fitted ARMA(1,1): 0.774 (0.061), 0.315 (0.092); σ̂² = 0.794
          1-step 0.1919 (0.834) [0.733];  3-step 0.2947 (0.825) [0.768];  5-step 0.4269 (0.782) [0.794]
  i = sp: d̂ = 0.287 (z = 4.290); fitted AR(1): 0.206 (0.058); σ̂² = 0.790
          1-step 0.1663 (0.835) [0.723];  3-step 0.2525 (0.822) [0.739];  5-step 0.3565 (0.793) [0.755]

(φ, d, θ) = (0.5, 0.3, 0.0)
  i = p:  d̂ = 0.312 (z = 1.850); fitted AR(1): 0.785 (0.036); σ̂² = 0.703
          1-step −0.1924 (0.833) [0.731];  3-step −0.4264 (1.155) [1.516];  5-step −0.5248 (1.260) [1.863]
  i = sp: d̂ = 0.308 (z = 4.88); fitted AR(1): 0.489 (0.051); σ̂² = 0.712
          1-step −0.1606 (0.851) [0.750];  3-step −0.3413 (1.220) [1.606];  5-step −0.3920 (1.357) [1.997]

(φ, d, θ) = (0.7, 0.3, 0.0)
  i = p:  d̂ = 0.278 (z = 1.840); fitted ARMA(1,1): 0.822 (0.038), −0.167 (0.066); σ̂² = 0.637
          1-step −0.0880 (0.863) [0.754];  3-step −0.2490 (1.325) [1.819];  5-step −0.2290 (1.306) [1.758]
  i = sp: d̂ = 0.328 (z = 6.210); fitted AR(1): 0.625 (0.046); σ̂² = 0.646
          1-step −0.0370 (0.879) [0.776];  3-step −0.1060 (1.360) [1.386];  5-step −0.0263 (1.381) [1.909]

(φ, d, θ) = (0.2, 0.4, 0.0)
  i = p:  d̂ = 0.269 (z = 1.660); fitted AR(2): 0.612 (0.057), 0.165 (0.057); σ̂² = 0.442
          1-step 0.0217 (0.590) [0.349];  3-step 0.094 (0.738) [0.554];  5-step 0.080 (0.788) [0.627]
  i = sp: d̂ = 0.385 (z = 5.500); fitted AR(1): 0.254 (0.056); σ̂² = 0.452
          1-step 0.009 (0.600) [0.360];  3-step 0.046 (0.766) [0.589];  5-step 0.012 (0.833) [0.694]


Table 2 (continued)

(φ, d, θ) = (0.5, 0.4, 0.0)
  i = p:  d̂ = 0.453 (z = 0.930); fitted AR(1): 0.764 (0.037); σ̂² = 0.421
          1-step −0.035 (0.497) [0.248];  3-step −0.078 (0.613) [0.382];  5-step −0.076 (0.604) [0.372]
  i = sp: d̂ = 0.378 (z = 5.710); fitted AR(1): 0.370 (0.050); σ̂² = 0.422
          1-step −0.016 (0.500) [0.251];  3-step −0.028 (0.627) [0.395];  5-step −0.005 (0.633) [0.401]

(φ, d, θ) = (0.7, 0.4, 0.0)
  i = p:  d̂ = 0.354 (z = 1.630); fitted ARMA(1,1): 0.885 (0.029), −0.188 (0.062); σ̂² = 0.452
          1-step 0.064 (0.628) [0.399];  3-step 0.133 (0.927) [0.865];  5-step 0.197 (0.921) [0.887]
  i = sp: d̂ = 0.392 (z = 4.190); fitted AR(1): 0.706 (0.042); σ̂² = 0.456
          1-step 0.018 (0.645) [0.417];  3-step −0.003 (0.970) [0.941];  5-step −0.005 (0.965) [0.932]

Table 3
The percentage relative increase in the mean forecast error and the ratio $s_1(k)/s_2(k)$ (in parentheses)

Model (φ, d, θ)     1-step             3-step             5-step
(0.2, 0.2, 0.0)     8.9%    (1.00)     13.8%   (0.980)    25.3%    (0.982)
(0.5, 0.2, 0.0)     37.0%   (1.00)     30.0%   (1.390)    42.6%    (0.990)
(0.7, 0.2, 0.0)     24.7%   (0.999)    49.9%   (0.987)    108.4%   (0.990)
(0.2, 0.3, 0.0)     15.4%   (0.998)    16.7%   (1.00)     19.7%    (0.986)
(0.5, 0.3, 0.0)     19.8%   (0.978)    24.93%  (0.947)    33.91%   (0.928)
(0.7, 0.3, 0.0)     137.83% (0.981)    134.9%  (0.974)    770.7%   (0.946)
(0.2, 0.4, 0.0)     141.0%  (0.980)    104.3%  (0.960)    566.6%   (0.950)
(0.5, 0.4, 0.0)     118.7%  (0.994)    178.5%  (0.977)    1420.0%  (0.955)
(0.7, 0.4, 0.0)     255.5%  (0.970)    4333.0% (0.960)    3840.0%  (0.950)

the sample variance

$$s^2(k) = \frac{\sum_{j=260}^{300-k} (e_j(k) - \bar e(k))^2}{40-k},$$

and the sample mean-square error

$$\mathrm{mse}(k) = \frac{\sum_{j=260}^{300-k} (e_j(k))^2}{40-k}.$$

The values of $\bar e(k)$ are usually much smaller for the ARFIMA model. The quantities s(k) and mse(k) have similar values for both models, and they increase as k increases. Table 3 summarises the forecast results of Table 2 through the percentage relative increase in the mean forecast error,

$$\frac{\bar e_1(k) - \bar e_2(k)}{\bar e_2(k)},$$

where $\bar e_i(k)$ (i = 1, 2) are the absolute values of the mean forecast errors of the ARMA and ARFIMA models, respectively. Table 3 also gives the ratio $s_1(k)/s_2(k)$ (the values in parentheses). We notice that the percentage relative increase in mfe grows steadily, showing that the ARFIMA model performs better.

We also evaluated the impact of the truncation point j, used in Eq. (3.8), on the forecast results. Truncation points were chosen by requiring $\hat\pi_j$ to be smaller than 0.01, 0.001 and 0.0001. We noticed that the value of j varies according to the coefficient values of the model, and that increasing j improves the forecasts. Hence, we fixed j such that $\hat\pi_j < 0.0001$; in this situation almost all observations of the series are included in the forecast procedure.

4. An example: the wind speed data

The wind speed data (wsd) were recorded every 5 min, from 00:00 to 23:55, at the SILSOE Research Institute on 17/05/91, in units of miles per second. The series has 288 observations and its values can be found in Reisen (1993). We let $X_i$ denote the ith wind speed value recorded (i = 1, 2, …, 288). The sample mean is 0.8 with sample variance 0.122. The plot of the wsd and its sample autocorrelation function are shown in Fig. 3. The series seems to be stationary, and the sample autocorrelation function suggests that the data may belong to a class of long-memory models. To fit an ARFIMA(p, d, q) model to the data we basically follow the same steps as in Section 3: (1) estimate d using the periodogram and the smoothed periodogram regression methods, again with $g(n) = n^{0.5}$ and $m = n^{0.9}$; (2) test $H_0$: d = 0 against $H_1$: d ≠ 0; (3) identify and estimate the parameters of the resulting ARMA(p, q) model; (4) check the residuals.

Step 1: The smoothed periodogram regression method gives $\hat d_{sp}$ = 0.299 with sample variance $s^2$ = 0.008 and $\mathrm{var}(\hat d_{sp})$ = 0.004. The periodogram regression method gives $\hat d_p$ = 0.289 with sample variance $s^2$ = 0.044 and $\mathrm{var}(\hat d_p)$ = 0.038.

Step 2: The test of $H_0$: d = 0 gave z = 4.72 for the smoothed periodogram estimator and z = 1.48 for the periodogram estimator. The approximate 95% confidence intervals for d are (0.175, 0.423) and (−0.09, 0.67), respectively.

Step 3: Let $\hat U_t = (1-B)^{0.299}(X_t - 0.8)$. Then, based on the assumption that $\{X_t\}$ is ARFIMA(p, $\hat d$, q), $\{\hat U_t\}$ is an approximate ARMA(p, q) process. Using the AIC criterion, the model chosen is ARFIMA(1, $\hat d$, 1):

$$(1 - 0.81B)(1-B)^{0.299}(X_t - 0.8) = (1 - 0.55B)\epsilon_t \quad \text{with } \hat\sigma^2 = 0.044.$$
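The fractional differencing in Step 3, $\hat U_t = (1-B)^{\hat d}(X_t - \bar X)$, can be carried out with the binomial-expansion weights of Section 2, truncated at the start of the sample. A sketch (our own illustration; `frac_diff` is a hypothetical helper name):

```python
import numpy as np

def frac_diff(x, d):
    """Apply (1 - B)^d to a (mean-corrected) series using the binomial
    weights c_0 = 1, c_k = c_{k-1} (k - 1 - d) / k, truncated at each t."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    c = np.empty(n)
    c[0] = 1.0
    for k in range(1, n):
        c[k] = c[k - 1] * (k - 1 - d) / k
    # u_t = sum_{k <= t} c_k x_{t-k}
    return np.array([np.dot(c[:t + 1], x[t::-1]) for t in range(n)])
```

As a sanity check, with d = 1 the weights collapse to (1, −1, 0, …) and the filter reduces to first differencing; with d = 0 the series is returned unchanged.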

Fig. 3. Time plot of the wsd and its sample autocorrelation function.

Fig. 4. (a) Time plot of the standardised residuals (95% confidence interval); (b) residual autocorrelations (95%); (c) histogram of the standardised residuals; (d) plot of the normal scores (C2) versus the standardised residuals (C1).


Fig. 4. (continued).

Step 4: Diagnostic checking of the above model. All the residual analyses were performed, and the conclusion is that the residuals of the fitted ARFIMA(1, $\hat d$, 1) model are from a white-noise process. Their sample mean is −0.014 with variance 0.0438. The residual analyses were made using plots of the estimated standardised residuals and of their sample autocorrelation function; both figures show a 95% confidence band. Normality was also checked by computing the histogram and plotting the normal scores versus the standardised residuals. The plots are shown in Fig. 4. The run test gave a probability value of 0.73, so the residuals pass this test of independence.

As we observed earlier, the periodogram regression test suggests that d = 0, i.e., the data may be modelled by a short-memory model. Based on the AIC criterion, the ARMA(1, 1) model was chosen. The resulting short-memory model fitted to the wsd is

$$(1 - 0.89B)(X_t - 0.8) = (1 - 0.35B)\epsilon_t \quad \text{with } \hat\sigma^2 = 0.044.$$


Table 4
Results of the mean forecast error and the mean-square error of k-step ahead forecasts

k                     1        2        3        4        5        8
ARFIMA(1, d̂, 1)
  ē(k)             −0.069   −0.120   −0.150   −0.176   −0.201   −0.236
  mse(k)            0.040    0.072    0.100    0.136    0.170    0.220
ARMA(1, 1)
  ē(k)             −0.096   −0.156   −0.204   −0.236   −0.269   −0.318
  mse(k)            0.045    0.081    0.116    0.156    0.195    0.255

4.1. Forecasting the process

In order to compare the two models we present the analysis of forecasting the wsd. The forecasts were performed for the values $X_{261}, X_{262}, \ldots, X_{288}$. The first 260 values of the wsd, i.e., $X_1, X_2, \ldots, X_{260}$, were used to estimate the parameters of the ARFIMA(1, d, 1) and ARMA(1, 1) models. The parameter estimates of both models are similar to those obtained before. The estimated value of d is $\hat d_{sp}$ = 0.3 with variance 0.008, and the sample mean of the series is 0.84. The fitted models are

$$(1 - 0.79B)(1-B)^{0.3}(X_t - 0.84) = (1 - 0.58B)\epsilon_t, \quad \hat\sigma^2 = 0.0443,$$

and

$$(1 - 0.86B)(X_t - 0.84) = (1 - 0.37B)\epsilon_t, \quad \hat\sigma^2 = 0.044.$$
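One-step forecasts from the fitted short-memory model follow the usual ARMA(1, 1) recursion, with innovations reconstructed recursively. A sketch (ours; starting the recursion at the mean with a zero innovation is an assumption):

```python
def arma11_one_step(x, phi, theta, mu):
    """One-step forecasts for (1 - phi*B)(X_t - mu) = (1 - theta*B)eps_t.
    Returns, for each observation x_t, the forecast of x_{t+1}."""
    preds = []
    prev_pred = mu                      # forecast of x_0 before any data
    for xt in x:
        eps = xt - prev_pred            # reconstructed innovation at time t
        pred = mu + phi * (xt - mu) - theta * eps
        preds.append(pred)
        prev_pred = pred
    return preds
```

With theta = 0 the recursion reduces to the AR(1) forecast $\hat X_t(1) = \mu + \phi(X_t - \mu)$; for the wsd one would pass the estimates phi = 0.86, theta = 0.37, mu = 0.84.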

Table 4 shows the mean forecast errors $\bar e(k)$ and the mean-square errors mse(k) for k = 1, 2, …, 8. The ARFIMA(1, $\hat d$, 1) model shows smaller values of $\bar e(k)$ and mse(k). The mean-square error of the ARMA model is more than 12% greater, and the difference increases with k, e.g., for k = 5, 8 it is 14.7% and 15.9%, respectively. This analysis suggests the use of the ARFIMA(1, $\hat d$, 1) model to fit the wsd.

5. Conclusion

The method of minimum mean-square-error forecasting, which is widely used for the non-stationary ARIMA(p, d, q) model, was applied here in the case of fractional d. We have demonstrated how the forecasting methods can be applied to forecast future values of a long-memory process. The accuracy of one- and two-step ahead forecasts was described graphically for two simulated examples from long-memory ARFIMA(p, d, q) models. To estimate the fractional parameter d, we applied the smoothed periodogram and the periodogram estimators; they were also used to test whether or not the process has long-memory properties. Finally, we applied the techniques of long-memory processes to real data, the wind speed data.

Results on forecasting ARFIMA processes are reported by Reisen and Abraham (1998) and Peiris and Singh (1996), and some of those results are in accordance with the ones presented in this paper. Ray (1993) and Crato and Ray (1996) have also discussed the forecasting of long-memory models. In general, the use of an ARFIMA(p, d, q)


process for forecasting a long-memory process must be further investigated, in the sense of verifying whether it is worth checking for fractional differencing and adopting the proper model when forecasting.

Acknowledgements

We would like to thank the referees for their valuable suggestions and the undergraduate students B. Zamprogno and E. Cerqueira for helping with some computations. V.A. Reisen was supported by CNPq, Brazil, and S. Lopes by Pronex Fenômenos Críticos em Probabilidade e Processos Estocásticos (FINEP/MCT/CNPq/41/96/0923).

References

Anh, V.V., Kavalieris, L., 1994. Long range dependence in models for air quality. In: Fletcher, D.J., Manly, B. (Eds.), Statistics in Ecology and Environmental Modelling. University of Otago Press, Dunedin, pp. 199–209.
Akaike, H., 1973. Maximum likelihood identification of Gaussian autoregressive moving average models. Biometrika 60 (2), 255–265.
Beran, J., 1994. Statistics for Long-Memory Processes. Chapman & Hall, New York.
Chen, G., Abraham, B., Peiris, S., 1994. Lag window estimation of the degree of differencing in fractionally integrated time series models. J. Time Ser. Anal. 15 (5), 473–487.
Crato, N., Ray, B., 1996. Model selection and forecasting for long-range dependent processes. J. Forecasting 15, 107–125.
Geweke, J., Porter-Hudak, S., 1983. The estimation and application of long memory time series models. J. Time Ser. Anal. 4 (4), 221–238.
Hassler, U., 1993. Regression of spectral estimators with fractionally integrated time series. J. Time Ser. Anal. 14, 369–380.
Hosking, J., 1981. Fractional differencing. Biometrika 68 (1), 165–176.
Hosking, J., 1984. Modeling persistence in hydrological time series using fractional differencing. Water Resources Res. 20 (12), 1898–1908.
McLeod, A.I., Hipel, W.K., 1978. Preservation of the rescaled adjusted range 1. A reassessment of the Hurst phenomenon. Water Resources Res. 14 (3), 491–508.
Peiris, S., Singh, N., 1996.
Predictors for seasonal and nonseasonal fractionally integrated ARIMA models. Biometrics J. 38 (6), 741–752.
Ray, B., 1993. Modelling long memory processes for optimal long-range prediction. J. Time Ser. Anal. 14 (5), 511–525.
Reisen, V.A., 1993. Long memory time series models. Ph.D. Thesis, Department of Mathematics, UMIST, Manchester, U.K.
Reisen, V.A., 1994. Estimation of the fractional difference parameter in the ARIMA(p, d, q) model using the smoothed periodogram. J. Time Ser. Anal. 15 (3), 335–350.
Reisen, V.A., Abraham, B., Lopes, S., 1998. Estimation of the parameters in a long memory time series model: a simulation study. Working paper 98-01, Department of Statistics and Actuarial Science, University of Waterloo, Canada.
Reisen, V.A., Abraham, B., 1998. Prediction of long memory time series models: a simulation study and an application. Working paper RR 98/04, IIQP, University of Waterloo, Canada.
Smith, J., Taylor, N., Yadav, S., 1997. Comparing the bias and misspecification in ARFIMA models. J. Time Ser. Anal. 18 (5), 507–527.
Taqqu, M.S., Teverovsky, V., Willinger, W., 1995. Estimators for long-range dependence: an empirical study. Fractals 3 (4), 785–788.
Wei, W., 1990. Time Series Analysis: Univariate and Multivariate Methods. Addison-Wesley, Reading, MA.