A comparative analysis of the predictive power of implied volatility indices and GARCH forecasted volatility


Physica A 424 (2015) 105–112


Sónia R. Bentes
ISCAL, Av. Miguel Bombarda 20, 1069-035 Lisbon, Portugal
BRU-IUL, Av. das Forças Armadas, 1649-025, Lisbon, Portugal

Highlights

• Time-dependent variance or conditional heteroskedasticity is a source of inefficiency.
• Time-dependent variance can be controlled by a GARCH volatility specification.
• Comparing GARCH and implied volatility forecasts is a robust procedure.
• Out-of-sample (OOS) encompassing tests are adequate to predict forecasting accuracy.
• OOS GARCH volatility forecasts outperform OOS implied volatility forecasts.



Article history: Received 18 November 2014; available online 12 January 2015

Keywords: Implied volatility; GARCH forecasted volatility; Inefficiency; Out-of-sample forecasting accuracy

Abstract

This paper examines the accuracy of implied volatility and GARCH forecasted volatility in predicting the behavior of realized volatility. The methodology adopted addresses the information content, the bias, the efficiency and the forecasting efficiency of the predictor. In previous studies on this topic, efficiency has been analyzed both in terms of the efficiency of the predictor itself and its forecasting efficiency. In this context, implied volatility is the predictor and its efficiency is assessed through the validation of some of the OLS (Ordinary Least Squares) assumptions. However, those studies paid little attention to the heteroskedasticity of the residuals, even though this is an important source of inefficiency. Our study accounts for conditional heteroskedasticity by using a GARCH model to predict the time-dependent variance of the residuals. A GARCH forecasted volatility index was constructed based on these estimates. In addition, we employ out-of-sample forecasting accuracy tests in order to identify the best forecasting model. The results clearly show that GARCH forecasted volatility outperforms implied volatility in producing out-of-sample forecasts based on a subsample of the total sampling period for the four stock markets analyzed. © 2015 Elsevier B.V. All rights reserved.

1. Introduction

Due to the frequent turbulence and complexity of stock markets, volatility has become critical to Financial Theory. Understanding price oscillations and anticipating their movements are, therefore, recurring research topics in Finance. Implied volatility (IV) is a derived quantity in this field and is quite popular in explaining price oscillations. Originally computed from the Black–Scholes pricing model, IV is generally understood as the market's expectation of future return volatility of the underlying asset. The rationale behind this is that if option markets are efficient then market implied

E-mail address: [email protected] http://dx.doi.org/10.1016/j.physa.2015.01.020 0378-4371/© 2015 Elsevier B.V. All rights reserved.


S.R. Bentes / Physica A 424 (2015) 105–112

volatility should be an efficient forecast of future volatility since it incorporates the information contained in all the variables in the market information set [1]. The VIX, the implied volatility index of the S&P 500, is widely regarded as the "investor fear gauge" because it is a barometer of investor sentiment, and it also has an economic meaning: it is the price of a linear portfolio of options. In fact, the square of the VIX is the variance swap rate up to corrections that arise because (i) S&P 500 options exist only at a finite number of strikes, and (ii) the underlying asset may exhibit occasional breaks. In other words, the square of the VIX is approximately equal to the risk-neutral expectation of the annualized return variance over the next 30 days [2]. Although some studies examine the predictive power of IV, no clear-cut answer has emerged as the evidence has so far been mixed. Thus, while some authors pointed to the apparent superiority of IV over the standard deviation in predicting future volatility, others concluded that IV is not a good predictor of realized volatility (RV). The first stream of research includes Latane and Rendleman [3], Chiras and Manaster [4] and Beckers [5], who found that implieds outperform historical volatilities. Along the same line, similar conclusions were reached by Fleming et al. [6] for future market indices, Christensen and Prabhala [1] for stocks and Giot [7] for agricultural commodities. Subsequently, Szakmary et al. [8], using data from 35 futures options markets, concluded that the implieds outperform the historical volatilities as a predictor of RV for a large majority of the commodities studied. In addition, they also demonstrated that GARCH forecasts were not superior to those of IVs. However, this finding was not supported by Agnolucci [9], who studied the predictive power of IV and GARCH models for crude oil futures.
According to his results, although IV does not perform better than GARCH models, it should not be overlooked since IV forecasts contain some information that is not found when using the GARCH-type models. In contrast with this stream of research, some authors, e.g. Day and Lewis [10] and Lamoureux and Lastrapes [11], found that implied volatility is biased and inefficient since past volatilities contain predictive information about future volatility beyond that provided by the implied. This is in accordance with Kumar and Shastri [12] and Randolph et al. [13], who concluded that IV has little power to predict RV. Furthermore, Canina and Figlewski [14] showed that there is no relation at all between implied and realized volatility. A number of reasons have been advanced for the unfavorable results of IV [9]: (i) sample selection bias due to the difficulty in observing IV during periods of high turbulence where market liquidity becomes a problem for investors [15]; (ii) sample bias, i.e., IV takes into account the presence of low probability events, which are not observed in the sample; (iii) bid–ask spreads; (iv) specification of the model due to the use of the Black–Scholes formula to obtain the IV of American options; and, finally, (v) the possible stochastic nature of volatility. Moreover, critics of IV further argue that it is not a good predictor of future realized volatility because market prices are determined by several other factors, such as market liquidity, which are not considered in the Black–Scholes model.

Given this controversy, further research is warranted. The purpose of this paper is therefore to present new evidence on the predictive power of IV, compare it with the GARCH derived volatility estimates and investigate which is the better predictor of ex-post volatility.
Our research is also motivated by another shortcoming in the literature: studies on implied volatility focus mainly on developed economies, and only very few address emerging countries. In fact, there are a number of studies on the IV of futures, individual assets, stock market indices, oil and some other commodities traded in developed economies like the US and the EU [8], [16–18], while those on the IVs of emerging markets are very scarce [19–21]. This may be due to the fact that data on IVs have only recently become available for these countries (see, for example, India and Korea). Moreover, the relation between IV and RV has typically been addressed by a classical static regression model, which can be useful but fails to capture dynamic and nonlinear relations. Thus, our paper contributes to the literature since it: (i) updates earlier research on IV; (ii) focuses on emerging markets; (iii) applies an alternative approach based on GARCH forecasted volatility (GV) and (iv) evaluates the forecasting performance of IV and GV to assess the information content of implieds and GARCH forecasts in explaining realized volatilities, which to the best of our knowledge has not yet been done.

Our results show that GARCH forecasted volatility outperforms implied volatility in forecasting out-of-sample realized volatility. However, implied volatility contains information that should not be neglected when trying to understand the distribution properties of realized volatility.

The remainder of the paper is organized as follows: Section 2 presents the methodological background. In Section 3, we describe the data and the sampling procedure. Section 4 discusses the empirical results obtained in our study. Finally, Section 5 presents the main conclusions.

2. Methodological background

Following the traditional approach, the information content of implied volatility is typically assessed by an OLS regression of the form:

$$RV_t = \alpha_0 + \alpha_i IV_t + \varepsilon_t, \tag{1}$$


where $RV_t$ denotes the realized volatility for period $t$, $IV_t$ represents the implied volatility at the beginning of period $t$, $\alpha_0$ and $\alpha_i$ are parameters estimated by OLS and $\varepsilon_t$ is random noise. According to Christensen and Prabhala [1], the following hypotheses can be tested from model (1):

(H1) If IV contains at least some information about future realized volatility, $\alpha_i$ should be nonzero;
(H2) If IV is an unbiased estimate of realized volatility, then $\alpha_0 = 0$ and $\alpha_i = 1$; finally,
(H3) If implied volatility is efficient, the residuals $\varepsilon_t$ should be white noise, non-autocorrelated, and uncorrelated with any explanatory variable.

In order to compare the efficiency of implied volatility with that of past realized volatility, the next step is to run a multiple regression of the form:

$$RV_t = \alpha_0 + \alpha_i IV_t + \alpha_h RV_{t-1} + \varepsilon_t, \tag{2}$$


where RV t −1 denotes the realized volatility at time t − 1 and αh is the corresponding parameter estimated by OLS. Consequently, a fourth hypothesis can be tested, namely: (H4) If αi IV t is an efficient forecast, αh should be non-statistically significant and the values of the R2 and the information criteria of Eq. (2) should not be markedly different from those of Eq. (1). In other words, past realized volatility should not have a significant impact on current realized volatility. The former literature on the relationship between implied and realized volatility relied mainly on regressions of the form (1) and (2), which favor contemporaneous relationships between variables. However, those studies paid little attention to the problem of time-varying error variance, i.e., to conditional error variance or conditional heteroskedasticity. This is particularly relevant as it is well known in the literature that the OLS estimator is also inefficient in the presence of residual heteroskedasticity [21]. In order to cope with this problem, we compute the GARCH forecasted volatility and compare it with IV s. The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model where ut = zt σt and zt is i.i.d., with zero mean and unit variance, was derived by Bollerslev [22] and can be written as:

$$\sigma_t^2 = \omega + \sum_{j=1}^{q} \lambda_j u_{t-j}^2 + \sum_{i=1}^{p} \theta_i \sigma_{t-i}^2, \tag{3}$$

where $\omega > 0$, and $\lambda_j$ and $\theta_i$ are the coefficients of $u_{t-j}^2$ and $\sigma_{t-i}^2$, respectively, with lags $j = 1, \ldots, q$ and $i = 1, \ldots, p$. For stability and covariance stationarity of the noise term $u_t$, all the roots of $1 - \sum_{j=1}^{q} \lambda_j - \sum_{i=1}^{p} \theta_i$ and $1 - \sum_{i=1}^{p} \theta_i$ are constrained to lie outside the unit circle. We shall use Eq. (3) to obtain the daily forecasts of the conditional variance $\sigma_t^2$.

Finally, we test the forecasting accuracy of IV and GARCH forecasted volatility to predict RV using an out-of-sample encompassing test suggested by Harvey et al. [23] (HLN). The null hypothesis in this test states that estimator x outperforms estimator y or vice-versa, where x and y are the IV and the GARCH estimated volatility, respectively.

3. Data and sampling procedure

The data consist of the monthly closing prices of the implieds and their respective underlying daily indices from four stock markets, namely, Hong-Kong, India, Korea and the US. India and Korea are emerging markets, while the others are developed markets. Thus, the implied volatility indices encompass the VHSI (Hong-Kong), INVIXN (India), KIX (Korea) and VIX (US). The underlying indices comprise the Hang-Seng (Hong-Kong), S&P CNX NIFTY (India), KOSPI (Korea) and SPX (US), respectively. The data were retrieved from the Bloomberg database and cover the period from October 2003 to July 2012, totaling 106 time observations per series. We calculate the returns of the underlying indices as the first difference of the natural logarithm of the closing prices. In addition, as volatility is a latent variable, a proxy needs to be computed so that comparative analyses may be conducted [9]. Following the common practice in the literature (e.g., [1,2]), the 30-day realized volatility is used as a proxy:

$$RV = 100 \times \sqrt{\frac{260}{22} \sum_{t=1}^{22} \left[ \ln\left( \frac{P_t}{P_{t-1}} \right) \right]^2}. \tag{4}$$

In our study, we used non-overlapping observations to determine the 30-day realized volatility. Likewise, the daily GARCH forecasted volatility was obtained by estimating the model $\sigma_t^2 = \omega + \sum_{j=1}^{q} \lambda_j u_{t-j}^2 + \sum_{i=1}^{p} \theta_i \sigma_{t-i}^2$ and from these estimates the monthly time series was constructed by¹:

$$GV = 100 \times \sqrt{\frac{260}{22} \sum_{t=1}^{22} \sigma_t^2}. \tag{5}$$

Then, the three monthly series RV, IV and GV are used to test H1 –H4 and also the out-of-sample forecasting accuracy of each model.
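The construction of the monthly RV and GV series from daily inputs can be sketched as follows. This is an illustrative reading of the two aggregation formulas, assuming the annualized square-root form with 260 trading days per year and 22 per month; the function names and the toy inputs are hypothetical and do not come from the paper.

```python
import numpy as np

DAYS_PER_YEAR = 260   # annualization factor used in the text
DAYS_PER_MONTH = 22   # daily observations per monthly window

def realized_vol(prices):
    """Monthly realized volatility (in %) from 23 closes, i.e. 22 log returns."""
    r = np.diff(np.log(np.asarray(prices, dtype=float)))
    if r.size != DAYS_PER_MONTH:
        raise ValueError("expected 23 closing prices (22 daily returns)")
    return 100.0 * np.sqrt(DAYS_PER_YEAR / DAYS_PER_MONTH * np.sum(r ** 2))

def garch_vol(sigma2):
    """Monthly GARCH volatility (in %) from 22 daily conditional variances."""
    s2 = np.asarray(sigma2, dtype=float)
    if s2.size != DAYS_PER_MONTH:
        raise ValueError("expected 22 daily conditional variances")
    return 100.0 * np.sqrt(DAYS_PER_YEAR / DAYS_PER_MONTH * np.sum(s2))

# Toy check: a constant 1% daily log return and a constant daily variance
# of 1e-4 describe the same volatility level, so RV and GV should coincide.
prices = 100.0 * np.exp(0.01 * np.arange(23))
rv = realized_vol(prices)
gv = garch_vol(np.full(22, 1e-4))
```

With these toy inputs both measures equal 100 × sqrt(260 × 1e-4), roughly 16.1%, which shows that the two series are expressed on the same annualized percentage scale and can be compared directly in a regression.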

1 Recall that the monthly implied volatility was directly obtained from the Bloomberg database without further transformations.



Table 1 Realized and implied volatility: OLS estimates for Eqs. (1) and (2). Country

Hong-Kong India Korea US

Eq. (1): RV t = α0 + αi IV t + εt

Eq. (2): RV t = α0 + αi IV t + αh RV t −1 + εt



χ 2 (1)






χ 2 (1)



−2.308 −0.340

0.970** 0.890** 0.798** 0.969**

0.163 0.678 5.427* 0.138

0.622** 0.453** 0.450** 0.568**

7.286 7.650 7.139 7.095

−4.004 −1.457

1.243** 1.044** 0.374 0.431*

−0.235 −0.129

1.153 0.020 8.808** 8.811**

0.628** 0.457** 0.475** 0.608**

7.324 7.737 7.147 7.049



4.231 0.287

0.390* 0.462**

Notes: N = 106. * Statistically significant at the 5% level. ** Statistically significant at the 1% level.

Table 2
OLS implied volatility residual diagnostics for Eqs. (1) and (2).

Eq. (1): RV_t = α0 + αi IV_t + εt
Country      DW      χ²(2)      ρ-coeff      ADF        J–B
Hong-Kong    1.889   0.746      −3.06E−16    −9.646**   2050.4**
India        2.012   0.640      6.71E−16     −7.397**   165.0**
Korea        1.464   11.377**   −4.22E−15    −7.745**   1360.4**
US           1.197   20.246**   4.42E−15     −6.751**   1036.3**

Eq. (2): RV_t = α0 + αi IV_t + αh RV_{t−1} + εt
Country      DW      χ²(2)      ρ-coeff      ADF        J–B
Hong-Kong    1.703   7.623*     −1.81E−16    −8.784**   1848.4**
India        1.734   0.505      −1.87E−17    −7.722**   162.9**
Korea        1.699   7.274*     6.09E−17     −8.730**   1235.7**
US           1.650   8.699*     1.64E−15     −8.504**   676.2**

Notes: * Statistically significant at the 5% level. ** Statistically significant at the 1% level.

4. Results

This section contains three subsections in which we present the estimation results from Eqs. (1) to (3) as well as the out-of-sample analysis of forecasting accuracy. The first subsection discusses the implied volatility results; the second presents the GARCH forecasted volatility results and the third displays the results of testing for out-of-sample forecasting accuracy using the HLN encompassing procedure.

4.1. Implied volatility

Table 1 presents the estimation results from Eqs. (1) and (2) as well as the tests for the relative predictive power of IV in forecasting RV. For each of the regressions and four countries, we report the estimated coefficients (α0, αi and αh), the corresponding significance level (1% or 5%), the R² coefficient of determination and the Schwarz Information Criterion (SIC) value. For the null of αi = 1 we also report the χ²(1) statistics and the corresponding significance level. Our results show that the implieds contain at least some information on future realized volatility (H1) since αi ≠ 0 at the 1% level or better for all countries (Eq. (1)). Further, the estimated coefficients are all positive, ranging from 0.798 (Korea) to 0.970 (Hong-Kong). In addition, for all countries considered in this analysis the intercept α0 is not significantly different from zero. On the other hand, the χ²(1) test rejects the null of αi = 1 only for Korea (the country with the lowest αi in the sample). This means that for Korea IV is a biased estimate of RV (H2), but this is not the case for the remaining countries. Thus, H1 is confirmed for all countries and H2 for all countries except Korea. Table 2 shows the residual diagnostics of Eqs. (1) and (2), which include the DW statistics and the Breusch–Godfrey χ²(2) LM test for residual autocorrelation.
It also includes the ρ -coefficient that measures the extent of correlation between the residuals and the predictor variable IV, the unit root ADF test for nonstationarity and the Jarque–Bera (J–B) test to assess whether the residuals are Gaussian. The evaluation of H3 implied volatility efficiency is based on these tests. The DW statistics in Eq. (1) is approximately equal to 2 for India and, to some extent, for Hong-Kong. In contrast, for Korea and the US, these statistics are much lower than 2. In order to confirm these results, we employ the χ 2 (2) LM test, which provides more accurate insights on residual’s autocorrelation. The null hypothesis of no residual autocorrelation is not rejected for Hong-Kong and India but it is rejected at the 1% level for Korea and the US. Therefore, the DW and χ 2 (2) LM tests point to similar conclusions. There is no evidence of correlation of εt with IV, given that all the ρ -coefficients obtained are not statistically different from zero. The other two tests presented in Table 2 indicate that, in all cases, the residuals are stationary (ADF) but non-Gaussian (J–B). There is mixed evidence of the efficiency of IV as a predictor of RV. On one hand, Hong-Kong and India satisfy the requirement for no residual’s autocorrelation, no correlation with the predictor and stationarity. These properties are vital under the OLS assumptions (zero mean, no residual’s autocorrelation and exogenous predictors). Gaussian residuals were not found in any country analyzed. On the other hand, Korea and the US fail in terms of the no residual’s autocorrelation property. Thus, although Hong-Kong and India satisfy most of the assumptions of H3 , there is no evidence that the residuals are white noise, or at least Gaussian white noise. For Korea and the US, there is no evidence at all that IV is efficient, given the presence of autocorrelation in the residuals of Eq. (1).
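The kind of checks reported for Eq. (1) can be illustrated with a short sketch: an OLS fit of RV on IV, a Wald-type χ²(1) statistic for the null that the slope equals one (H2), and the Durbin–Watson statistic for first-order residual autocorrelation (part of H3). The data below are synthetic and the function name `ols_eq1` is hypothetical; the paper does not state which software or exact test variants were used, so this is only one plausible way to compute such quantities.

```python
import numpy as np

def ols_eq1(rv, iv):
    """OLS of rv on a constant and iv; returns (coefficients,
    Wald chi2(1) statistic for H0: slope = 1, Durbin-Watson statistic)."""
    rv = np.asarray(rv, dtype=float)
    iv = np.asarray(iv, dtype=float)
    X = np.column_stack([np.ones_like(iv), iv])
    beta, *_ = np.linalg.lstsq(X, rv, rcond=None)
    resid = rv - X @ beta
    n, k = X.shape
    s2 = resid @ resid / (n - k)               # residual variance estimate
    cov = s2 * np.linalg.inv(X.T @ X)          # classical OLS covariance matrix
    wald = (beta[1] - 1.0) ** 2 / cov[1, 1]    # compare with chi2(1) critical values
    dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
    return beta, wald, dw

# Synthetic monthly series (106 observations), for illustration only.
rng = np.random.default_rng(1)
iv = 20.0 + 5.0 * rng.standard_normal(106)
rv = iv + 3.0 * rng.standard_normal(106)       # true intercept 0, true slope 1
beta, wald, dw = ols_eq1(rv, iv)

# chi2(1) critical values: 3.841 at the 5% level, 6.635 at the 1% level.
reject_unit_slope_5pct = wald > 3.841
```

A Durbin–Watson value close to 2 suggests no first-order autocorrelation, while values well below 2 point to positive autocorrelation, which is how the DW figures for Korea and the US are read in the text.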



In order to test the fourth hypothesis (H4 ) that IV is a better predictor of RV than past realized volatility RV t −1 (or, similarly, that αi IV t is an efficient forecast) we investigate whether αh in Eq. (2) is non-statistically significant. We also compare the estimates of R2 and SIC of regressions (1) and (2) in order to attest whether they are markedly different or not. As shown in Table 1, the null hypothesis of αh = 0 is not rejected for Hong-Kong and India, thus revealing that αh is not statistically different from zero for these countries. For these markets, our findings are consistent with the first proposition of H4 . However, this is not the case of Korea and the US since the corresponding estimates of αh are significantly different from zero at the 5% level or better. Regarding the values of R2 and SIC in Eqs. (1) and (2), we observe that those of Eq. (2) are actually not markedly different from those of Eq. (1) in absolute terms as they are all smaller than 0.1. In percentage terms, the changes in the SIC value are negligible. This is also the case for the R2 coefficient for Hong-Kong and India. However, for Korea and the US they lie around 6%–7%. Thus, our results appear to show that while H4 holds for Hong-Kong and India, for the remainder countries the rejection of the null of αh = 0 reveals that past realized volatility is significant in predicting RV and H4 no longer holds. To sum up, H1 is confirmed in all the four countries analyzed. H2 is confirmed in all countries except Korea (but with only a 5% rejection level for Korea). H3 is partially satisfied for Hong-Kong and India, but not for Korea and the US given the presence of residual’s autocorrelation. Although we did not find Gaussian white noise residuals in any case, they may follow another distribution with similar properties (zero mean and finite variance). Finally, H4 is also confirmed for Hong-Kong and India, but not for Korea and the US. 
Therefore, implied volatility for Hong-Kong and India appears to be unbiased and to some extent efficient for the underlying realized volatility. However, for Korea and the US, implied volatility is mostly unbiased but not efficient to predict realized volatility. Although relevant, the results obtained from estimation of Eqs. (1) and (2) do not account for another source of inefficiency. As noted above, the OLS estimator is inefficient in the presence of residual heteroskedasticity. In particular, conditional heteroskedasticity may be used to produce a time-dependent volatility index using a GARCH-type model, and its forecasting accuracy can be compared with that of the implied volatility. Note that the GARCH forecasted volatility is free of heteroskedasticity problems.

4.2. GARCH forecasted volatility

This subsection discusses the results obtained from regressing the realized volatility (RV) on the GARCH forecasted volatility (GV). The monthly GARCH forecasted volatility was computed using Eq. (3). The conditional daily variance inserted in Eq. (5) was obtained from the forecasts of the GARCH model (Eq. (3)). Table 3 presents the variance parameter estimates of model (3) (ω, λ and θ) as well as some relevant test statistics for all countries. These include a test for the adequacy of the t-distribution, the Log-Likelihood and SIC values, the DW statistics and a χ²(1) test (ARCH-Engle test) for conditional heteroskedasticity where the null postulates that there is no residual conditional heteroskedasticity after the model is estimated. Overall, the results shown in Table 3 are consistent with the need for controlling residual time-dependent heteroskedasticity in order to obtain efficient estimates of Eqs. (1) and (2). That is, in addition to H3 and H4, conditional heteroskedasticity is also important to assess efficiency. To the best of our knowledge this was not duly accounted for in previous work on this subject.
As can be seen, most of the estimated parameters (ω, λ and θ ) are significantly different from zero at the 1% level (only one is significant at 5%). The t-DOF test indicates that the t-distribution describes the behavior of the residuals better than the Gaussian distribution. This is in line with our previous findings that the residuals were not Gaussian using a J–B test for Normality. The DW statistics indicates that after controlling for conditional heteroskedasticity, the residuals are no longer autocorrelated. Finally, the χ 2 (1) ARCH test does not reject the null of no conditional heteroskedasticity in the residuals for any of the four countries analyzed. In addition to the test statistics presented in Table 3, we also performed a Wald test for the null that λ + θ = 1. The null was only rejected for India and at the 5% level (χ 2 (1) = 4.178 with p-value = 0.041). This seems to indicate that the conditional variance, and thus the volatility, is persistent (or nonstationary) in the remaining cases. However, a unit root ADF test performed on the GARCH residuals rejects the null of nonstationarity at the 1% level or better in all cases. Therefore, we used the forecasts produced by Eq. (3) in order to obtain the monthly GV index. Tables 4 and 5 present the estimates and diagnostic tests for Eqs. (1) and (2) where the predictor variable is now the GARCH forecasted volatility. That is, IV was replaced by GV in order to obtain the estimates reported in these tables. The goal in this subsection is to assess whether H1 to H4 hold when the implied volatility (IV ) predictor is replaced by the GARCH forecasted volatility (GV ). For H1 , we note that αg′ is significantly different from zero at the 1% level or better in all cases. However, for India and Korea we find that α0′ ̸= 0 and αg′ ̸= 1 (χ 2 (1) test). Therefore, H2 holds for Hong-Kong and the US but not for India and Korea. 
That is, GV is only an unbiased estimate of RV in the most developed markets analyzed here. However, the overall performance of the models has greatly improved when GV is used as the predictor, as can be seen by the R2 coefficients reported in Table 4 compared with those in Table 1. Regarding H3 , the test statistics reported in Table 5 for Eq. (1) indicate that the absence of residual’s autocorrelation (χ 2 (2) test) only occurs for Hong-Kong but no longer for India (contrary to the results reported in Table 2). The correlation



Table 3
GARCH estimates: Eq. (3).

Eq. (3): σ²_t = ω + Σ_j λ_j u²_{t−j} + Σ_i θ_i σ²_{t−i}
Country      ω             λ         θ         t-DOF     Log-likelihood   SIC       DW      χ²(1)
Hong-Kong    9.74E−07*     0.062**   0.936**   6.763**   6833.8           −5.895    2.071   2.408
India        5.19E−06**    0.110**   0.873**   6.663**   6607.7           −5.699    1.906   0.806
Korea        2.97E−06**    0.076**   0.912**   6.841**   6755.6           −5.827    1.973   2.414
US           1.14E−06**    0.087**   0.908**   5.640**   7483.6           −6.457    2.230   2.359

Notes: N = 2312; method: ML – ARCH (Marquardt) – Student's t distribution. * Statistically significant at the 5% level. ** Statistically significant at the 1% level.

Table 4 Realized and GARCH forecasted volatility: OLS estimates for Eqs. (1) and (2). Country

Eq. (1): RV t = α0′ + αg′ GVt + εt

αo′ Hong-Kong India Korea US

−0.862 −4.629** −4.027** −1.096


χ 2 (1) **

0.997 1.126** 1.137** 1.003**

0.004 5.121* 9.029** 0.006

Eq. (2): RV t = α0′ + αg′ GVt + αh′ RV t −1 + εt R2


SIC **

0.811 0.883** 0.856** 0.893**

6.590 6.105 5.799 5.700

αg′ **

−3.222 −4.589** −6.027** −1.485**

αh′ **

χ 2 (1) **

−1.108 −0.495** −0.665** −0.580**

2.155 1.601** 1.863** 1.568**

R2 **

SIC **

298.55 145.66** 223.92** 93.85**

0.957 0.968** 0.952** 0.949**

5.169 4.890 4.758 5.016

Notes: N = 106. * Statistically significant at the 5% level. ** Statistically significant at the 1% level.

Table 5
OLS GARCH forecasted volatility residual diagnostics for Eqs. (1) and (2).

Eq. (1): RV_t = α0′ + αg′ GV_t + εt
Country      DW      χ²(2)      ρ-coeff      ADF        J–B
Hong-Kong    2.273   3.857      −9.74E−15    −11.89**   937.71**
India        2.778   14.363**   −5.30E−15    −5.86**    9.89**
Korea        2.185   16.747**   −5.33E−15    −10.99**   408.35**
US           1.844   7.694*     4.63E−15     −8.39**    701.62**

Eq. (2): RV_t = α0′ + αg′ GV_t + αh′ RV_{t−1} + εt
Country      DW      χ²(2)      ρ-coeff      ADF        J–B
Hong-Kong    2.174   2.305      −4.41E−14    −11.11**   70.93**
India        2.673   7.731*     1.71E−14     −10.98**   1.92
Korea        2.668   21.733**   −1.79E−14    −10.94**   68.56**
US           2.228   2.374      −2.36E−16    −7.68**    1129.5**

Notes: * Statistically significant at the 5% level. ** Statistically significant at the 1% level.

Table 6
HLN out-of-sample test statistics.

             IV vs. GV              GV vs. IV
Country      F-std.      HLN        F-std.     HLN
Hong-Kong    221.88      8.41**     16.66      2.70
India        93.18**     27.34**    2.43       3.04
Korea        274.99**    13.56**    10.27**    1.19
US           456.76**    14.17**    5.59*      1.84

Notes: H0: "x outperforms y". * Statistically significant at the 5% level. ** Statistically significant at the 1% level.

coefficient (ρ) between the residuals and the predictor GV is not significantly different from zero in any case. The unit root ADF tests indicate that the residuals are stationary in all cases. Finally, the rejection of the null in the J–B test for Normality indicates that the residuals are not Gaussian white noise. With respect to H4, the null that αh′ = 0 is rejected at the 1% level or better in all cases. Consequently, there is a marked difference between the R² and SIC values of Eq. (2) and those of Eq. (1). These results seem to indicate that GV is neither efficient nor an efficient forecast of RV. However, these conclusions are drawn from a battery of tests that focuses separately on partial features of the residual distribution function. These conclusions may be misleading and a more accurate test for the forecast ability of each predictor may be required. This will be seen in the next subsection.

4.3. Out-of-sample forecasting accuracy

The out-of-sample forecasting will be achieved by setting the estimation sampling period to end in July 2010. This leaves a 24-month period outside of the estimation sample for the purpose of comparing the forecasting capability of the models.
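The holdout design described here can be sketched as follows, assuming 106 monthly observations of which the last 24 are reserved for evaluation (leaving 82 for estimation) and a simple one-regressor specification as in Eq. (1). The function name `static_forecasts` and the synthetic series are hypothetical; the sketch only illustrates static one-step-ahead forecasting with coefficients fixed at their estimation-sample values.

```python
import numpy as np

def static_forecasts(y, x, n_holdout=24):
    """Fit y = a + b*x on all but the last n_holdout observations, then
    forecast the holdout period using the observed x in each period."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    split = y.size - n_holdout                  # 106 - 24 = 82 estimation months
    X = np.column_stack([np.ones(y.size), x])
    beta, *_ = np.linalg.lstsq(X[:split], y[:split], rcond=None)
    forecast = X[split:] @ beta                 # coefficients are not re-estimated
    errors = y[split:] - forecast               # out-of-sample forecast errors
    return forecast, errors

# Synthetic monthly series, for illustration only.
rng = np.random.default_rng(2)
gv = 20.0 + 5.0 * rng.standard_normal(106)
rv = 1.0 + 0.9 * gv + 2.0 * rng.standard_normal(106)
forecast, errors = static_forecasts(rv, gv)
```

The resulting error series, one per competing predictor, are the inputs that an out-of-sample accuracy test would compare.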



The Diebold–Mariano (DM) test statistic [24] has been widely used by researchers to assess the forecasting ability of competing models. However, there are some drawbacks to this test (notably, oversizing for small samples and longer horizon forecasts, inter alia) that recommend the use of an alternative method. To accommodate situations where more than one competing forecast is compared we use the forecast encompassing approach in our analysis, as proposed by Harvey et al. [23] (HLN) and later enhanced by Harvey and Newbold [25]. Table 6 displays the results of the HLN test statistics for each country. The results in the first two columns refer to the null that IV outperforms GV and, conversely, for the results shown in the last two columns. Rejection of the null indicates that the forecast errors are similar or that GV (IV) may outperform IV (GV) in forecasting the actual realized volatility. The reverse test that GV outperforms IV helps draw a final and unambiguous conclusion. Note that in our empirical work we use one-step ahead static forecasts; these are usually more accurate than dynamic forecasts since, for each step (period), the actual value of the realized volatility in the previous period is used to produce the forecast of the realized volatility in the current period. In Table 6 we report the F-standard statistics along with the HLN statistics for comparative purposes. However, the final conclusions should only be drawn from the HLN test statistic results. The results indicate that the null is rejected in all cases when we postulate that IV outperforms GV, and is not rejected in any case when the null states that GV outperforms IV. The conclusion is therefore that GARCH forecasted volatility produces better out-of-sample forecasts of the realized volatility than implied volatility.
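A forecast encompassing statistic of the HLN type can be sketched as below for one-step-ahead forecasts. This follows the commonly cited form of the test, in which the loss differential is d_t = e1_t (e1_t − e2_t) and a Diebold–Mariano-type ratio is scaled by a small-sample correction; whether the paper used exactly this variant is an assumption, and the function name `hln_encompassing` and the synthetic error series are hypothetical.

```python
import math
import numpy as np

def hln_encompassing(e1, e2):
    """Encompassing statistic for H0: forecast 1 encompasses forecast 2,
    given one-step-ahead forecast errors e1 and e2 (horizon h = 1)."""
    e1 = np.asarray(e1, dtype=float)
    e2 = np.asarray(e2, dtype=float)
    d = e1 * (e1 - e2)                      # encompassing loss differential
    n = d.size
    d_bar = d.mean()
    # For h = 1 only the lag-0 autocovariance enters the variance of d_bar.
    var_d_bar = np.sum((d - d_bar) ** 2) / n / n
    dm = d_bar / math.sqrt(var_d_bar)
    # Small-sample correction for h = 1: sqrt((n - 1) / n).
    return math.sqrt((n - 1) / n) * dm

# Synthetic 24-month error series: forecast 2 is far more accurate than
# forecast 1, so "forecast 1 encompasses forecast 2" should look implausible.
rng = np.random.default_rng(3)
e_poor = 2.0 * rng.standard_normal(24)
e_good = 0.1 * rng.standard_normal(24)
stat = hln_encompassing(e_poor, e_good)
```

The statistic is compared with one-sided Student's t critical values with n − 1 degrees of freedom; a large positive value speaks against the null that the first forecast already contains all the useful information in the second.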
To sum up, our out-of-sample HLN results reveal that implied volatility may be helpful for investors' decision making but it is not yet a clear substitute for the forecasts produced by alternative methods like the GARCH forecasted volatility.

5. Conclusions

This paper analyzes the ability of implied and GARCH forecasted volatility to predict realized volatility in four stock markets. These include three Asian countries and the US as a benchmark. The implied volatility (IV) and the GARCH forecasted volatility (GV) are used separately as predictors in two model specifications. The methodological framework used in this paper allows us to make inferences about four main hypotheses on the relationship between the implied and the realized volatility: (i) implied volatility contains information about future realized volatility; (ii) implied volatility is an unbiased estimate of realized volatility; (iii) implied volatility is efficient; and (iv) implied volatility is a more efficient forecast than the past realized volatility. The same hypotheses also apply to the relationship between the realized and the GARCH forecasted volatility.

Overall, our results show that the first hypothesis holds in all cases for both the implied and GARCH forecasted volatility. The second hypothesis holds for Hong-Kong, India and the US when the predictor is the implied volatility but only holds for Hong-Kong and the US when the predictor is the GARCH forecasted volatility. The third hypothesis holds for Hong-Kong and India when implied volatility is the predictor but the residuals are not Gaussian white noise. However, when the GARCH forecasted volatility is the predictor the third hypothesis only holds for Hong-Kong. Finally, regarding the fourth hypothesis, our results show that implied volatility is more efficient than past realized volatility to forecast current realized volatility in Hong-Kong and India but not in Korea or the US.
However, the conclusions are not applicable to the GARCH forecasted volatility model. This study also reports the results for out-of-sample forecasting accuracy using the HLN encompassing test. Our findings clearly show that GV outperforms IV in forecasting the actual realized volatility (RV ) between 2010 and 2012. To summarize, we found that GARCH forecasted volatility is a better predictor of realized volatility than implied volatility. This conclusion requires the use of a specific test of forecasting accuracy. Preliminary or partial results on model estimation and diagnostic testing make it difficult to draw clear conclusions on this topic. However, they provide important insights on the characteristics of the residual’s distribution. Our main contribution to the existing literature is that we not only use an alternative predictor (GV ), but we also compare its forecasting accuracy with an alternative one. To the best of our knowledge this has not yet been done in the context of IV, RV and GARCH forecasted volatility. References [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]

[1] B.J. Christensen, N.R. Prabhala, The relation between implied and realized volatility, J. Financ. Econ. 50 (1998) 125–150.
[2] D.H. Hsu, B.M. Murray, On the volatility of volatility, Physica A 380 (2007) 366–376.
[3] H.A. Latane, J.R. Rendleman Jr., Standard deviations of stock price ratios implied in option prices, J. Finance 31 (1976) 369–381.
[4] D.P. Chiras, S. Manaster, The information content of option prices and a test of market efficiency, J. Financ. Econ. 6 (1978) 213–234.
[5] S. Beckers, Standard deviation implied in options prices as predictors of future stock price variability, J. Bank. Finance 5 (1981) 363–381.
[6] J. Fleming, B. Ostdiek, R.E. Whaley, Predicting stock market volatility: a new measure, J. Futures Mark. 15 (1995) 265–302.
[7] P. Giot, The information content of implied volatility in agricultural commodity markets, J. Futures Mark. 23 (2003) 441–454.
[8] A. Szakmary, E. Ors, J.K. Kim, W.N. Davidson III, The predictive power of implied volatility: evidence from 35 futures markets, J. Bank. Finance 27 (2003) 2151–2175.
[9] P. Agnolucci, Volatility in crude oil futures: a comparison of the predictive ability of GARCH and implied volatility models, Energy Econ. 31 (2009) 316–321.
[10] T. Day, C. Lewis, Stock market volatility and the information content of stock index options, J. Econometrics 52 (1992) 267–287.
[11] C.G. Lamoureux, W. Lastrapes, Forecasting stock returns variance: towards understanding stochastic implied volatility, Rev. Financ. Stud. 6 (1993) 293–326.
[12] R. Kumar, K. Shastri, The predictive ability of stock prices implied in option premia, Adv. Futures Options Res. 4 (1990) 165–176.


[13] W.L. Randolph, B.L. Rubin, E.M. Cross, The response of implied standard deviations to changing market conditions, Adv. Futures Options Res. 4 (1990) 265–280.
[14] L. Canina, S. Figlewski, The informational content of implied volatility, Rev. Financ. Stud. 6 (1993) 659–681.
[15] R.F. Engle, J. Rosenberg, Testing the volatility term structure using option hedging criteria, J. Deriv. 8 (2000) 10–28.
[16] J.B. Blair, S.-H. Poon, S.J. Taylor, Forecasting S&P 100 volatility: the incremental information content of implied volatilities and high-frequency index returns, J. Econometrics 105 (2001) 5–26.
[17] R. Becker, A.E. Clements, S.I. White, On the informational efficiency of S&P500 implied volatility, N. Am. J. Econ. Finance 17 (2006) 139–153.
[18] E.-T. Chen, A. Clements, S&P 500 implied volatility and monetary policy announcements, Finance Res. Lett. 4 (2006) 227–232.
[19] S.O. Nam, S.Y. Oh, H.K. Kim, B.C. Kim, An empirical analysis of price discovery and pricing bias in KOSPI 200 stock index derivatives markets, Int. Rev. Financ. Anal. 15 (2006) 398–414.
[20] E.B. Vrugt, US and Japanese macroeconomic news and stock market volatility in Asia-Pacific, Pacific-Basin Finance J. 17 (2009) 611–627.
[21] S.R. Bentes, R. Menezes, On the predictability of realized volatility using feasible GLS, J. Asian Econ. 28 (2013) 58–66.
[22] T. Bollerslev, Generalized autoregressive conditional heteroskedasticity, J. Econometrics 31 (1986) 307–327.
[23] D.I. Harvey, S.J. Leybourne, P. Newbold, Tests for forecast encompassing, J. Bus. Econom. Statist. 16 (1998) 254–259.
[24] F.X. Diebold, R.S. Mariano, Comparing predictive accuracy, J. Bus. Econom. Statist. 13 (1995) 253–263.
[25] D.I. Harvey, P. Newbold, Tests for multiple forecast encompassing, J. Appl. Econometrics 15 (2000) 471–482.