Volume 246, number 3,4
PHYSICS LETTERS B
30 August 1990
Statistical analysis of the time dependence of the solar neutrino capture rate B.W. F i l i p p o n e W.K. Kellogg Radiation Laboratory, California Institute of Teehnology, Pasadena, CA 91125, USA and P. Vogel Department of Physics, California Institute of Technology, Pasadena, CA 91125, USA Received 25 May 1990
A maximum likelihoodtechnique is used to analyzethe measured solar neutrino capture rate in order to test various hypotheses regarding its time dependence. A capture rate anticorrelated with sunspot numbers gives a better fit to the data than a constant capture rate, but the goodnessoffitfor the two hypotheses is not significantlydifferent. We also find that the statistical significance of the results are dependent on the confidencelevels assumed for the very low capture rate runs.
The study of solar neutrinos can provide important clues to the physics in the central region of the Sun and possibly to the properties and interaction of neutrinos. The solar neutrino flux has been inferred from the neutrino capture rate in 37C1, measured over the past two decades in 83 separate runs [ 1,2 ]. (Since 1987 an i n d e p e n d e n t measurement of the E v > 8  9 MeV part of the spectrum has been made by KamiokandeII [ 3 ] . ) The 37C1 detector with its threshold of 0.81 MeV is primarily sensitive to the high energy, low flux part of the neutrino spectrum, principally the neutrinos from SB decay. The observed capture rate is, on average, several times smaller than the calculated one [4] (the "solar neutrino puzzle"). This discrepancy has led to many suggestions involving either n o n s t a n d a r d behaviour of the Sun, or effects associated with neutrino mass and mixing (see the recent review by Davis et al. [ 5 ] for a description of the experiment and references). Here we are concerned not with the discrepancy between the average rate and the predicted rate, but with a possible time dependence of the measured rate. This question, by itself, has a long history. Suggestions that the measured solar neutrino flux is anti546
correlated with solar activity, reflected by the n u m ber of sunspots, were made early on, in particular when the 37C1capture rate went through an apparent m i n i m u m around 1980 (coinciding with the time when sunspot Cycle 21 reached its m a x i m u m ) . The interest in this effect has been strengthened by Voloshin, Vysotskii and Okun [6 ], who proposed a physical scenario for this time dependence involving a relatively large static or transition neutrino magnetic m o m e n t precessing in the magnetic field of the solar convective zone. Among the many papers devoted to the phenomenological analysis of the time dependence of the 37C1 capture data, Bahcall, Field and Press [7] showed that conclusions about correlations with sunspot n u m b e r change significantly when one uses either the "upper" errors (upper error bar determining statistical weight) or the "average" errors (average of upper and lower error bars determ i n i n g the statistical weight) of individual runs. In that work, which was based on data up to 1985, the hypothesis of a constant capture rate was marginally rejected for average errors, but was not rejected at all for upper errors. Clearly, to reach an unbiased conclusion one has to choose an appropriate method of statistical analysis.
03702693/90/$ 03.50 © 1990  ElsevierSciencePublishers B.V. ( NorthHolland )
Volume 246, number 3,4
PHYSICS LETTERS B
Below we outline our statistical treatment, which differs from the Z 2 minimization of ref. [ 7 ]. Next we show our results of testing the hypotheses that the 37C1 capture rate is constant in time, or that it is correlated with sunspot number. We also briefly examine other possible periodic functions. Finally, following ref. [ 7 ] we examine what happens when the time sequence of the experimental runs is randomly scrambled. Because the event rate in the 37C1experiment is extremely low in each run, we have chosen the maxim u m likelihood technique to analyze the time dependence. This technique is superior to the conventional Z 2 analysis when the number o f events is small and the uncertainties are far from gaussian. In order to apply this method the probability of obtaining the observed 37Ar production rate for a given hypothesis must be calculated for each run. The likelihood function (5 ° ) is then the product of these probabilities over the entire data set. The unknown parameters are determined by maximizing 50 or more c o m m o n l y In 50. To calculate the probabilities for each run we use the Poisson distribution, where the probability o f observing n events if the expected (or mean ) n u m b e r is /t is given by P~ (~t) =
/z %xp (  / z ) n!
( 1)
To apply this distribution we use the observed production rate and its associated uncertainty to determine an effective number of observed counts n. Our test hypothesis then provides the value of/z. We assume that the observed production rate for run i (r,) is proportional to the effective number of observed counts, while the uncertainty (dry) is proportional to the square root of the effective n u m b e r o f observed counts (n,), as is expected for a Poisson process. In addition we assume that the expected production rate (p) is proportional to the expected number of counts. Thus we have
r,=otini,
dri=oLix~i,
P=o~ilt,.
(2)
This effective number of counts n, will be equal to the observed number of counts in the limit of zero background in the small counters used to identify the 37Ar.37C1 decay. For finite counter background, nt
30 August 1990
will in general be smaller than the number o f counts extracted from the likelihood analysis [ 8 ] (which also determines the counter background). This procedure thus allows the effects of counter background to be approximately included as reflected in the larger fractional error ( 1/V/~i ). We can eliminate the proportionality constant c~ (which depends on the exposure times, counting times, efficiencies, etc. ) by taking ratios, and thereby determine n~ and/~:
rli= (ri/dri) 2, /li=rli(ff/ri) .
(3)
In both o f the above equations we round off ni to the nearest integer in order to assure normalization of the probability. The case of asymmetric error bars (where the lower error limit is equal to zero) must be treated separately. For this case, we set hi= 1 and use the above procedure if (r~/dr u)2>~0.5 where dry is the magnitude of the upper error bar. If (r~/drp)z< 0.5, we set n~=0 and consider the upper error bar as indicating a 68% confidence level upper limit [2], and calculate/z~ from /z~= 1.14 [/~/(r, + d r y ) ] .
(4)
The factor 1.14 corresponds to a 68% confidence level upper limit estimate for the mean of a Poisson process where zero events are observed. We will also investigate later the sensitivity of our results to other assumptions for the confidence level upper limit for these runs. With the probabilities calculated as described above, the likelihood function can be evaluated under different hypotheses for p (i.e., constant production rate, production rate anticorrelated with sunspot number, etc.). The fitting parameters that describe p are determined in the usual manner by minimizing  I n 50. The minimization can be done analytically if p = constant, but requires a numerical minimization with more complicated hypotheses for p. The one standarddeviation uncertainties in the parameters are determined [9] by evaluating contours in the parameter space that decrease In 50 by 0.5. The first hypothesis to test is a constant production rate, p = a = constant. By minimizing  I n Lf over all 83 runs with this hypothesis, we obtain a = 0 . 5 5 + 0.034 atoms/d. This value is shown in fig. 1 along 547
Volume 246, number 3,4
PHYSICS LETTERS B
0.4
probability that a set of r a n d o m data with constant p r o d u c t i o n rate would yield a value larger than 111. This probability is near the conventional level for rejection of a hypothesis (usually ( 1  5 ) % is chosen), but is not definitively there. This same procedure can now be applied to the assumption that the production rate is correlated with solar activity or sunspot number. F o r this hypothesis we assume
0.2
p(t) = a + b { [nss(t)  r~s=]/ri~, },
1.6 1.4 ¸
1.2 1.0 o
30 August 1990
0.8
0.6 o
0.0 70
75
80 Calendar
85
90
Year
Fig. 1. Measured 37Arproduction rate versus time, showing the 83 runs used in the analysis. The dashed line at 0.55 is the best fit to a constant production rate. The full curve shows the best fit correlated with sunspots, eq. (6). with the measured production rates (it is i m p o r t a n t to note that in order to d e t e r m i n e the capture rate in Solar N e u t r i n o Units ( S N U ) from the measured production rate, a nonneutrino induced background of 0.08 a t o m s / d must first be subtracted). The fitted value is somewhat higher than the unweighted average of all the data which yields 0.49 a t o m s / d . We note that the usual Z 2 analysis yields a weighted average which is either 0.40 a t o m s / d for upper errors or 0.28 a t o m s / d for average errors. Having d e t e r m i n e d the parameters o f the best fit, we address the more i m p o r t a n t issue o f the goodnessoffit. In general, this is a difficult quantity to access for the m a x i m u m likelihood technique, unless there are a large n u m b e r of data points (e.g. 83 for the 37C1 data), in which case the likelihood ratio test [ 10] can be applied. F o r this test one determines ;t=s(p)/2~(O)
,
(5)
where 5°(0) is the likelihood function for the hypothesis being tested, and 5P(0 ) is the likelihood function for a hypothesis that exactly describes the data. For our case this hypothesis corresponds to setting p = ri at every data point. In the limit o f a large number o f data points it can be shown [ 10] that  2 × In 2 is distributed like a Z 2 distribution, thus allowing goodnessoffit levels to be determined. For the hypothesis of a constant production rate,  2 In 2 = 111 for 82 degrees for freedom. There is a 1.8% 548
(6)
where a and b are the p a r a m e t e r s to be determined, n==(t) is the observed sunspot n u m b e r at time t (we use the RI mean monthly sunspot numbers [11 ], which range from ~ 0  2 0 0 ) , and rT~sis the average sunspot n u m b e r over the period of interest. Introducing the constant term (rTs==74.0) is useful in eliminating correlations between a and b, thus simplifying the analysis o f the uncertainty in the fitted parameters. The m i n i m i z a t i o n o f  In C~gives for the two parameters: a = 0 . 5 5 + 0 . 0 3 5 a t o m s / d , and b =  0 . 1 6 _+0.05 a t o m s / d . This parameterization is shown in fig. 1 as a solid line. The sign o f b clearly suggests anticorrelation with sunspot number. Evaluation of  2 In 2 gives 102 with 81 degrees o f freedom, corresponding to a probability o f 5.7% that a r a n d o m sample of data resulting from this hypothesis would give a value larger than 102. Thus we see that even this hypothesis has a fairly small probability, but is acceptable at the conventional level (i.e. > 5%). Despite the relatively low probability of the sunspot correlation hypothesis (eq. (6) ), the p a r a m e t e r b is well d e t e r m i n e d ( ~ 3a away from zero). In addition, the fit restricts the variations in the capture rate between solar m i n i m a and m a x i m a to approximately a factor o f two. Thus, even during the quiet Sun, the capture rate of < 3.6 S N U (at the l a level) is only about half of the expected rate [ 4 ]. Regarding the time sequence o f the data, we note that after the first 40 runs, b is d e t e r m i n e d to be nonzero at the 2a level ( b =  0 . 1 7 _ + 0 . 0 8 ) , and that during the last decade b has remained negative and within the present error bars. In order to explore the sensitivity o f our results to the assumption o f the 68% confidence upper limit for the runs with (r,/dr~')2<0.5, we repeated the fits assuming (somewhat arbitrarily) that the upper error bars correspond to only a 50% confidence level upper
Volume 246, n u m b e r 3,4
P H Y S I C S LETTERS B
limit. The fitted parameters a and b and their uncertainties changed only by about 10%. However, the conclusions about the goodnessoffit changed significantly. In this case the two hypotheses give  2 × 1 n 2 = 8 6 (for constant flux) and 81 (for sunspot correlations), indicating that with this assumption both hypotheses have a high probability of describing the actual data. Thus, as already pointed out in ref. [ 7 ], the size of the error bars for the runs with a very small number of 37Ar decays is crucial for determining the time dependence of the solar neutrino flux. As an independent (albeit somewhat biased) test for the significance of the correlation of production rates with solar activity, we have performed fits for an arbitrary harmonic time dependence to determine if the 11 year cycle is the most favored. For this calculation we assume (7)
#( t ) = a + b cos( 27r( t  t o ) / T) ,
where T is the oscillation period (varied between 1 and 15 yr), and to is the phase which was constrained to produce a m i n i m u m near 1980 +_2. This last constraint was implemented to improve convergence. While this hypothesis did produce a fair fit with  2 × l n 2 = 103 for T = 10.7 yr, the best fit was obtained for T = 4.7 yr with  2 In 2 = 97.6. For comparison this fit is shown in fig. 2, along with the 37C1 data. Even this "best fit" has only a 8.3% chance of yielding a larger value than 97.6 for 79 degrees of freedom. The apparent discrepancy between the fit in fig. 1, which has a m a x i m u m in 19851986 and the T = 4 . 7 yr fit
30 August 1990
(fig. 2), which has a m i n i m u m during that time is likely related to the interruption in data taking between 1985.3 and 1986.7. As a check on our overall analysis procedure we have analyzed a set of Monte Carlo pseudodata [ 8 ] where the parameters of the actual experiment are used to generate data sets with a variety of input production rates. These tests showed that the extracted production rates were consistent with those input at the ( 510)% level. These studies also showed that the distribution o f  2 In 2 was in reasonable agreement with a Z 2 distribution as anticipated. We have not attempted to examine the suggested [ 6 ] T = 0.5 yr variation associated with the changing angle between the solar axis and the SunEarth line of sight. This period is on the order o f the typical exposure time of individual runs. Thus the assumption of a constant production rate over the exposure time (used in analyzing the individual runs [ 8 ] ) is not valid, requiring a modification of the basic procedure used in extracting the capture rate from the data. To study the uniqueness of the experimental time sequence we have also examined fits (following ref. [7] ) where the time ordering of runs is randomly scrambled. In fig. 3 we show the frequency histogram of  2 In 2 for 1000 random permutations. For each we minimize  l n ~ ( f i ) with fi determined from eq. (6). As expected, most of the fits are near  2 × In 2 = 111, the value for a constant production rate. Only 0.7% of fits yield a  2 ln2 better than that of
350
i
i
I
1.6
O

1.4
300
1.2
250 
1.0
200
0.8
~150
0.6
i

100 
0.4 0.2 o
0.0 70
75
80 Calendar
85
90
Year
Fig. 2. Measured 37Ar production rate versus time. The curve is the best fit to eq. (7) with T = 4 . 7 yr.
95
160 2
165 lnX
110'
115
Fig. 3. Result of 1000 r a n d o m permutations of the time ordering. The inset shows, in expanded scale, the bestfit permutations; the arrow indicates the true time sequence (see text for details ).
549
Volume 246, number 3,4
PHYSICS LETTERS B
the true time sequence. This result shows qualitatively that the observed pattern is rarely reached accidentally, but its quantitative significance is difficult to assess. In conclusion we have shown that in our unbiased analysis, the hypothesis o f a t i m e  i n d e p e n d e n t 37C1 neutrino capture rate is marginally rejected, having only 2% probability. However, it is disturbing that we are not able to find a simple hypothesis o f time variation that would describe the d a t a well. A capture rate anticorrelated with sunspot number, although more probable than the constant rate hypothesis, has a probability o f only 6%. One possible explanation o f these results is simply the p o o r statistics of the 37C1experiment. If the runs with a small n u m b e r of detected 37Ar decays ( < 1 ) are not reliably d e t e r m i n e d by the usual d a t a analysis procedure (as suggested in ref. [ 8 ] ), the error bars for these runs (or for some fraction o f t h e m ) could be underestimated, causing our goodnessoffit probabilities to be revised upwards. We look forward to an i m p r o v e d statistics d e t e r m i n a t i o n of the solar neutrino flux, either for the 37C1detector [2], or using alternative schemes in o r d e r to decide between these two competing hypotheses. We would like thank Bruce Cleveland for providing the new data and Monte Carlo simulations, and
550
30 August 1990
for discussions o f the statistical analysis used in the 37Cl experiment. We also thank H a r o l d Zirin for discussions regarding the sunspot counts. This work was s u p p o r t e d in part by the National Science F o u n d a tion under Contract No. PHY8817296, and by the US D e p a r t m e n t of Energy DEF60388ER40397. B.W.F. also acknowledges support from the Alfred P. Sloan F o u n d a t i o n .
References [1 ] J.K. Rowley, B.T. Cleveland and R. Davis Jr., in: Solar neutrinos and neutrino astronomy, eds. M.L. Cherry, W.A. Fowler and K. Lande (AIP, New York, 1985). [ 2 ] B.T. Cleveland, private communication. [3] K.S. Hirata et al., Phys. Rev. Lett. 63 (1989) 16. [4] J.N. Bahcall and R.K. Ulrich, Rev. Mod. Phys. 60 (1988) 297. [5] R. Davis Jr., A.K. Mann and L. Wolfenstein, Annu. Rev. Nucl. Part. Sci. 39 (1989) 467. [6] M.B. Voloshin, M.I. Vysotskii and L.B. Okun, Sov. Phys. JETP 64 (1986) 446. [7] J.N. Bahcall, G.B. Field and W.H. Press, Astrophys. J. 320 (1987) L69. [ 8 ] B.T. Cleveland, Nucl. Instrum. Methods 214 ( 1983 ) 451. [9] Particle Data Group, G.P. Yost et al., Review of particle properties, Phys. Lett. B 204 ( 1988 ) 1. [10]W.T. Eadie, D. Dryard, F.E. James, M. Roos and B. Sadoulet, Statistical methods in experimental physics (NorthHolland, Amsterdam, 1971 ). [ 11 ] SolarGeophysical Data, No. 542 (September 1989 ) p. 30; SECF PRF 751 (January 23, 1990) (NOAA, Boulder, CO).