AN INDEPENDENT ASSESSMENT OF SIGNIFICANCE OF ANNUAL MODULATION IN COSINE-100 DATA

We perform an independent search for annual modulation in the recently released COSINE-100 data, which could be induced by dark matter scatterings. We test the hypothesis that the data contains a sinusoidal modulation against the null hypothesis, that the data consists of only background. We compare the significance using frequentist method, information theoretic techniques (such as AIC and BIC), and finally a Bayesian model comparison technique. Both the frequentist and Bayesian techniques reveal no significant differences between the two hypotheses, whereas the null hypothesis is slightly favored according to AIC and BIC-based tests. This is the first proof of principles demonstration of application of Bayesian and information theory based techniques to COSINE-100 data to assess the significance of annual modulation.


INTRODUCTION
Although about 25% of the universe's matter density consists of cold dark matter (Planck Collaboration et al. 2018), we have no clue about the mass of the dark matter particle or its non-gravitational couplings (Jungman et al. 1996). The most theoretically favored and widely studied cold matter candidate is the Weakly Interacting Massive Particle (or WIMP) (Lee and Weinberg 1977). A large number of experiments have been taking data for more than 30 years to look for direct signatures of WIMP-nucleon interactions in underground laboratory-based experiments (Schumann 2019). Among these, only the DAMA/LIBRA experiment has detected an annual modulation, having all the right characteristics of been induced by WIMPs in our galaxy (Freese et al. 1988), with a statistical significance of about 12σ (Bernabei et al. 2018). However, the WIMP parameter space inferred from the DAMA/LIBRA results is ruled out by many other direct detection experiments. Although many attempts (for eg. Herrero-Garcia et al. 2012;Catena et al. 2016;Nobile et al. 2015;Herrero-Garcia et al. 2018;, and references therein) have been made to reconcile the results of DAMA with the null results of other experiments using non-standard particle physics or astrophysics assumptions, the jury is still out on whether any of them can satisfactorily reconcile with the latest results from all the direct detection experiments. The only possible resolution out of this conundrum could be that, no other direct detection experiment with null results used the same target material as DAMA, viz. thallium-doped NaI. The COSINE-100 experiment ) is one of the first experiments, whose detector is designed to be a replica of the DAMA target, and hence can confirm or refute their annual modulation claims in a model-independent fashion. Many other experiments, designed to do a similar test of the DAMA annual modulation such as DM-Ice17 (Barbosa de Souza et al. 2017), KIMS (Kim et al. 2019), SABRE (Antonello et al. 2019), and ANAIS-112 (Amaré et al. 2019) are also about to start taking data and the ANAIS experiment has released preliminary results.
In a recent work (Krishak et al. 2019), we did an independent assessment of the DAMA/LIBRA annual modulation claims from their most recent data release, using three disparate model comparison techniques: frequentist (Desai 2016), Bayesian (Trotta 2017;Kerscher and Weller 2019), and information theoretic techniques (Liddle 2004(Liddle , 2007. The Bayesian and information theoretical techniques are widely used for model comparison in Astrophysics and Cosmology, but rarely used in direct dark matter detection experiments. In this work, we apply the same techniques to the recently released data from the COSINE-100 experiment . The outline of this paper is as follows. A brief summary of the COSINE-100 results can be found in Sect. 2. Our own re-analysis is described in Sect. 3. We conclude in Sect. 4. We do not provide any details of the theory behind the different model comparison techniques used herein, which can be found in Krishak et al. (2019) and references therein. Our analysis codes and results can be found on a github link, whose url is provided in Sect. 4.

RECAP OF COSINE-100 RESULTS
We provide a brief recap of the main results in Adhikari et al. (2019) (CS100 hereafter), wherein more details can be found. The COSINE-100 experiment is located at the Yangyang underground laboratory in South Korea under more than 700 m of rock overburden. The experiment consists of eight NaI crystals (labeled C1 to C8) doped with thallium and was designed to mimic the DAMA/LIBRA setup as closely as possible. Out of these, data from three crystals was omitted due to various systematics, as discussed in CS100. Data taking commenced in October 2016 and the results released in CS100 correspond to a total exposure of 97.7 kg years. The count rates for the five crystals used for the aditi16@iiserb.ac.in shntn05@gmail.com analysis can be found in Fig. 3 of CS100. The event rates were fit to the following functional form: The first two terms in Eq. 1, consisting of the constant and exponential decay are used for parameterizing the background rates and the last cosine term is a potential signature of annual modulation caused by dark matter interactions. The data from all the crystals were simultaneously fit to the same values of the cosine function parameters, but separately for C, p 0 and p 1 using χ 2 minimization. Their results are consistent within 1σ with both the null hypothesis of no oscillation as well as with the DAMA/LIBRA annual modulation best-fit values in the 2-6 keV range. The best fit parameters for different scenarios (phase fixed as well as floating) can found in Table 1 of CS100.

OUR ANALYSIS
For our analysis, we obtained the data points and the errors associated with them from the COSINE-100 collaboration. The data consists of event rates for crystals 2, 3, 4, 6, and 7 in the 2-6 keV energy bin in 15-day intervals. We first fit only the background rates (first two terms in Eq. 1) to the data and determine the best-fit values for C, p 0 and p 1 ; this model is assumed to be our null hypothesis H 0 , i.e., We then determine the estimates of the best-fit parameters of the sinusoidal modulation in Eq. 1, and this is considered as the hypothesis to be tested, viz. H 1 . These two models are compared using frequentist, information theory (AIC and BIC), and Bayesian model comparison techniques. More details about these techniques have been recently reviewed in Krishak et al. (2019) and references therein, and we skip these details for brevity.
3.1. Parameter Estimation Parameter estimation for the models under consideration is the first step towards model comparison analysis. The data points consist of experimental errors in the event rates (σ i ). For the model with only the background signal, we find the best-fit values of the parameters using χ 2 minimization for each crystal separately. The χ 2 functional between the data (y i ) and the model function (H(t)) is given by: where y i denotes the COSINE-100 event rate in time bin i for each crystal, and H(t) is the model is defined in Eq. 2. All the background parameters are kept free, with a positive constraint (lower bound of 10 −10 ) on all of them; and the best-fit values obtained for each crystal by χ 2 minimization are summarized in Table 1.
For the model with a sinusoidal modulation (where H(t) is defined in Eq. 1)), the χ 2 minimization is done concurrently for all the crystals by using the same values of A, ω, and t 0 for all the crystals, while the background parameters can be different for each crystal. We first do a χ 2 minimization keeping all the 18 parameters free. In this case, since the time bin width we have used is equal to 15 days, we are not sensitive to periods less than 15 days. Therefore, while doing the fits, we have also used a lower bound on the period, equal to 15 days. The optimization is done using the SLSQP (Nocedal and Wright 2006) constrained optimization algorithm as implemented in scipy Python module, keeping a positive lower bound on all background parameters, as well as on the amplitude and frequency. The best fits obtained are listed in Table 2. These fits along with the data can be found in Fig. 1. The best-fit value for ω is about 0.024 radian/day (corresponding to a period of about 257 days).
For testing the DAMA annual modulation claim, we also carry out optimization of this model by keeping the modulation frequency fixed at the DAMA obtained value of 0.0172 radians/day (or period fixed at 365.25 days). We then redo the χ 2 minimization with 17 free parameters (which is one less than before), with lower bound on all background parameters and on the amplitude. The best-fit for this optimization can be found in Table 3.
The values obtained by us for the background parameters for each of the crystals in both the cases, i.e. keeping all parameters free (Table 2) and then keeping ω fixed (Table 3), differ significantly from those obtained by the COSINE-100 collaboration 1 , which is due to the degeneracy between the background parameters C, p 0 and p 1 in Eq. 1. We now present model comparison results using both these fits.

Frequentist Model Comparison
We carry out frequentist model comparison by first calculating the χ 2 values using Eq. 3 with the best-fit parameters for each model, summed over all the data points for all five crystals. Then, using the best-fit χ 2 and degrees of freedom, C(cpd/keV/kg) p 0 (cpd/keV/kg) p 1 (days)   (Wilks 1938) to quantify the p-value of the cosine model as compared to the background model. For our example, the difference in χ 2 between the two models satisfies a χ 2 distribution with degrees of freedom equal to three. From the cumulative distribution of χ 2 , we obtain the p-value from the χ 2 c.d.f. The corresponding significance or Z-score is calculated using the prescription in Cowan et al. (2011). High p-value and low Z-score indicate weak evidence against the null hypothesis. The χ 2 values per degree of freedom and the model likelihood given by the χ 2 p.d.f. calculated for each model for both the cases (ω varying and ω fixed) can be found in Tables 4 and 5 respectively along with the p-value and Z-score. As we can see, for both the cases (with ω varying and ω fixed), the difference in χ 2 between the two hypothesis is negligible. The significance of annual modulation with the same period as DAMA data is negligible (less than 1σ.)

Information Criteria
The Akaike Information Criterion value (AIC) is given by (Liddle 2007): The Bayesian Information Criterion is given by (Liddle 2007): where p is the number of free parameters, χ 2 min is the minimum χ 2 value and N is the total number of data points. The model with the smaller value of AIC and BIC is preferred. We then calculate the difference in AIC and BIC values between the H 1 and H 0 hypothesis, and evaluate the significance using the qualitative strength of evidence rules given in Shi et al. (2012). For the case with all modulation parameters free, the ∆AIC and ∆BIC values are tabulated in Table 4. We see that the modulation hypothesis has much smaller values for AIC and BIC, when ω is a free parameter. However, according to the strength of evidence rules, |∆AIC| and |∆BIC| have to be greater than 10 for the model with the smaller value to be decisively favored compared to the other. When ω is a free parameter, this criterion is not satisfied for AIC, so the better model cannot be decisively favored, whereas BIC test favors the null hypothesis. For the case of modulation period fixed, the ∆AIC and ∆BIC values are tabulated in Table 5. We get smaller values for both AIC and BIC for the null hypothesis of background-only model in this case. However, since the ∆AIC and ∆BIC values are less than 10, they also do not decisively favor any one model over the other. According to strength of evidence rules (Shi et al. 2012), BIC shows strong evidence for the background only hypothesis. Therefore, the background only hypothesis is mildy preferred using the information theory based tests.

Bayesian Model Comparison
We carry out a Bayesian model comparison by calculating the Bayes factor B 21 for the M 2 model in comparison to the M 1 hypothesis. Here, we consider the null hypothesis (H 0 ) to be M 1 and the cosine model (H 1 ) to be M 2 . The Bayes factor is given by (Trotta 2017): where P (D|M 2 ) and P (D|M 1 ) are the marginal likelihood or Bayesian evidence for M 2 and M 1 respectively given data D. Similar to the previous model comparison tests, we calculated the Bayes factor for two cases: when ω is fixed at the DAMA best-fit value as well as when ω is a free parameter. Unlike the previous three tests, this statistic does not use the best-fit value of the parameters. We first calculate the Bayesian evidence for both H 0 and H 1 using the multi-threaded Dynesty package (Speagle 2019) in Python, which uses the Dynamical Nested Sampling algorithm for calculating the Bayesian evidence (Feroz et al. 2009;Mukherjee et al. 2006). The likelihood function (P (D|M, θ)) for the combined data from all the five Fig. 2.-The CS100 data points (in black) with the fits calculated for sinusoidal modulation H 1 (t) (Eq. 1, in red) with period fixed at 365.25 days, and background-only model H 0 (t) (Eq. 2, in blue). As we can see, by eye it is hard to distinguish between the two models. The data was obtained from the COSINE-100 collaboration (private communication).
crystals, given the model and a set of parameters, is assumed to be a Gaussian: where H(t) is in the form described in Eq. 1, N is the total number of data points in each crystal, and the outer product is over the five crystals used for the analysis. We then multiply the likelihood by priors for all the background parameters as well as for A and t 0 and ω (when it is kept as a free parameter). We choose uniform priors between [0, 400], [0, 400], and [0, 30000] for the background parameters C, p 0 and p 1 respectively. These bounds are conservative and cover a huge swath of parameter space for the background parameters. Outside these bounds, the calculation of Bayesian evidence also does not always converge. For the signal parameters of the sinusoid, we use uniform priors for A between [0, 400], and [0, 360] for t 0 . When ω is kept as a free parameter, we choose a uniform prior between [0.0104, 0.428] rad/day, which corresponds to periods between 15 (bin size) and 600 days (maximum duration of the dataset).
The values of the Bayes factor for both the fits can be found in Tables 4 and 5. We find that in both the cases the Bayes factor is less than 1, indicating that the background model is favored over the cosine-based fit. We use the  TABLE 5 Summary of model comparison results using frequentist, information theoretic, and Bayesian criterion for H 0 (background only) and H 1 (background+cosine modulation with ω fixed.) According to the frequentist model comparison test, the χ 2 p.d.f (which represents the likelihood of the model given the data) for H 1 hypothesis is almost the same as that for H 0 hypothesis. The null hypothesis has a smaller value of AIC and BIC, but the difference does not cross the threshold of 10 for any one model to be decisively favored. The BIC test shows strong evidence for H 0 hypothesis. However, the Bayesian model comparison test based on the calculation of Bayes factor, provides very strong evidence for the null hypothesis using the Jeffreys scale.
Jeffrey's scale (Trotta 2017) for a qualitative interpretation of the Bayes factor. Since | ln(B 21 )| > 10, in both the cases, this provides a strong evidence in favor of the null hypothesis.

CONCLUSIONS
Recently, the COSINE-100 experiment, designed to test the DAMA/LIBRA annual modulation hypothesis, released their first results from their search for annual modulation, induced from dark matter scatterings, using 1.7 years of data, with a total exposure of 97.7 kg years . They find that the data in the 2-6 keV energy interval is consistent with both the null hypothesis of no modulation as well as with the DAMA estimate of amplitude and phase at 68.3% c.l.
In this work, we apply (similar to the analysis done in Krishak et al. (2019) for the DAMA/LIBRA data) three independent model comparison techniques, viz. frequentist, Bayesian and information theory-based, to test the compatibility of the data with annual modulation over a background-only hypothesis.
For the signal hypothesis, we did two different sets of fits. For one fit, we kept the period (or angular frequency) same as the DAMA best-fit value of one year or 0.0172 radians/day. For the other fit, the period was also kept as a free parameter.
Our results using all the three techniques are tabulated in Tables 4 and 5 respectively. When ω is a free parameter, the BIC test decisively prefer the background-only model, whereas the significance of both the hypotheses is almost the same for the AIC and frequentist test, and so its not possible to favor any one model from these tests.
When the period is fixed to the DAMA best-fit value of 1 year, we find that the BIC test strongly favors the background only hypothesis, but the difference in BIC does not cross the threshold of 10 for it to be decisively favored. With the frequentist test, the difference is negligible. With more data it remains to be seen if the significance increases with the frequentist and information theory based tests.
On the other hand, when we do the model comparison using Bayesian method of computing the Bayes factor, we find that the data strongly favors a constant background over a cosine fit (irrespective of whether ω is free or not). This is the first proof of principle application of Bayesian and information theory based model comparison techniques to the COSINE-100 data and is complementary to the statistical tests done in the COSINE-100 results paper. To promote transparency in data analysis, we have made our analysis codes and data publicly available, which can be found at https://github.com/aditikrishak/COSINE100_analysis.

ACKNOWLEDGEMENTS
Aditi Krishak has been supported by DST-INSPIRE fellowship. We are grateful to Jay Hyun Jo and the COSINE-100 collaboration for providing us the raw data used in Fig 2 and sharing with us their best-fit values of the background parameters. We acknowledge useful correspondence with Josh Speagle regarding the dynesty algorithm. We are also grateful to the anonymous referee for constructive feedback on the manuscript.