Discrete Chi-square Method for Detecting Many Signals

Unambiguous detection of signals superimposed on unknown trends is difficult for unevenly spaced data. Here, we formulate the Discrete Chi-square Method (DCM) that can determine the best model for many signals superimposed on arbitrary polynomial trends. DCM minimizes the Chi-square for the data in the multi-dimensional tested frequency space. The required number of tested frequency combinations remains manageable, because the method test statistic is symmetric in this tested frequency space. With our known tested constant frequency grid values, the non-linear DCM model becomes linear, and all results become unambiguous. We test DCM with simulated data containing different mixtures of signals and trends. DCM gives unambiguous results, if the signal frequencies are not too close to each other, and none of the signals is too weak. It relies on brute computational force, because all possible free parameter combinations for all reasonable linear models are tested. DCM works like winning a lottery by buying all lottery tickets. Anyone can reproduce all our results with the DCM computer code. All files, variables and other program code related items are printed in magenta colour. Our Appendix gives detailed instructions for using dcm.py. We also present one preliminary real use case, where DCM is applied to the observed (O) minus the computed (C) eclipse epochs of a binary star, XZ And. This DCM analysis reveals evidence for the possible presence of a third and a fourth body in this system. One recent study of a very large sample of binary stars indicated that the probability for detecting a fourth body from the O-C data of eclipsing binaries is only about 0.00005.


INTRODUCTION
The Discrete Fourier Transform (DFT), also called the power spectrum method, is one of the most frequently applied period analysis methods in natural sciences (e.g. Lomb 1976;Scargle 1982;Zechmeister & Kürster 2009). The above DFT versions rely on the assumption that the data contains no trends, and the correct model is one sinusoidal signal. In the Lomb-Scargle version, the mean of the data is removed before DFT computation, while the Zechmeister-Kürster version gives an unambiguous value for this mean. Systematic trends in the data must be removed before DFT analysis, like in the Kepler satellite light curve detrending pipelines PDC-MAP (Murphy 2012) or ARC2 (Aigrain et al. 2017). However, this removal of trends is not trivial, and it can seriously mislead the period analysis (e.g. Olspert et al. 2018).
Since DFT searches for one period at the time, we call it a one-dimensional period finding method. All trends in the data must be removed before applying DFT. After this detrending, the DFT search for many pure sinusoidal signals usually relies on "pre-whitening". In this technique, the highest DFT periodogram peak gives the best period for the detrended original data. The sinusoidal model with this best period is subtracted from these detrended data. The next second best period is determined with the DFT analysis of the residuals. This second best period gives the sinusoidal model for the residuals, and the next residuals for DFT analysis.
Countless DFT studies have been published in natural lauri.jetsu@helsinki.fi a This python program dcm.py and the other three necessary files are freely available in Zenodo database: doi 10.5281/zenodo.3661072. All files, variables and other program code related items are printed in magenta colour. Our Appendix gives detailed instructions for using dcm.py.
sciences. Since Astrophysics Data System (ADS) alone contains over four thousand citations to the DTF version by Scargle (1982), we mention only some recent astronomical DFT studies: Analysis of Kepler satellite light curves (e.g. Reinhold & Reiners 2013), Planet detection from radial velocities (e.g. Mayo et al. 2019), Variable star identification in large surveys (e.g. Pawlak et al. 2019) and Stellar pulsations (e.g. Mellon et al. 2019).
There are other period finding methods that can search for more complicated models than a simple sinusoid, like the Three Stage Period Analysis (Jetsu & Pelt 1999, TSPA) or the Continuous Period Search (Lehtinen et al. 2011, CPS). However, these methods can also only detect one signal at the time.
Our DCM can detect many signals superimposed on arbitrary polynomial trends. While DFT can detect only first order harmonic signals of pure sinusoids, our DCM can also detect much more complicated signals composed of any arbitrary order harmonics. We formulate DCM in Sects. 2 and 3, and test it with simulated data in Sect. 4. We demonstrate how DCM can unambiguously detect a sum of three sinusoids superimposed on a second order polynomial trend (Sect. 4.1), and how this best model can be identified among many alternative nested models for the data (Sect. 4.2). The consequences of searching for too many, or too few, signals are discussed (Sects. 4.3 and 4.4). Finally, we determine the data constraints for an unambiguous DCM analysis (Sect. 4.5). All these results can be reproduced with the DCM program code input and output specified in our appendix (Table A1). One DCM real use case is also presented (Sect. 5).

MODEL
The data are y i = y(t i )±σ i , where t i are the observing times and σ i are the errors (i = 1, 2, ..., n). The time span of data is ∆T = t n −t 1 . The notations for the mean and the standard deviation of y i are m y and s y . Before modelling, we subtract the first observing time t 1 from all observing times t i . Hence, the zero point in time, t = 0, is at t 1 . Our model is where This model searches for two patterns in the data: the periodic h(t) pattern that repeats itself, and the aperiodic p(t) pattern that does not. The K 1 harmonic h i (t) signals have a frequency f i and an order K 2 . The sum of these signals is superimposed on the K 3 order polynomial trend. The number of free parameters is p = K 1 × (2K 2 + 1) + K 3 + 1. The 2t/∆T argument in p k (t) ensures that the scale in the polynomial coefficients M 0 , ..., M K3 is the same as in the amplitudes B 1,1 , C 1,1 , ..., B K1,K2 , C K1,K2 of h i (t) harmonics. With this scaling, the higher polynomial orders can not dominate p k (t) >> h i (t), nor become insignificant p k (t) << h i (t), for any arbitrary unit of time t. Therefore, the simulated values of all these free parametersβ II can later be drawn from the same uniform random distribution (Sect. 4: Eq. 26).
The model residuals give the Chi-square and the sum of squared residuals For each h i (t) signal, we determine the parameters The first observing time t 1 , which is removed before modelling, is added back to the above four epochs. The P i and A i values are the same for any zero point t = 0. In our figures, we use the same colours for the frequencies, the amplitudes, the curves and the periodograms of the same h i (t) signal. We give those colours in Table 1.

METHOD
If the errors σ i are known, the test statistic of our period finding method is where χ 2 is minimized for the linear g(t) model having the fixed testedβ I = [f 1 , f 2 , ..., f K1 ] frequencies.
If the σ i errors are unknown, we use The core of DCM approach is that the test statistic z in Eqs. 10 and 11 refers indirectly to the reduced Chisquare (Barlow 1993;Andrae et al. 2010). Our DCM computer code dcm.py minimizes z. For any data, the possible alternative nested models that can be tested with dcm.py are 0 ≤ K 1 ≤ 6 ≡ From one to six periodic signals 1 ≤ K 2 ≤ 2 ≡ Harmonic signal orders 0 ≤ K 3 ≤ 6 ≡ Polynomial trend orders.
Any arbitrary pair, g 1 (t) and g 2 (t), of these nested models can be compared. We use the number of free parameters (p 1 < p 2 ), the Chi-squares (χ 2 1 , χ 2 2 ), and the sum of squared residuals (R 1 , R 2 ) of these two models to determine which one of them is a better model for the data. If the errors σ i are known, our test statistic is If these errors are unknown, we use The F χ or F R test statistic is used to identify the better model for the data. The null hypothesis is H 0 : "The model g 2 (t) does not provide a significantly better fit to the data than the model g 1 (t)." Under H 0 , both F χ and F R have an F distribution with (ν 1 , ν 2 ) degrees of freedom, where ν 1 = p 2 − p 1 and ν 2 = n − p 2 (Draper & Smith 1998). The probability for F = F χ or F = F R reaching a fixed level F 0 is called the critical level Q F = P (F ≥ F 0 ). We reject the H 0 hypothesis, if where γ F is a pre-assigned significance level. For K 1 = 2 signals, the z(f 1 , f 2 ) = z(f 2 , f 1 ) symmetry requires only the testing of f 1 > f 2 combinations. The six respective symmetries z Fig. 1d). Hence, we test only the f 1 > f 2 > f 3 > f 4 > f 5 > f 6 combinations.
In our long frequency interval search, we test an evenly spaced long grid of n L frequencies between f min = P −1 max and f max = P −1 min (Figs. 1a-f: higher longer rows). The best frequency candidates f 1,mid , ..., f K1,mid at the z minimum give the mid points for the denser evenly spaced short grids of n S tested frequencies ( Fig. 1: diamonds). The intervals of these short grids are The suitable values are a = c (f max − f min )/2, where the width is 5% ≡ 0.05 ≤ c ≤ 0.20 ≡ 20% of the long test interval (Figs. 1a-f: lower shorter rows). The best frequencies are at the global minimum of the periodogram Some graphical presentation of the full z periodogram would be possible only for the one z(f 1 ), the two z(f 1 , f 2 ) and the three z(f 1 , f 2 , f 3 ) dimensional cases. We solve these dimensional problems by presenting only the following one-dimensional slices of the full periodograms z 1 (f 1 ) = z(f 1 , f 2,best , ..., f K1,best ) z 2 (f 2 ) = z(f 1,best , f 2 , f 3,best , ..., f K1,best ) z 3 (f 3 ) = z(f 1,best , f 2,best , f 3 , f 4,best , ..., f K1,best ) (17) z 4 (f 4 ) = z(f 1,best , f 2,best , f 3,best , f 4 , f 5,best , f K1,best ) z 5 (f 5 ) = z(f 1,best , f 2,best , f 3,best , f 4,best , f 5 , f K1,best ) z 6 (f 6 ) = z(f 1,best , f 2,best , f 3,best , f 4,best , f 5,best , f 6 ) All best frequencies fulfill f i,best > f i+1,best , because we test only frequencies f 1 > f 2 > f 3 > f 4 > f 5 > f 6 . Therefore, every z i (f i ) periodogram ends at the minimum of the next z i+1 (f i+1 ) periodogram (e.g. Fig. 2: upper panel). We perform a linear least squares fit to the data with the fixed numerical values of the best frequenciesβ I,Initial = [f 1,best , f 2,best , ..., f K1,best ] detected in the short interval search. This gives us the unambiguous estimates for the values of the other free parameters β II,Initial . We determine the final estimates for the free parameters with the standard non-linear least squares iterationβ whereβ Initial = [β I,Initial ,β II,Initial ].
The errors for the model parameters are determined with the bootstrap procedure (Efron & Tibshirani 1986;Jetsu & Pelt 1999). For the original data, we test all frequency combinations within the short intervals of Eq. 15. During each bootstrap round, we select a random (a) Six highest longer rows show tested frequencies f 1 > f 2 > f 3 > f 4 > f 5 > f 6 for K 1 = 6 signals and n L = 40 in a long search between P −1 max and P −1 min (vertical red lines). Six lower shorter rows show short search frequencies for n S = 10 and c = 0.10, where diamonds denote best frequency candidates f i,mid detected in long search. Symbol colours are given in Table 1. (b-f) Arbitrary tested frequencies for K 1 = 5, 4, 3, 2 and 1 signals with given n L , n S and c combinations. sample¯ * from the residuals¯ of this best g(t) model for the original dataȳ (Eq. 7). Any i value can enter into this random sample¯ * as many times as the random selection happens to favour it. These random residuals give the artificial data sample during each bootstrap round. The best model for each artificialȳ * random data sample gives one estimate for every model parameter. The error estimate for each particular model parameter is the standard deviation of all estimates obtained for this parameter in all bootstrap rounds.

SIMULATED DATA
We test our method with simulated data.
4.1. One simulated model Here, we show that our method can detect the correct model parameter values. We illustrate this with the following model having known free parameter values where K 1 = 3 and K 3 = 2. The order of the three sinusoidal h i (t) signals is K 2 = 1. The adopted known free parameter values f 1 = 1/P 1 , f 2 = 1/P 2 , f 3 = 1/P 3 , T 1 , T 2 , T 3 , A 1 , A 2 , A 3 , M 0 , M 1 and M 2 are given in Table 2.  Table  3 data. Colours for z 1 (f 1 ), z 2 (f 2 ) and z 3 (f 3 ) are given in Table 1.
The simulated n = 500 time points t i are drawn from a uniform random distribution between 0 and ∆T = 4. Evenly spaced observations y i coinciding with the sinusoid g i = a sin 2πt i fulfill a = 2 3/2 s y , where s y is the standard deviation of y i = g i (Jetsu et al. 2013). The peak to peak amplitude of such a sinusoid fulfills A = 2a = 2 5/2 s y . This relation also holds for cosine, double sine and double cosine curves. Therefore, we compute an estimate for the peak to peak amplitude A of the simulated periodic signal from the standard deviation s y of the sum h(t i ) of all h i (t i ) signals in Eq. 20. If σ m is the mean of all data errors σ i , the signal to noise ratio SN = A/σ m = 2 5/2 s y /σ m gives σ m = 2 5/2 s y /SN (22) for the accuracy of simulated data. For our chosen fixed SN = 100 level, we draw the simulated data errors σ i from a Gaussian distribution where m = 0 and s = σ m . The numerical values for one arbitrary sample of simulated data 1 are given in Table 3. We perform the period analysis for the simulated data of Table 3 over the long tested period interval between P min = 1 and P max = 2. The test statistic z of Eq. 10 is computed for the g(t, K 1 = 3, K 2 = 1, K 3 = 2) model of Eq. 1, which will be later referred to as "model 19" (see Table 4). The z 1 (f 1 ), z 2 (f 2 ) and z 3 (f 3 ) periodograms are shown in Fig. 2, where all three minima are clearly separated. The simulated input and the detected output model parameter values are given in Table 2. They agree perfectly. This best detected model 19 for the simulated 1 Simulated Table 3 data are in file TestData.dat in Zenodo. data is shown in Fig. 3a. The residuals are stable and show no systematic trends ( Fig. 3b: blue circles). The results for the frequencies (f 1 , f 2 , f 3 ) and the amplitudes (A 1 , A 2 , A 3 ) of this best model are shown in Fig. 4. The bootstrap estimates for the frequencies and the amplitudes show linear correlations. These linear correlations indicate that any shift away from the correct model in one frequency or amplitude is compensated by a shift in all other frequencies and amplitudes.

Identifying the best model
Here, we show how the best model for the data can be identified among the many alternative nested models. In the previous Sect. 4.1, we simulated the data of Table  3 with a model having K 1 = 3, K 2 = 1 and K 3 = 2 (Eq. 20). Since we knew that this model was used in creating these data, we used the test statistic z for the g(t, 3, 1, 2) model in our period analysis. If the simulated data were real data, we would not necessarily know this correct K 1 = 3, K 2 = 1 and K 3 = 2 combination.
Let us assume that the data of Table 3 were real data, and the correct K 1 , K 2 and K 3 combination would be unknown. In that case, we would have to test numerous alternative models. Therefore, we test all 1 ≤ K 1 ≤ 4, 1 ≤ K 2 ≤ 2 and 0 ≤ K 3 ≤ 3 combinations for the data of Table 3. These 32 alternative models are compared in Table 4, where we compute the values for their statistical parameters χ 2 , F χ and Q F . We compare the "correct model 19" to all other 31 alternative models.
This correct model 19 has p 2 = 12 free parameters. The fifteen alternative models 1-13 and 17-18 have p 1 < p 2 = 12. The critical levels for all these fifteen alternative models are so low that they fall below the computational 2 accuracy of 10 −16 (Table 4: Q F < 10 −16 ). Hence, any correct model must have at least p = 12 free parameters, and we have to reject the better model H 0 hypothesis presented in Sect. 3. The correct model 19 is certainly better than any of these fifteen alternative models.
We give no F χ or Q F estimate (Eqs. 12 and 14) for model 14, because it has the same number of free parameters as the correct model 19. However, model 19 is definitely better, because its χ 2 = 496.10 is smaller than the χ 2 = 750.28 for the alternative model 14 (Table 4).
All remaining fifteen alternative models have more free parameters than the correct model 19. Hence, the number of free parameters for this model 19 becomes p 1 = 12 in Eq. 12, while that for the other models becomes p 2 .
The correct model 19 is better than the following four alternative models 15, 21, 25 and 26, because they all have higher χ 2 values (Table 4). We refer to these four models as F χ < 0, and use Q F = 1 for their critical level, because there is certainly no reason to reject the better model H 0 hypothesis presented in Sect. 3.
The remaining eleven alternative models 16, 20, 22-24 and 27-32 have lower χ 2 values than the correct model 19. However, their critical levels Q F are far above γ F = 0.001 (Eq. 14). This means that the better model H 0 hypothesis is not rejected, and the correct model 19 is also better than all these eleven alternative models.
In general, the χ 2 = 496.10 value for model 19 is already so close to n − p 1 = 500 − 12 = 488 that it is sta-  Table 3 data. (a) Data y i ± σ 1 (black circles), and g(t) and p(t) curves. (b) Data minus p(t), h(t), h 1 (t), h 2 (t) and h 3 (t) curves. Residuals (blue circles) are offset to y = −1.6 level. Colours of all curves are given in Table 1 TABLE 1 Figure notations and colours.

Frequencies, Amplitudes
Functions Circles 1 h 6 (t) Continuous line -z 6 (f 6 ) Continuous linetistically impossible to reach significant high F χ values with more complex p 2 > p 1 models, because increasing p 2 can not actually decrease χ 2 a lot. The second best model for the data is model 20 reaching Q F = 0.078 (Table 4). It resembles the correct model 19. Except for the third order polynomial coefficient M 3 , the other free parameters of this model 20 are exactly the same as those of the correct model 19 ( Table 2). As explained in Sect. 2, the scale of all polynomial trend p(t) coefficients M 1 , M 2 and M 3 is the same, which means that equal absolute values for these coefficients cause the same p(t) change during ∆T . The |M 3 | = 0.21 coefficient of model 20 is much smaller than the |M 1 | = 1.1 and |M 2 | = 1.8 coefficients, which means that the first and second order trends dominate over the third order trend. The P 1 = 1.104 ± 0.003 period of model 20 agrees with the simulated P 1 = 1.1 period of the g S1 (t) model, but the results for the P 2 and P 3 periods do not. These results confirm that even a minor deviation away from the correct p(t) trend can mislead the period analysis.  Model 20  We conclude that the best model 19 for the data can be unambiguously identified among all alternative 32 nested models.

Searching for too many signals
The simulated data of Table 3 contains only three signals. Here, we check what happens, if four signals (i.e. too many signals) are searched for in these data.
These dispersing large amplitude curves nearly cancel out each other, which gives a reasonable χ 2 = 490.95 value ( Table 4). The bootstrap results show that the dotted frequency error lines intersect the thick green continuous f 1 = f 2 and f 2 = f 3 diagonal lines (Fig. 7). We  Table 3 data. Otherwise as in Fig. 2. refer to this as Intersecting frequencies.
Model 27 fails, because the data do not contain four signals, but only three. Actually, all four signal models consistently fail. Nearly two thirds of the two, the three and the four signal models fail (Table 4: Fail="Yes" for 15 models out of 24). All these failed models are just an additional proof for that model 19 is the best model for Table 3 simulated data. In fact, we could have rejected these failed models without ever computing their χ 2 estimates. Furthermore, the rejected one signal models 1-8 with very high χ 2 can not have Fail="Yes" or "No". These one signal models simply can not have "intersecting frequencies" or "dispersing amplitudes", because this plural alternative is impossible. For these one signal models, there is no need for applying the frequency and the amplitude criteria, which will be introduced later (see Eqs. 28 and 29). An unambiguous separation between the signal and the trend is easiest when the one signal model is the correct model.

Finding too few signals
The simulated data of Table 3 contains three signals. Yet, the two signal K 1 = 2, K 2 = 1 and K 3 = 0 model 9 periodograms z 1 (f 1 ) and z 2 (f 2 ) merge, and show only the minimum of one period (Fig. 8). The black g(t) curve of this model 9 makes no sense (Fig. 9a). The red h 1 (t) and the blue h 2 (t) signal curves disperse, and the blue residuals show regular variation (Fig. 9b). The f 1 and f 2 frequencies intersect, and the A 1 and A 2 amplitudes disperse (Fig. 10). This model fails, because the use of K 3 = 0 order p(t) polynomial totally ignores the real trend in the data. This idea is supported by the fact that all two, three and four signal models having K 3 = 0 consistently fail (Table 4: models 9, 13, 17, 21, 25 and 29).
These results show that even if two signals are not detected in the data, this does not mean that the correct number of signals can not be three or even more. The detection of the correct number of signals depends on  Table 3 data. Otherwise as in Fig. 3. the selection of the correct trend. Wrong p(t) trend can eliminate real signals. The results in Table 4 indicate that the false detection of too many signals is inprobable, because all four signal models 25-32 fail. For all 32 nested models of Table 4, the false detection of too few signals is more probable than the false detection of too many signals, because the two signal models fail only two times out of eight (only models 9 and 13), but the four signal models fail eight times out of eight (all models 25-32) 4.5. Many simulated models Here, we create artificial data with many simulated models having random signal frequencies. We show that our method can retrieve the known input parameters of these models. The K 1 simulated f i frequencies are selected from a uniform random distribution between f min = 1/P max and f max = 1/P min , where P min = 1 and P max = 2. These random frequencies are rearranged    Table 3 data. Otherwise as in Fig. 4.
The above signal amplitudes and polynomial coefficients  Table 1. giveβ II for the simulated g(t) model (Eq. 1). All free parameters of this simulated g(t) model areβ = [β I ,β II ].
The simulated n = 500 time points t i are drawn from a uniform random distribution of Eq. 21, where ∆T = 4.
The chosen signal to noise ratio SN and the standard deviation s y of all h(t i ) give the accuracy σ m of the simulated data (Eq. 22). The n errors σ i for the simulated data are drawn from the Gaussian distribution of Eq. 23.
Finally, the simulated data are We use the three signal model 18 (K 1 = 3, K 2 = 1, K 3 = 1) to produce simulated data y i of Eq. 27. Our sample size is n = 500 and the signal to noise ratio is SN = 100. The results for thirty model 18 simulations are shown in Fig. 11. If this DCM analysis of ours succeeds, the simulated frequencies f i,sim and the detected frequencies f i,det in these samples should coincide with the continuous equal value diagonal lines.
The transparent diamonds in Fig. 11 highlight models having at least one simulated frequency pair that fulfills where f crit = 0.05 and i = 1 or 2. These signal frequencies differ less than ±5% in the tested frequency range between f min and f max . The models for these particular simulated samples may fail due to the dispersing amplitudes and the intersecting frequencies discussed in Sects. 4.3 and 4.4. As expected, some of these highlighted detected frequencies f i,det and amplitudes A i,det do deviate from the equal value levels in Fig. 11. The transparent circles in Fig. 11 highlight models where A crit = 0.5 and A max is the highest value of all signal amplitudes A i (i = 1, 2, 3). For these particular simulated samples, signal detection becomes more difficult, because at least one signal is two times weaker than the strongest signal. Again, as expected, the detected frequencies f i,det and amplitudes A i,det for some these highlighted samples deviate from the diagonal equal value   Fig. 11.

lines in
Clearly, the simulated model signal frequencies (Eq. 28: f i,sim ) and amplitudes (Eq. 29: A i,sim ) determine the success of DCM analysis. We can confirm this important result from the relative error It measures the error for the detected frequency f i,det in the units of the simulated frequency f i,sim . We compute the mean of relative errors σ fi,rel for m = 100 simulated samples y i produced with model 18, n = 500 and SN = 100 (Eq. 27). The results are given in Table 5 (Lines 1-3). The mean of relative error σ fi,rel decreases when models fulfilling criterion of Eq. 28 are removed (m = 100 → 69). It decreases even more when models fulfilling criterion of Eq. 29 are also removed (m = 69 → 48). These general results are consistently confirmed with doubled signal to noise ratio SN = 100 → 200 (Table 5: Lines 4-6) and doubled sample size n = 500 → 1000 (Table 5: Lines 7-9). This confirms that our DCM can detect the correct frequencies when 1. Signal frequencies are not too close (Eq. 28).
2. None of the signal amplitudes is too weak (Eq. 29).
3. Sample size n and signal to noise ratio SN are sufficient (Table 5).
If these correct frequencies are detected, the values for the remaining other model parameters will also be cor-  5 Mean of relative errors σ f 1 ,rel , σ f 2 ,rel and σ f 3 ,rel for one hundred model 18 simulations with n = 500 and SN=100. Lines 1-3 give results for all models (m = 100), when Eq. 28 criterion models are removed (m = 100 → 69), and when Eq. 29 criterion models are also removed (m = 69 → 48). Signal to noise ratio SN is doubled on next three lines.
Sample size n is doubled on next three lines. Eqs. 28 and 29 0.0034 0.0077 0.0041 40 rect, because linear modelling is always unambiguous. Nevertheless, failing to detect even a single correct frequency can seriously mislead the period analysis (e.g. Figs. 6 and 9).

REAL USE CASE
We also provide one example of preliminary real case use of DCM.
Periodic changes occur in the observed (O) minus the computed (C) eclipse epochs of binaries. The most probable causes for such periodicities are a third body (e.g. Li et al. 2018), a magnetic activity cycle (e.g. Applegate 1992) or an apsidal motion (e.g. Borkovits et al. 2005). Recently, Hajdu et al. (2019) searched for third bodies in a large sample of about 80 000 eclipsing binaries. The model in their one-dimensional period analysis of O-C data was a pure sinusoidal signal superimposed on a second order polynomial (Hajdu et al. 2019, Eq. 11). They discovered 992 hierarchial triple systems, but only four candidates possibly having a fourth body. In other words, the probability for finding a fourth body in their large sample was about 4/80 000 = 0.00005.
Our data are the observed (O) minus the computed (C) primary minimum epochs of XZ And, which were retrieved from Lichtenknecker-Database of the BAV 3 in September 2019. The primary (A4 IV-V, 3.2M , 2.4R ) and the secondary (G IV, 1.3M , 2.6R ) of this binary orbit each other during P orb = 1.357 days (Demircan et al. 1995 Jetsu 12. We detect the periods P 1 = 13418 d =37 y and P 2 = 32192 d =88 y . These two signals are shown in Fig.  13. For example, Demircan et al. (1995) and Chaplin (2019) have also detected the P 1 periodicity. It seems that our DCM can easily detect evidence for the possible presence of a fourth body in XZ And. This result illustrates the potential of DCM in detecting many signals superimposed on unknown polynomial trends. From the DCM point of view, the most important currently unsolved questions are 1. How many K 1 signals do these data contain? 2. What is the correct K 3 trend?
Since the aim of the current paper is to present DCM, the final results of this preliminary real use case will be presented in a future work. For this reason, no DCM files of this case are published here.

DISCUSSION
The main point of DCM is that our periodic non-linear model becomes linear when the grid of constant tested frequencies is fixed. All analysis results become unambiguous, which guarantees the success of DCM. Actually, we can now present a general numerical solution for any non-linear g(t,β) model. This non-linear model may be periodic, aperiodic, or combination of both, like the DCM model. Our simple recipe is 1. Divide the free parametersβ to two parts: a: Those that make the model nonlinear =β I b: The rest of the free parameters =β II 2. Fix the testedβ I grid.
3. Test all reasonable linear models.
4. Identify the best model among these models.

Solve the model parameter errors with bootstrap.
Another main point of DCM is the z test statistic symmetry in the K 1 -dimensional frequency space. Without this symmetry, our period search would literally resemble the search for a needle in a haystack for higher number of signals. For example, the six signal models have K 1 ! = 6! = 720 symmetries, which give the same number equally good alternative z periodogram minima in sixdimensional frequency space. This z symmetry allows us to test only a single frequency combination, and to get rid of the other irrelevant K 1 ! − 1 frequency combinations. Whatever the correct real frequency values may be, they can always be rearranged into a decreasing order f 1 > f 2 > ... > f K1 . Therefore, we test only the combinations of all those one-dimensional frequency intervals that do not overlap. We never have to bother about the rest (K 1 ! − 1)/K 1 ! of the entire frequency space, because nothing new can be found out there.
All periodogram minima of model 19 are steep in Fig.  2, which means that χ 2 with the initial estimateβ Initial is already very close to its possible minimum value before the non-linear minimization iteration of Eq. 18 even begins. The z i (f i ) periodograms in all Figs. 2, 5 and 8 display no sudden jumps, because there is strong correlation between the χ 2 values for tested frequencies close to each other. If the grid of tested constant frequencies is already sufficiently dense, there is no sensible "escape" away from the minima of these continuous, stable and unambiguous z i (f i ) periodogram curves for linear models. Thus, the non-linear iteration of Eq. 18 can not very much improve the χ 2 estimate, because the search for the best model solution is already nearly over. For example, the χ 2 estimates for all 32 models of Table 4 are practically the same with, or without, the non-linear iteration alternative 4 when the frequency grid parameters are fixed to n L = 60, n S = 30 and a = 0.20. Accurate model parameter estimates, including the frequencies, can already be obtained with linear models when the tested frequency grid is not too sparse. We conclude that the nonlinear iteration of Eq. 18 is not always needed. However, the word "Discrete" in our DCM abbreviation could be replaced with the word "Continuous" when this non-linear iteration of Eq. 18 is applied.
There are correlations between the signal frequencies and amplitudes of the correct model 19 (Fig. 4). If an estimate for even one of these parameters shifts away from the correct value, the remaining other estimates tend to compensate this shift with their own shifts away from their correct values. These shifts may mislead the DCM period analysis, or at least increase the bootstrap error estimates, if the tested frequency grid is too sparse in the long or the short search, or in the bootstrap. This possibly misleading effect can be eliminated with denser tested frequency grids, but then the detection of many signals requires a lot of computation time, because the total number of tested frequency combinations is proportional to n K1 L and n K1 S . However, this "wasted" 5 computation time becomes irrelevant, if the correct frequencies are detected, because the results for all other model parameters become unambiguous. The patience required in testing all possible parameter combinations, as well as all reasonable linear models, is amply rewarded.
We identify the best model for the data among all 32 alternative nested models with the simple χ 2 -test and/or the standard Fisher-test (Sect. 4.2). The correct model 19 has p = 12 free parameters. We can establish this required minimum number of free parameters with absolute certainty, because the critical levels for all fifteen models having less than p = 12 free parameters are below the computational accuracy (Table 4: Q F < 10 −16 ). The χ 2 and/or Q F values for all sixteen remaining alternative p ≥ 12 models confirm that model 19 is the best model. Furthermore, model 19 is certainly better than the fifteen failed many signal DCM models, which can be easily identified from their dispersing amplitudes and intersecting frequencies (Sects. 4.3 and 4.4).
While DFT can detect only pure sinusoidal signals, our DCM can also detect more complicated h i (t) signals. Except for the trivial n + 1 ≥ p condition, there is no theoretical K 2 upper limit for the signal order that can be detected with the DCM. Our dcm.py code can detect only K 2 = 1 and K 2 = 2 order signals. However, the users of our code should be aware of the problems arising in the DCM period search for higher order K 2 > 1 signals. These signals are sums of (32) where j = 1, ..., K 2 and P i = 1/f i . We denote the peak to peak amplitudes of these h i,j (t) signals with A i,j . The following problems arise with these higher order models: 1. Let us assume that the second order K 2 = 2 model is the correct model for the data.
1a. If the A i,1 /A i,2 ratio of the h i (t) signal approaches zero, the K 2 = 2 order model may detect the correct P i period, or the wrong half P i /2 period. Both of these periods are equally good for A i,1 /A i,2 = 0.
1b. If the h i (t) signal ratio A i,2 /A i,1 approaches zero, the K 2 = 2 order model may detect the correct period P i , or the wrong double period 2P i . For A i,2 /A i,1 = 0, both periods are equally good.
1c. If a wrong model, like the K 2 = 1 model, is applied to these data, the correct P i period may, or may not, be detected. Then again, the one-dimensional DTF analysis based on this K 2 = 1 model also suffers from these half and double period 1a-c problems (e.g. Reinhold & Reiners 2013).
2. Evidently, more complicated problems than the above 1a-c problems arise with the higher K 2 > 2 order signals. We emphasize that an unambiguous solution for the these problems directly from the periodograms of a single model can fail, because this model may not be the correct model for the data. Nevertheless, there is an unambiguous solution for these problems afterwards: the Fisher-test comparison of many alternative nested models. It would have been possible for us to code a DCM version that first tests all chosen K 1 , K 2 and K 3 model combinations, and then performs the above Fisher-tests, like the comparison of 32 nested models in our Table 4. We decided not to code this tedious alternative into our current dcm.py version, because the users can compare the DCM results for different nested models with our fisher.py program.
3. For all these higher K 2 > 2 order complex signals, the probability for detecting a wrong period P i for any h i (t) signal also depends on the quality (σ i ) and the quantity (n) of data, as well as on the frequencies of the other K 1 − 1 signals (Eq. 28), and the amplitudes of these other signals (Eq. 29).
4. There is also one computational aspect, why we decided not to code the cases K 2 > 2 into dcm.py. It would be easy to compute the numerical bootstrap error estimates for the signal periods and amplitudes of higher order K 2 > 2 models. However, the bootstrap error estimates for the minimum and the maximum epochs would not be easy to compute for these K 2 > 2 models. For example, a secondary minimum may be present or absent in some K 2 = 2 model bootstrap samples. Or, the primary and the secondary minima may switch in some bootstrap samples, if the double wave of the K 2 = 2 model has two equally deep minima. In fact, we had already solved these K 2 = 2 model primary and secondary minimum epoch problems 6 earlier (e.g. Jetsu & Pelt 1999, Figs. 2 and 4). The bootstrap solutions for these minimum and maximum epoch errors are much more complicated for K 2 > 2 models.
DCM solves these tasks more directly than DFT: 1. One signal data without trends: DFT finds the correct period. Then the data are modelled with a sinusoid having this period. DCM achieves this directly with the g(t, 1, 1, 0) model.

One signal data with trends:
After trend removal, DFT may, or may not, find the correct period. Then a sinusoid with this period is fitted to the detrended data. DCM achieves this directly with the g(t, 1, 1, K 3 ) model for any K 3 :th order polynomial trend.
3. Many signal data with trends: After trend removal, the DFT pre-whitening technique may, or may not, determine the correct sequence of periods one after another. Then sinusoids having these periods are fitted to the detrended data. DCM achieves this directly for any order of polynomial trends, any number of signals, and any harmonic order signals, including one harmonic order pure sinusoidal signals.
In all these cases 1-3, DFT and DCM both obtain the final result by minimizing the χ 2 test statistic. DCM can find the global χ 2 minimum, if all differences between signal frequencies are not too small (Eq. 28), and none of the signal amplitudes is too low (Eq. 29). We show that in this case the correct simulated frequencies can be detected, and their accuracy is only improved for higher signal to noise ratio SN and larger sample size n (Fig. 11, Table 5). When these correct frequencies are detected, all other model parameters are also correct, because their linear least squares fit solutions are unambiguous.

CONCLUSIONS
The frequently applied Discrete Fourier Transform (DFT) can detect periodicity in unevenly spaced data. Unambiguous signal detection succeeds only if the data contains no trends and a sinusoid is the correct model (Lomb 1976;Scargle 1982;Zechmeister & Kürster 2009). DFT can not directly detect many signals superimposed on unknown trends, but our Discrete Chi-Square Method (DCM) can. Our model for the data is the sum g(t) = h(t) + p(t), where h(t) contains the signals and p(t) is the polynomial trend. The former periodic part repeats itself, but the latter aperiodic part does not. Our g(t) model is non-linear, but it becomes linear when the frequencies of h(t) are fixed to their constant numerical tested frequency grid values. These linear models give unambiguous results. We spoil the fun of traditional time series analysis with our brute numerical approach, because we test all possible free parameter values for all reasonable linear models. We can also identify the best model for the data among all alternative nested models, and show when the correct frequencies can be detected (Eqs. 28 and 29). If these detected frequencies are correct, all other model parameters are also correct.
Anyone can test our DCM code, but just like any any other period finding method code, it has its statistical limitations. Since there will always be challenging problems with real data, we also code the DCM alternative for analysing simulated data 7 similar to the users' own real data.
We have now formulated, tested and coded DCM. However, we leave the tedious comparison between DCM and DFT to our next study.
We thank Dr. Karri Muinonen for his comments about numerical nonlinear least squares iteration routines. We also thank Dr. Thomas Hackman for his comments.

Control file variables
In this section, we use the same numbering of the control file dcm.dat variables as in this file itself. The users can easily find the description of each variable, because their numbers below are highlighted with yellow background. This same numbering is also used in the figure and table reproduction information of Table A1. 1 Tag is text written to the beginning of the names of all output figures and files. This parameter Tag allows the user to store the results of each particular dcm.dat analysis into figures and files having the chosen specific names. For our chosen Tag = Dec2019 in dcm.dat, those output figures are Dec2019z.eps (Fig. 2) Dec2019gsim.eps (Only if RealData = 1 and SimMany = 1, or SimMany = 1) Dec2019gdet.eps (Fig. 3) Dec2019fA.eps (Fig. 4) Dec2019Many (Fig. 11: Only if SimMany = 1) The output files are Dec2019Params.dat (Analysis results: parts of Table 2) Dec2019Residuals.dat (Residuals file) Dec2019Model.dat (Model file) Dec2019AllBeta.dat (Free parameter file) Dec2019ManyfA.dat (Relative frequency error file: Only if SimMany = 1) The contents of these output files are described later in this appendix.
2 RealData is used to select the analysed data. Its value is relevant only in Modes 1 and 2 when SimMany = 1.
RealData=1 activates the Mode 1 of program dcm.py, where it analyses the real data given in file file1. RealData = 1 activates the Mode 2 of program dcm.py, where it creates simulated data, stores these data to a file and analyses these data. It also creates a figure of the simulation model and the simulated data.
For the Tag = Dec2019 in our dcm.dat, the name of the input data figure is Dec2019gsim.eps. The simulated and analysed data file is Dec2019SimulatedData.dat. The output figure is Dec2019gdet.eps. The user can test many problems encountered with real data by simulating data having the same time span ∆T =SimDT, sample size n =SimN and signal to noise ratio SN =SimSN as the real data. For SimT = 1, the time points for the simulated data are the same as for the real data in file1. With our Tag=Dec2019, the comparison between simulated Dec2019gsim.eps and detected Dec2019gdet.eps figures reveals directly, if DCM succeeds.
3 file1 is the name of the file containing the real data analysed in Mode 1 (RealData = 1 and SimMany = 1).
Its format must be the same as in our Table 3. Our TestData.dat file in Zenodo database contains the numerical values of Table 3. In this paper, we analyse these Table 3 simulated data with dcm.py, and show our results in Figs. 2-10 and Table 4. All these results can be reproduced with the input variable values given in Table A1.
4 dummy is the value for input and output which contains no information. We use dummy=-99.999.
None of the analysed real or simulated observations Y= y i should have the numerical value of dummy. None of the model parameters should have the numerical value of dummy. 5 K1= K 1 = 1, 2, 3, 4, 5 or 6 signals (Eq. 1) 6 K2= K 2 = 1 or 2 signal order (Eq. 1) 7 K3= K 3 = 0, 1, 2, 3, 4, 5 or 6 order polynomial trend (Eq. 1) 8 nL= n L = number of tested frequencies in long search 9 nS= n L = number of tested frequencies in short search 10 c= c = width of short tested frequency interval (Eq. 15) 11 TestStat is used to select the test statistic z= z.

Residuals file
The i residuals of the model are stored into the residuals file, which is Dec2019Residuals.dat with our Tag = Dec2019. Columns 1-3 are T= t i , EPSILON= i = y i − g i , EY= σ i . The format is the same as in the real data file file1, i.e. the format of this residual file is immediately suitable for further dcm.py period analysis.

Model file
This file is Dec2019Model.dat with our Tag = Dec2019. Columns 1-4 are T= t i , Y= y i , EY= σ i and G= g i .

Free parameter file
This free parameter file is Dec2019AllBeta.dat with our Tag = Dec2019. It contains 39 columns. The first two columns are TEPOCH= t 1 and DELTAT= ∆T . For the free parameters β i , our dcm.py program uses the index values i given in Table A3. All columns having dummy values contain no β i value. If these dummy columns are removed, the remaining i columns are the β i free parameters specified in Table A3. If necessary, the user can derive all functions of the g(t) model from these parameters t 1 , ∆T and β i by using Eqs. 1 -5.
In Mode 1 (SimMany = 1, RealData = 1), the first line contains the parameters t 1 , ∆T and β i for the original data sample in file1. Other lines contain these parameters for the bootstrap data samples.
In Mode 2 (SimMany = 1, RealData = 1), the first line contains the t 1 , ∆T and β i parameters for the original simulated data sample. Other lines contain these parameters for the bootstrap data samples.

Relative frequency error file
This file is produced only when SimMany = 1 (Mode 3). With our Tag = Dec2019, its name is Dec2019ManyfA.dat. Its contents are also printed in the screen when PrintScreen = 1. This file gives the relative frequency errors (Eq. 30). The results are given separately for all models, as well as for those models that do not fulfill criteria of Eqs. 28 and 29, like models not highlighted in Fig. 11.
We show dcm.dat and Dec2019Params.dat files in the end of this appendix, but not Dec2019Residuals.dat, Dec2019Model.dat, Dec2019AllBeta.dat and Dec2019ManyfA.dat files, because they can be created with the python dcm.py command.

Reproducing our results
Here, we explain how the users can reproduce our main results, and at the same time practice the use of our CDM program dcm.py. After reproducing our results, the users can also be more confident about results for their own data.
Col. 1 of Table A1 gives the input parameter values in dcm.dat that reproduce the results in our Figs. 2-4. File dcm.dat from Zenodo database contains the same input values as Col. 1 of Table A1. Table A1 shows how the results in our Table 4 can be obtained by adjusting the K1= K 1 , K2= K 2 and K3= K 3 values in dcm.dat. Those particular adjusted values are denoted with * in Table A1.
Col. 6 shows how Table 5 can be reproduced by adjusting SimN and SimSN values marked with "*".
The users' results for Figs. 2, 3, 5, 6, 8 and 9 should be identical. However, the results for Figs. 4, 7 and 10, as well as the error estimates in Table 2, will never be identical, because the random bootstrap residuals i are always different in Eq. 19. The results in Fig. 11 and Table 5 will also always differ, because the created random data samples y i of Eq. 27 are never the same.

Fisher test
In this section, we explain how our program fisher.py computes the F χ and Q F estimates in Table 4. Executing python fisher.py asks for the numerical input values for n= n, p1= p 1 , p2= p 2 , Chi1= χ 1 or R1= R 1 , and Chi2= χ 2 or R1= R 2 . If the pair χ 1 and χ 2 is used, the F= F χ value is computed from Eq. 12. For the R 1 and R 2 pairs, the F= F R value is computed from Eq. 13. The program prints the F value and the Q= Q F value of Eq. 14. The results for comparing models 19 and 20 are shown in Fig. A1.

Least squares fit subroutine
The numerical python least squares subroutine optimize.leastsq minimizes the sum of squares For the 32 nested models of Table 4, the χ 2 estimates for the linear models of denser tested frequency grids agree with the results for the non-linear model (Eq. 18). In the python optimize.leastsq subroutine, the parameter ftol measures the relative error in the above sum of squares. The parameter xtol measures the relative error in the desired approximate solution. We use ftol=0.0001 and xtol=0.0001 in optimize.leastsq of our non-linear least squares fit NonLinearLSF, because this should prevent the numerical non-linear iterationβ final of Eq. 18 from wandering too far from the unambiguous initialβ initial estimate obtained from linear modelling.
Amplitude dispersion occurs already for the unambiguous linear models, before any non-linear modelling is made. The numerical optimize.leastsq subroutine has to utilize these unrealistic high amplitude h i (t) curves, because this is the only possible way to minimize χ 2 . These high amplitude curves, which nearly cancel out each other, offer the only possible way to fit two or more curves having nearly the same frequencies. From the purely mathematical point of view, these amplitude dispersion models do not fail. They are just unrealistic. There just are no reasonable low amplitude solutions. However, we know for certain that these unrealistic models fail, because we know that model 19 is the correct solution.
The optimize.leastsq is not an analytical subroutine, because it does not require the model partial derivative formulas as its input. Here is room for development for those who are prepared to code these ∂g(t)/∂β i partial derivatives. But even that analytical solution could not eliminate amplitude dispersion, because the continuous and stable z 1 (f 1 ), ..., z K1 (f K1 ) periodograms already confirm that there simply are no realistic low amplitude h i (t) curve solutions (e.g. Figs. 5 and 8).

Qualitative program code description
We end this appendix with a short qualitative description of the stages of dcm.py.
1. First, the long tested frequency interval f min = P −1 max and f max = P −1 min is fixed. We create the n L evenly spaced tested frequencies f 1 , f 2 , ..., f K1 between f min and f max . Such grids are illustrated in Fig. 1. The j:th tested value of frequency f i is denoted with f i,ji (j i = 1, 2, ..., n L ).
2. We create the one-dimensional vectors F 1 , F 2 , ...F K1 , Z for collecting the period search results from the K 1dimensional tested frequency space. These vectors are empty before the loop of tested frequencies begins.
(b) The tested frequency combination and the result for z are appended into the collection vectors . The more accurate best frequency values are determined by using these dense grids of n S tested frequencies.

TABLE A3
Programming indexes i for free parameters β i . These indexes are given separately forβ I andβ II . Columns "L" and "N" denote Linear and Non-linear models for given K 1 and K 2 combinations.