The X-ray Variability of AGN and its Implications for Observations of Galaxy Clusters

The detection of new clusters of galaxies or the study of known clusters of galaxies in X-rays can be complicated by the presence of X-ray point sources, the majority of which will be active galactic nuclei (AGN). This can be addressed by combining observations from a high angular resolution observatory (such as Chandra) with deeper data from a more sensitive observatory that may not be able to resolve the AGN (like XMM). However, this approach is undermined if the AGN varies in flux between the epochs of the observations. To address this we measure the characteristic X-ray variability of serendipitously detected AGN in 70 pairs of Chandra observations, separated by intervals of between one month and thirteen years. After quality cuts, the full sample consists of 1511 sources, although the main analysis uses a subset of 416 sources selected on the geometric mean of their flux in the pairs of observations, which eliminates selection biases. We find a fractional variability that increases with increasing interval between observations, from about 0.25 for observations separated by tens of days up to about 0.45 for observations separated by $\sim 10$ years. As a rule of thumb, given the precise X-ray flux of a typical AGN at one epoch, its flux at a second epoch some years earlier or later can be predicted with a precision of about $60\%$ due to its variability (ignoring any statistical noise). This is larger than the characteristic variability of the population by a factor of $\sqrt{2}$ due to the uncertainty on the mean flux of the AGN due to a single prior measurement. The precision can thus be improved with multiple prior flux measurements (reducing the $\sqrt{2}$ factor), or by reducing the interval between observations to reduce the characteristic variability.


INTRODUCTION
Active Galactic Nuclei (AGN) are among the brightest and most abundant X-ray sources on the sky. While they are extremely valuable sources to study in their own right, they can be a nuisance when their emission is projected onto another object of interest. This is not uncommon in X-ray observations of clusters of galaxies where the emission from AGN in, or projected onto, the cluster can be significant relative to the emission from the cluster. If unresolved, the emission from such AGN can bias X-ray measurements of the properties of the intra-cluster medium (e.g. Hilton et al. 2010), or bias the detection of clusters in X-ray surveys, boosting their detection probability or leading to them being missed altogether (Giles et al. 2012;Somboonpanyakul et al. 2018).
When high angular resolution X-ray imaging with Chandra is available, any such contaminating AGN can be resolved and excised efficiently. However, in many applications such as deep observations of distant clusters (where the greater effective area of XMM-Newton is needed), surveys (where the greater grasp -the product of effective area and field of view -of XMM-Newton or eROSITA is needed), or observations of cluster outskirts (where Suzaku's lower and more stable background is needed) Chandra is not the optimal primary instrument. ben.maughan@bristol.ac.uk In such cases it is possible to use Chandra observations of the field to detect and characterise AGN so that they may be masked, subtracted or modelled in other data (e.g. Hilton et al. 2010;Thölken et al. 2016).
The problem with this approach is that the vast majority of AGN show significant variability in their X-ray flux on timescales from days to years (Paolillo et al. 2004). This introduces an extra source of uncertainty due to the variability of an AGN between the epoch of the Chandra observation in which it was characterised and the epoch of the observation in which it must be modelled.
Early work such as ; Nandra et al. (1997); Almaini et al. (2000) investigated the variability of AGN on timescales of days and weeks finding variability between about 10% and 40%. More recent work has extended the baseline of observations to probe variability on timescales up to 20 years (Paolillo et al. 2004;Mateos et al. 2007;Vagnetti et al. 2011Vagnetti et al. , 2016Middei et al. 2017), finding variability increasing on longer rest-frame timescales, up to about 50% on timescales of order 10 years (Middei et al. 2017).
Where most previous investigations of AGN variability on long timescales have been motivated by the study of the AGN themselves, our motivation and hence our approach is quite different. In particular, we put ourselves in the position of an observer who is faced with a contaminating AGN and we set out to answer the simple We do this by assembling a sample of AGN that have each been observed by Chandra on two occasions, and then model the pairs of fluxes to constrain the characteristic variability of the population. Our method improves on previous analyses by employing a sample selection that includes non-detections and avoids biasing the inferred variability, and by using the Poisson statistics of the observed counts to allow for the inclusion of upper limits in the analysis.

SAMPLE CONSTRUCTION AND DATA REDUCTION
In this section we describe the sample definition, data reduction and analysis used to measure the photon counts in source and background regions, and hence fluxes for the AGN in our sample. Our aim was to find pairs of overlapping Chandra observations in order to determine the fluxes of serendipitously observed AGN at two different epochs. To do this, we considered all public Chandra observations available as of 4th July 2016. We then selected only ACIS-I observations, and considered pointings whose aim points matched within 5 to ensure a reasonable overlap in area. In order to minimise the occurrence of non-AGN point sources in the data, we excluded pointings within ±20 • of the Galactic plane and observations of Galactic targets or nearby galaxies. We then required observations to have exposures of at least 20 ks to ensure a reasonable depth. Finally, we defined pairs of matching observations whose observation time was separated by at least 25 days. This was chosen so that the separation is at least a factor of 10 larger than the duration of any of the individual observations.
Where there are a large numbers of observations of a given field, we selected the longest observations in 25 day blocks and then paired each with the next available observation separated by at least 25 days, and then repeated until none are left. All pairs are independent (no observations belong to more than one pair), and if the same AGN is detected in multiple pairs of observations, only the pair with the longest interval between observations is used for that AGN.
This process led to a final dataset of 70 pairs of observations, which are summarised in Table 1. Fig. 1 shows the time intervals between observations used to construct the sample. The shortest interval was 27 days and the longest was 4743 days.
As our aim is to determine variability measurements that can be used to model the flux of an AGN about which nothing other than its flux and date of observation is known, we make no attempt to cross match our sample with known AGN. This means that we do not assume knowledge of the redshift of any sources; we work in terms of the observed flux and unless noted otherwise, all timescales are in the observer's frame. This also means that, strictly speaking, we should refer to the sources we analyse as "X-ray point sources", but given our selection of fields, the point sources will be dominated by AGN, and we refer to them as AGN throughout.
2.1. Chandra data analysis The Chandra observations were reduced and analysed using CIAO version 4.8.2 and CALDB version 4.7.2. The data were reduced following the standard procedures using the chandra_repro script, and the deflare tool to remove periods of high background.
Next, for each pair of observations the astrometry of the observations was corrected to ensure they matched closely 1 . This is necessary, as in many cases a source will be detected in one observation but not in the other, requiring forced photometry at the source coordinates. If the two observations had a small offset in astrometry then the aperture would be offset from the source position in the observation in which the source was not detected. This would artificially reduce the inferred source flux, biasing the apparent variability to be higher than the true variability.
To perform the astrometric correction, the longer of the two observations was defined as observation 1, and the shorter as observation 2. Sources were detected in each observation in the 0.5 − 7 keV energy band using the CIAO wavdetect tool, and those detected with at least 7σ significance were used to register the images. Observation 2 was corrected to match the astrometry of observation 1. In all but three pairs of observations, at least 10 sources were available to register the images, with at least 6 sources used in the other three pairs. The size of the astrometric correction was typically small. The median correction was 0.3 , and was less than 1 in all but three cases (the correction was smaller than 1.5 in those cases).
Source detection was then repeated on the corrected observations, and the source lists for a pair of observations were compared. Sources whose positions matched within 1 between the two observations were considered to be detections of the same source, and we refer to such a source as a "detected pair" (we demonstrate later that using a more conservative matching radius of 0.5 has no impact on our results). Sources detected in one observation which had no matching source within 10 in the other observation were considered to be undetected in the second observation, and such a source is referred to as a "detected/undetected pair". Detected sources with a position match between 1 and 10 are excluded as likely spurious matches.
Next, aperture photometry was performed using the CIAO srcflux tool in the 0.5−2 keV band. The apertures used for the source regions were circles with a radius enclosing 90% of the PSF at 1.25 keV. For the background region, an annulus with an inner radius equal to the source region and an outer radius five times larger was used 2 . In the case of detected pairs, photometry was performed at the source position determined in each observation. In the case of detected/undetected pairs, forced photometry was performed in the observation without a detection at the coordinates of the detected source. Our requirement that detected/undetected pairs have no other sources within 10 ensures that this forced photometry does not include any contribution from a slightly offset detection of the same source or other nearby sources. Fluxes were calculated from the inferred count rates assuming a power law spectral model with a photon index of 1.7, and were corrected for Galactic absorption (Dickey & Lockman 1990). This analysis provided us with robust flux measurements for all sources, determined using a Bayesian method to marginalise over the background uncertainty for each source and takes into account crosstalk between the source and background regions 3 . We use the mode of the posterior probability distribution of the flux as our estimate of the source flux, and in the case where the mode was zero we define the 1σ upper limit on the flux as the value below which 68% of the probability density is contained. Note that in our analysis, these fluxes are only used for selecting subsets of sources; our likelihood calculations make use of the raw measurements of the source and background count rates, areas and exposures for each source.
At this stage we performed some additional filtering of the source list. Sources falling more than 6 off-axis in either observation were rejected to avoid any systematics due to the increasing PSF (the 90% encircled energy fraction of the PSF is < 5 within this off-axis angle for the energies considered). This also eliminates cases where a source might be out of the field of view in one of the two observations. Sources flagged as near chip gaps by srcflux were excluded. We also rejected 51 sources flagged as extended by wavdetect. Finally, if a source was detected in multiple pairs of observations of the same field, only the pair with the longest interval between observations was retained, leaving a sample of 1511 unique sources, of which 767 were detected/undetected pairs.
While we use only a subset of these 1511 sources for our variability analysis, the observed properties of all sources are given in Table 2 (the table displays the first 60 sources and full table is available in the electronic version of the paper). Sources were given a unique identifier of the form O 1 _O 2 _i where O 1 and O 2 are the Chandra observation identifiers for observation 1 and 2 respectively (where the longest observation is defined to be observation 1), and i is an integer indicating the source number in each pair of observations.
The fluxes of the AGN in the two observations (F 1 and F 2 respectively) are plotted in Fig. 2. The sources with upper limits are shown in the centre and right panels. There are more upper limits for F 2 because observation 1 was defined to be the longer of the two. There are fewer upper limits (205 in total) than the 767 detected/undetected pairs as non-zero fluxes are measured by forced photometry in many of those cases.
The fluxes scatter about the line of equality as expected, with the scatter increasing to lower fluxes due to the increased statistical scatter. The variability in the sources is manifested in the intrinsic scatter about this line of equality. Measuring this variability relies on the correct modelling of the statistical uncertainty in the fluxes (which is non-Gaussian as most sources are in the Poisson regime), and the inclusion of upper limits. As described in the next section, this is achieved by expressing the flux variability in terms of the observed photon counts, rather than using the inferred fluxes directly.

THE AGN VARIABILITY MODEL
In this section we will construct a likelihood function relating the characteristic variability of the AGN in a sample to the observed source and background counts. This combines the likelihood associated with the photometric measurements for a pair of fluxes, and the likelihood of a pair of fluxes given the characteristic variability. Expressed in this way, the likelihood function properly accounts for the Poisson nature of the observed counts, which naturally avoids the need to model upper limits on any inferred fluxes.

Photometric measurements
For a given AGN, the key observed quantities are the photon counts in the source and background apertures. We denote these as T 1 , T 2 in the source aperture and B 1 , B 2 in the background aperture. We wish to calculate the likelihood of observing these quantities given a set of model parameters.
We use the terms "source intensity" s and "background intensity" b to refer to the mean number of counts expected in a particular aperture (i.e. the mean of the Poisson distribution from which the observed counts are drawn). These are related to the corresponding fluxes, F and F b respectively, by an energy conversion factor κ, which accounts for the instrument response, source or background spectral shape, and exposure length. This conversion factor is defined such that F = κs, and κ is known for each observation of each AGN.
For a particular observation of a given AGN, the observed counts in the background aperture B is a Poisson realisation of the background intensity b. We can thus write the stochastic relation between the observed counts and background intensity as denoting that B is distributed with a Poisson probability density with mean b.
For the same AGN, the total counts T in the source aperture contain a contribution from the source and background intensities. Thus where r accounts for the difference in detector area A and effective area E between the source and background apertures: In principle, one can also account for the fact that the PSF scatters some of the source photons into the background aperture. However, given that all apertures are defined in the same way to contain 90% of the source flux, and that we are interested in variability rather than absolute flux values, this effect can be neglected (we verified that including this effect made no significant change to our results).
The background intensity can then be marginalised over to give the joint likelihood of T and B for a given AGN: (4) For a pair of observations of the same AGN at different epochs, the measurements are independent and expressing the intensities in terms of fluxes (using F = κs), the joint likelihood is One of the important features of our model is the inclusion of the Poisson likelihood to model the observed counts directly, rather than using derived fluxes and assuming Gaussian statistics. For the main sample of 416 AGN we define in §3.4, the median of the minimum net counts recorded for each AGN in its two observations is ≈ 22 (i.e. for half of the sample, the source has < ∼ 22 net counts in at least one of the two observations). Assuming Gaussian statistics would therefore be a poor approximation for a large fraction of the AGN.

Characteristic variability
The fundamental aim of this work is to determine the characteristic amount by which AGN vary in X-ray brightness between observations separated by months to years. Motivated by this, we model an AGN as having some mean long-term flux F , with some variability σ. We then assume that fluxes averaged over typical observation lengths (10s of ks) measured at epochs separated by months to years are sampled from a lognormal distribution centred on log(F ) with a standard deviation σ. Working in natural log space, σ then represents the characteristic fractional variability of the AGN on the timescales probed. We further assume that the variability of all AGN can be described by similar lognormal distributions, each with a different mean flux, but all sharing the same fractional variability σ. Our aim is then to determine the value of σ, the characteristic fractional Xray variability of AGN between observation epochs. This assumption of log-normality is similar to many previous studies of variability in ensembles of AGN, which have measured the average fractional variability (e.g. Almaini et al. 2000;Mateos et al. 2007) In this model, the probability distribution for a pair of fluxes F 1 , F 2 is where dlnorm is the lognormal probability density function, and σ is the quantity in which we are interested, describing the fractional variability of the AGN. It is known a-priori that AGN fluxes are not uniformly distributed, but instead follow a distribution (generically referred to as a log(N )−log(S) distribution) which can be approximated as a power-law or broken power-law (e.g. Hasinger et al. 1998;Mateos et al. 2008;Lehmer et al. 2012). We therefore write the prior probability density on F as i.e. the source flux (F ) is distributed with a power-law probability density with a negative slope β, and C is a normalisation factor computed to normalise the density to unity over the range of flux considered. The slope of the log(N )−log(S) distribution β is a nuisance parameter in our model, but as we will see later, with a suitable sample definition our constraints on the variability are insensitive to this parameter. The mean flux can then be marginalised over, and combining Equations Eq. 6 and 7 we then obtain  TABLE 2 Properties of the detected sources. ID is a unique identifier of each source of the form O1_O2_I where O1 and O2 are the Chandra observation identifiers for observation 1 and 2 respectively, and I is an integer indicating the source number in each pair of observations. Observation 1 is defined to be the longer of the two observations. For the other columns, subscripts 1 and 2 indicate whether the quantity is measured for observation 1 or 2, and subscripts s and b indicate if the quantity is measured for the source or background region, respectively. "Det" is a binary flag indicating whether a source was detected in a given observation ( upper limits. The "Separation" column gives the angular separation between detections of the same source in the two observations, and ∆t gives the separation in days between the two observations. All photometric quanities are measured in the 0.5 − 2 keV band in the observer's frame. This is a truncated version of the table; the full data for all 1511 sources are presented in the online version of the paper.

The final likelihood
The likelihood of the counts observed in a pair of observations of an AGN, given σ and β can now be written by combining equations 5 and 8 and marginalising out F 1 and F 2 to give P (T 1 , B 1 , T 2 , B 2 |σ, β, κ 1 , r 1 , κ 2 , r 2 ) = P (T 1 , B 1 , T 2 , B 2 |F 1 , κ 1 , r 1 , F 2 , κ 2 , r 2 )× P (F 1 , F 2 |σ, β) dF 1 dF 2 (9) The final likelihood for a sample of N AGN is the product of the individual probabilities: In principle one could multiply this likelihood by priors on σ and β and treat it as a posterior distribution to be sampled with standard Bayesian techniques. However, evaluating this likelihood function for a sample size of a few hundred AGN is computationally expensive due primarily to the Poisson probability evaluations inside nested integrals. This can be mitigated by splitting the sources over multiple CPU cores to evaluate their likelihoods in parallel and combining these for final likelihood. Even so, for a typical fit of ∼ 300 AGN spread across 30 CPU cores (using more cores leads to diminishing returns due to overheads), each likelihood evaluation took around 20 s of wall time. For this reason we adopted a simple maximum-likelihood analysis, and given the model has just two parameters (or one when β is fixed), straightforward grid searches were sufficient to map the likelihood distribution.
Parameter uncertainties were estimated using the likelihood ratio method (e.g. Cash 1979), noting that 2 log L is χ 2 distributed with a number of degrees of freedom equal to the number of model parameters. Thus for a two parameter fit the (1σ, 2σ, 3σ) confidence intervals enclose the parameter values for which 2 log L 0 − 2 log L > (2.3, 6.0, 11.6), where L 0 is the maximum value of the likelihood. For a single parameter fit, the corresponding levels are 2 log L 0 − 2 log L > (0.98, 3.8, 8.8).

Subsample definition and selection biases
The selection of the sample used for this type of ensemble variability analysis can easily result in biases that will increase or decrease the apparent value of σ. With no additional selection applied, our sample of AGN is defined by the requirement that each AGN be detected by wavdetect in at least one of its two observations. In principle this selection function could be modelled with simulations but it is possible instead to define a subsample for which the selection results in an unbiased estimate of σ.
As an illustration, consider samples defined using a limit in the observed flux F lim . One could define a sample of the AGN for which (F 1 > F lim ) OR (F 2 > F lim ). This is illustrated in the left panel of Fig. 3 and as is apparent, this selection results in an overestimate of σ due to the exclusion of AGN with small flux differences near F lim . An alternative selection of (F 1 > F lim ) AND (F 2 > F lim ) is illustrated in the central panel of Fig. 3. This is illustrative of a sample where the source is required to be detected in all observations (for example in a source catalogue). In this case σ will be underestimated due to the exclusion of AGN with large flux differences close to F lim . In both cases, the slope β of the population distribution influences the amount of bias on σ since increasing β increases the density of AGN close to F lim , whence the bias originates.
The selection bias and dependence on β can be avoided entirely by defining a sample with a flux selection that is orthogonal to the line of equality of the two fluxes. Since σ is measured in log space, this selection must also be made in log space, and corresponds to √ F 1 F 2 > F lim (i.e. the geometric mean of the two fluxes is greater than F lim ). This is illustrated in the right panel of Fig 3, and all of the samples used in our analysis are defined in this way.
Even using this geometric mean flux limit, one possible source of bias remains. If F lim were set too low, then the sample would begin to include false positive sources. Including these sources would bias σ high, since they would be associated with a background noise peak in one image and a random background value in the other image. To minimise this, we set the flux limit to F lim = 2.5 × 10 −15 erg s −1 cm −2 . This limits the sample to 416 sources of which 408 were detected with a significance > 3σ in at least one observation (the remaining 8 were all detected at > 2.4σ in at least one observation), leading to a highly pure sample.
When computing the geometric mean fluxes for subsample selections, we used the fluxes measured with srcflux. The mode of the flux posterior was used for most sources, but in the case of upper limits (when the mode is zero), the value of the flux upper limit was used instead. However, no such sources with a flux upper limit exceeded the flux limits used to define the subsamples for our analysis, so none were ultimately included in the variability measurements.

RESULTS
We obtained our main results by considering a subsample of AGN for which the two observations were separated by at least a year, to reduce sensitivity to shortterm variability. This subsample contained 222 AGN, of which 11 were in detected/undetected pairs. We refer to this as the primary sample.
The likelihood in Eq. (9) was computed over the parameter space of σ and β for the primary sample, and the resulting constraints are shown in Fig. 4. The best fitting values were σ = 0.43 ± 0.04 and β = 1.24 ± 0.02.
The constraints on the two parameters are essentially independent with no degeneracy. For this reason, we fix β = 1.24 for all subsequent fits in order to speed up the fitting process. (We will show in §5.1 that this has no impact on the results.) With β fixed, the constraint on σ for one free parameter was σ = 0.43 ± 0.03. The results obtained for this and the subsequent subsamples are summarised in Table 3.
In order to test the effect of the matching radius on the measured variability, we restricted the sample to the 165 AGN in the primary sample with a matching radius of < 0.5 . For this subset, the best fitting variability was σ = 0.42 ± 0.03, fully consistent with the  In all cases, the minimum interval between observations was one year. The first column the allowed range in separation between the sources in the two observations. The second column is the threshold in exposure time that must be exceeded by at least one of the observations. The third column gives the allowed range of the geometric mean flux of the two observations. The fourth column gives the number of AGN in the resulting subsample and the final column gives the constraints on σ. For all fits, β was fixed at a value of 1.24. previous measurement. Meanwhile, for the 65 AGN with matching radii between 0.5 and 1 , the variability was σ = 0.45±0.05. We thus conclude that using a 1 match-ing radius has no impact on our results.

Separation Exposure
Finally, we also investigated if there was evidence for σ being different for lower flux AGN. To do this we selected AGN for which at least one of the two observations had an exposure of at least 85 ks (chosen to give a reasonable number of deep pointings). We then selected sources with geometric mean fluxes of between 1 × 10 −15 erg s −1 cm −2 and 2.5 × 10 −15 to give a low-flux sample of 64 AGN (all but one detected at 3σ) that had no overlap with the primary sample. For this low-flux subsample, the best fitting variability was σ = 0.53 ± 0.07. This is not a very significant difference from the value of σ = 0.43±0.03 measured for the higher flux sources, but could be indicative of a larger variability at lower fluxes.

Variability on different timescales
Next, we measured the variability as a function of the (observer's frame) time difference ∆t between observational epochs. For this we used the same flux selection as the primary sample, but defined five subsamples of AGN with ranges in ∆t chosen to give approximately equal numbers of AGN in each subsample. The subsamples were then modelled as before to determine the characteristic flux variability on those different timescales. The resulting constraints on σ are shown in Table 4 and plotted in Fig. 5. Also shown in Fig. 5 is the trend of variability with rest frame time interval inferred from the structure function analysis of a large sample of AGN  in Vagnetti et al. (2016). This comparison is discussed in §5.3.

Verification of methodology
Our AGN variability model was tested on simulated data in order to verify that the variability could be recovered accurately. A synthetic dataset was generated by sampling a large number of fluxes from a reference log(N ) − log(S) distribution, with each representing the mean flux of a different synthetic source. The variability of each synthetic source was then modelled as a lognormal distribution with the mean flux for that source and with a constant value of σ common to all sources. For each synthetic source, a pair of fluxes were then sampled from its lognormal distribution to represent two observations of that flux. For each pair of fluxes a random real source was chosen from the list of all sources detected in our Chandra data (i.e. prior to any filtering), and the pair of synthetic fluxes were assigned the source and background areas, exposures, effective areas and background counts of the two observations of the real source.
Finally, these properties were used to compute the total intensity in the source aperture for each source, and the total and background counts for each source were then sampled from Poisson distributions with the appropriate rates. This method produced a large number of synthetic observations of a realistic population of AGN with observational characteristics the represent the range of data quality in the real data.
Following this method, mock samples could be generated with different characteristics and different selection functions applied to test our model. In all cases, when the geometric mean flux selection was used, the input variability was recovered accurately. For example, we generated a population of AGN following the 0.5−2 keV, broken power-law log(N )−log(S) distribution of Lehmer et al. (2012), with slopes β 1 = 1.49 and β 2 = 2.48 at fluxes below and above 6 × 10 −15 erg s −1 cm −2 respectively. The geometric mean flux limit of √ F 1 F 2 > 2.5 × 10 −15 erg s −1 cm −2 was then applied to this synthetic sample and a random subset of 1000 pairs of synthetic observations was selected. We then modelled this using the methodology used for our main results, with β fixed at 1.24, and fitting only for σ. For an input variability of σ = 0.5 our method recovered σ = 0.50±0.02. This demonstrates that our method is robust, and due to the geometric mean selection, is insensitive to the modelling of the log(N ) − log(S) distribution.
In the cases where the OR or AND sample selections discussed above were applied to the synthetic data, the recovered σ was biased by ≈ 15% high and low respectively.

Evaluation of our model
In the previous section we demonstrated that our model can accurately recover the true variability of realistic synthetic data. In this section we assess how well our model describes the observed data. To do this, we computed the ratio of the observed fluxes for each source in the primary sample (F 1 /F 2 ). These are plotted in Fig.  6, and form the basis of a quantitative comparison with our model. According to our model, the two observations of an AGN have fluxes F 1 , F 2 that are drawn from a lognormal distribution with a given σ (σ = 0.43 in the case of the primary sample). If the mean, log(F ), of the distribution were known, then the ratio of F 1 /F or F 2 /F would be lognormally distributed with a mean of one and standard deviation σ. However, the ratio of pairs of independent measurements F 2 /F 1 will be lognormally distributed with a mean of one but with a standard deviation of √ 2σ. As expanded upon in the next section, the √ 2 factor arises as F 1 and F 2 are both independent samples from the flux distribution. This model is plotted in Fig. 6, and appears to be a very good description of the data.
However, with this analytic form we neglect the photon counting noise on the individual flux measures, which would broaden out the distribution of flux ratios compared to the lognormal model. To model this effect we generated 10 6 synthetic pairs of observations of AGN following the method of §5.1, but using the best fitting model to our primary subsample (i.e. σ = 0.43, and a single power-law log(N ) − log(S) with β = 1.24).  Gehrels (1986). The solid black line shows the distribution of flux ratios derived from a synthetic sample of sources generated from our best fitting model. The dashed red line and shaded 1σ error envelope show the lognormal distribution with the best fitting variability of σ = 0.43 ± 0.03 multiplied by a factor of √ 2. As discussed in the text, this factor accounts for the uncertainty on the mean of a pair of observations. The small difference between the two curves is due to the inclusion of statistical noise and sample selection effects in the curve derived from the synthetic sample.
The geometric mean flux limit of √ F 1 F 2 > 2.5 × 10 −15 erg s −1 cm −2 was then applied to this synthetic data to match the primary sample selection, and the ratios of F 1 /F 2 were computed for each AGN.
The distribution of flux ratios predicted by the best fitting model is plotted in Fig. 6. As expected, this is slightly broadened compared with the intrinsic lognormal distribution. Visually there is a good agreement with the observed flux ratios in this binned representation. The unbinned distribution of observed flux ratios was compared with that of the synthetic sample from the best fitting model using a two-sided Kolmogorov-Smirnov test. This gave a p value of 0.74, meaning that the data cannot rule out the null hypothesis that both the observed and synthetic samples are drawn from the same parent distribution. We thus conclude that our model is a good description of the data.

Comparison with other work
Our analysis is quite distinct from previous work on the long term variability of AGN, in that it is not focussed at understanding the intrinsic properties of the AGN, but instead on equipping the observer with the knowledge needed to mitigate their impact on other targets of interest. As such we work exclusively in the observer's frame rather than the rest frame of the source, and do not separate out AGN into samples by redshift or luminosity.
Our analysis method has some advantages over other work. Firstly, we define a sample selection that avoids the biases introduced when applying an independent selection threshold to each observation. Secondly, we use the Poisson likelihood of the observed counts, which removes the approximation of Gaussian statistics and naturally includes non-detections and upper limits.
In the literature, two main approaches have been used to measure the characteristic variability of samples of AGN: the normalised excess variance (NXS) and the structure function (SF). Both quantities are defined precisely in Vagnetti et al. (2016) but they can be described as follows. The NXS is the variance of the flux distribution of a particular source once measurement errors have been subtracted, normalised to the mean flux of that source. The SF is the root mean square of the difference in log flux between pairs of observations, where the average is taken over sources with approximately the same time interval between observations, and the average statistical noise is subtracted. Almaini et al. (2000) used a variation on the NXS method to measure the variability in 86 quasars on timescales of up to 14 days. Their method is not directly comparable as they utilised 16-26 flux measurements per source, spread over the 14 day period, but the average variability of ≈ 20% that they found is similar to the value we find for the shortest time intervals we sampled. Mateos et al. (2007) used the same approach as Almaini et al. (2000) to measure variability in AGN observed in 16 observations of the Lockman Hole field with XMM-Newton spread over 2 years. The average fractional variability for that sample was 0.22 ± 0.01, which is smaller than the values of ≈ 0.4 that we find on timescales longer than about a year. However, once again a direct comparison is difficult as the Mateos et al. (2007) measurement comes from multiple fluxes spread over 2 years while ours comes from flux pairs separated by a given interval. The mean interval between observations for the Mateos et al. (2007) value was about 40 days, so a better comparison may be with the ≈ 25% variability we find on that timescale. Vagnetti et al. (2016) investigated the variability in a sample of 2700 AGN observed multiple times with XMM-Newton, and with known redshifts using both the NXS and SF methods. Their SF measurements are quite comparable with our measurement of characteristic variability as they are determined from pairs of fluxes separated by ∆t, and we thus convert their SF to a fractional variability using their equation 6.
The resulting trend in fractional variability with observation interval for the whole Vagnetti et al. (2016) sample is shown in Fig. 5, and the agreement with our measurements is very good. This is despite some significant differences between these works. In particular, the sample definition of Vagnetti et al. (2016) requires the sources to be detected in all observations, which should lead to an underestimate of the true variability, while our analysis uses the observation interval in the observer's frame. The latter effect would blur out pairs of observations with the same rest frame interval into a range of observer's frame intervals, flattening the slope of any trend between σ and ∆t. The good agreement between the two sets of results suggests that neither of these effects are large.
Overall, we conclude that while previous measurements using the NXS method are not directly comparable, the agreement with our results seems reasonable.
The closest comparison is with the SF measurements of Vagnetti et al. (2016), where the agreement is very good.

Forecasting fluxes
We are now in a position to answer the question posed in this paper. Given a measurement of the X-ray flux F 1 of an AGN at one epoch, and with no additional knowledge of its properties, what is our best estimate of the flux F 2 at a second epoch some years earlier or later? Under these circumstances, neglecting Eddington bias (see below), the flux is equally likely to be higher or lower in the second observation, so our best estimate must be F 2 = F 1 , but the uncertainty on this depends on the variability of the source. (Of course the uncertainty also depends on the statistical noise on each observation, but we disregard that here to focus on the irreducible uncertainty due to variability.) Consider an AGN that has a lognormal variability of its flux about a mean flux log(F ), with standard deviation σ. In this case, given N observations of the source with fluxes F i , we estimate log(F ) as the mean of the log(F i ) with a precision of σ/ √ N . In other words, the probability density for F is Now, to predict the flux F of this source at the epoch of another observation, we have to marginalise out the unknown mean F for which we know the posterior from the previous N measurements of the flux.
where both of the probabilities on the right hand side are lognormal. This results in a posterior for F that is also lognormal, with standard deviation Thus, for a source with one previous flux measurement F 1 we predict the flux at a second epoch to be F 2 = F 1 with an uncertainty on log(F 2 ) of σ = √ 2σ, as above. For example, based on the σ = 0.43 we found for the primary sample, for observations separated by about 1 − 10 years the uncertainty on a flux prediction based on a single previous measurement is 0.43 × √ 2 or approximately 60%.
In the limit of a large number of previous flux measurements, the uncertainty on the log of the flux at a new epoch tends to σ. In principle this sets the average population variability, σ as the limiting precision of any flux forecast. However, with many flux measurements of the same source, the variability of that source could be constrained, resulting in a more accurate prediction of its flux at another epoch (with a precision limited by the variability of that particular source). As illustrated in 5, further improvements can be gained by scheduling observations with the shortest possible intervals between them to reduce the overall variability.
Two further effects should also be considered when making flux predictions. Firstly, if an AGN is discovered in an observation at some epoch, there is an Eddington bias effect present. Given the greater number of AGN at low fluxes (as described by the log(N ) − log(S) distribution), a newly discovered AGN is more likely to be a low flux AGN in a high flux state than vice-versa. We would thus expect the AGN to be more likely to have a lower flux at another epoch, regressing to the mean. This can be modelled by including the population distribution P (F ) in equation 12: (14) where P (F ) could be e.g. a single or double power law. Secondly, in order to forecast the observed value of F , it is further necessary to model the statistical noise on the observation, using knowledge of the appropriate instrumental and observational characteristics.
We investigated how well our model predicted F 2 given F 1 for our primary sample. Neglecting the effects of statistical noise and Eddington bias, our best estimate of the flux at the second epoch is where the uncertainty is a lognormal distribution. We then calculated the percentage of sources for which F 2 fell within a given central percentile of this lognormal distribution of F 1 . For example, if the model gives good predictions then we expect that about 68% of the time, the predicted F 2 values will be within the central 68th percentile of a lognormal distribution centred on F 1 (i.e. within the 1σ error on the forecast). The resulting forecasts are shown in Fig. 7, which shows that the precision on the flux forecast is good to within 5 percentage points for all confidence intervals.
Our formalism and results are applicable for the scenario in which the redshift of the AGN is unknown. If the redshift and hence luminosity is known, then the accuracy with which the flux at a second epoch can be predicted is significantly improved. This is because the variability in X-ray luminosity of AGN is a function of luminosity, such that lower-luminosity AGN show larger fractional variability (e.g Vagnetti et al. 2016). For example, Vagnetti et al. (2016) find the fractional variability for observations separated by ∆t = 1000 days (rest frame) to range from σ ≈ 0.55 for their least luminous AGNs (10 43 erg s −1 − 10 43.5 erg s −1 in the 0.5 − 4.5 keV band) down to σ ≈ 0.25 for their most luminous AGNs (10 45 erg s −1 − 10 45.5 erg s −1 in the 0.5 − 4.5 keV band). This range in luminosity variability is naturally included in the average flux variability that we measure. However, if the redshift and hence luminosity of an AGN were known, then the results of Vagnetti et al. (2016), or similar studies, could be used to estimate a more accurate variability for that particular AGN.

Applications
This measurement of the characteristic X-ray variability of AGN has important consequences for scenarios when the flux of an AGN at one epoch must be inferred from its flux at a second. Given the uniquely high angular resolution of Chandra compared to other X-ray observatories, the most common such scenario will continue to be the use of Chandra to constrain point source contributions to observations made with other observatories.
A key example is the use of Chandra to support deep XMM-Newton observations of distant galaxy clusters, where XMM-Newton cannot resolve out the emission from projected AGN, biasing measurements of the temperature and luminosity of the cluster (e.g. Hilton et al. 2010). As demonstrated by Hilton et al. (2010), the effect of such contamination can be mitigated by jointly modelling Chandra data to constrain the AGN flux. In doing this, the uncertainty on the flux of the source due to its variability between epochs should be included. For example, if the interval between observations were a year or more, then the uncertainty on the flux prediction based on a single measurement is ≈ 60%. This could be modelled with a lognormal prior with σ = 0.6 on the flux (or normalisation of the relevant model component) of the AGN at the epoch of the XMM-Newton observation, centred on the flux measured with Chandra. This approach will also be relevant for observations of distant clusters with ATHENA, whose angular resolution will be poorer than that of Chandra. Where possible, the interval between observations should be minimised to reduce the average variability. The same approach can be used in studies of cluster outskirts, which may optimally combine Chandra measurements of the point source population with Suzaku's detection of the ICM (e.g. Thölken et al. 2016).
The presence of AGN can also bias the detection of galaxy clusters in X-ray surveys such as XXL (Pierre et al. 2016), XCS (Mehrtens et al. 2012) or the upcoming eROSITA survey (Pillepich et al. 2012;Merloni et al. 2012;Pillepich et al. 2018;Clerc et al. 2018). In all cases the relatively poor angular resolution of the survey data may lead to AGN being misclassified as clusters, or boost the detection probability of clusters by enhancing their surface brightness (clusters with projected AGN may also be misclassified as pure AGN and missed by these cluster surveys). The purity of such surveys can be estimated by short Chandra observations of a subset of clusters to detect the presence of contaminating AGN (Logan et. al., 2018, submitted). The variability between epochs should then be included when considering the contaminating flux (or upper limit thereon) determined from the Chandra data.
A further application of our constraints on X-ray variability is to inform the modelling of stray light in X-ray observations due to bright point sources outside the field of view. Stray light from such sources can produce faint, inhomogeneous structure in X-ray images that could bias studies of low surface brightness emission. A model of the stray light component could be produced to mitigate this, given an accurate model of the X-ray optics and knowledge of the fluxes and positions of sources outside the field of view. The latter could come from all-sky Xray survey data. However, in the case that the stray light signal is dominated by a few bright AGN, their variability between the epoch(s) of the survey data providing their fluxes and the observation for which the stray light must be modelled will provide an irreducible limit on the precision of the stray light modelling.

SUMMARY AND CONCLUSIONS
We have used pairs of Chandra observations of a large number of AGN to infer the characteristic X-ray variability of the population on different timescales. We developed a sample selection method that is insensitive to biases, and a likelihood model that was able to precisely recover the variability in realistic simulated data.
For our primary sample of sources observed between around 1 and 15 years apart, the variability of the population is well described by a lognormal distribution with standard deviation of σ = 0.43 ± 0.03. We find evidence, in common with other work, that the variability is smaller on shorter timescales, with σ ≈ 0.25 for separations of about one month to one year between observations.
Given a single flux measurement, the best estimate of the flux at a second epoch is the same as that at the first epoch (neglecting Eddington bias) with a (lognormal) 68% confidence interval of √ 2σ. The factor √ 2 arises due to the uncertainty on the mean flux for the source given just one previous measurement.
As a rule of thumb, given the flux of an AGN at one epoch, one can estimate its flux at a second epoch (more than a year or so earlier or later) to a precision of about 60%.
This result has applications in a wide range of scenarios where X-ray fluxes of AGN must be inferred from their values at a different epoch, and presents a significant source of irreducible uncertainty that should be taken into account. A useful next step would be to better constrain the dependence of the variability on interval between measurements, including longer time intervals then those probed here as data become available.