Propagating residual biases in masked cosmic shear power spectra

In this paper we derive a full expression for the propagation of weak lensing shape measurement biases into cosmic shear power spectra, including the effect of missing data. We show using simulations that terms higher than first order in bias parameters can be ignored, and that the impact of biases can be captured by terms dependent only on the mean of the multiplicative bias field. We identify that the B-mode power contains information on the multiplicative bias. We find that, without priors on the residual multiplicative bias $\delta m$ and the stochastic ellipticity variance $\sigma_e$, constraints on the amplitude of the cosmic shear power spectrum are completely degenerate, and that when priors are applied the constrained amplitude $A$ is biased slightly low via a classic marginalisation paradox. Using all-sky Gaussian random field simulations we find that the combination $(1+2\delta m)A$ is unbiased for a joint EE and BB power spectrum likelihood if the error and mean (precision and accuracy) of the stochastic ellipticity variance are known to better than $\sigma(\sigma_e)\leq 0.05$ and $\Delta\sigma_e\leq 0.01$, or the multiplicative bias is known to better than $\sigma(m)\leq 0.07$ and $\Delta m\leq 0.01$.


INTRODUCTION
Measurements of the weak lensing effect can be subject to biases caused by inaccuracies in the algorithmic methods used to determine a galaxy's shape (Heymans et al. 2006; Massey et al. 2007; Bridle et al. 2010; Kitching et al. 2012; Mandelbaum et al. 2015), measure the point spread function of a system (Hoekstra et al. 2017b; Kannawadi et al. 2019), determine detector effects (Antilogus et al. 2014), or detect galaxies (Hoekstra et al. 2015).
The treatment of such biases in cosmic shear power spectra is a topic that has been dealt with in several papers, for example Amara & Réfrégier (2008); Massey et al. (2013); Cropper et al. (2013); Kitching et al. (2019). However a full propagation of biases into measured (observed) power spectra in the presence of survey masks (where a portion of the sky is unobserved) has not been done. In this paper we build on the work of Kitching et al. (2019) and include the impact of survey masks.
In Section 2 we present the formalism for propagation of biases in the presence of masks, and identify power spectrum combinations that are dependent to varying degrees on real and imaginary bias terms; in Section 3 we test our formalism on simulations; we discuss conclusions in Section 4.

METHOD
In the following we expand upon the derivations given in Kitching et al. (2019). We can relate a measured shear in real (angular) space to the true shear, that would have been measured in the absence of systematic effects or a mask, via multiplicative and additive fields that describe the respective biases that may be introduced: $\tilde\gamma(\Omega) = W(\Omega)\,\hat\gamma(\Omega)$ with $\hat\gamma(\Omega) = [1 + m_0(\Omega)]\,\gamma(\Omega) + m_4(\Omega)\,\gamma^*(\Omega) + c(\Omega)$ (1), where all quantities are a function of angular coordinates $\Omega = (\theta, \phi)$, with $\theta$ and $\phi$ being latitude and longitude (or R.A. and dec).
$W(\Omega)$ is a spin-0 mask, with $W(\Omega) = 1$ where data exists and $W(\Omega) = 0$ where there is no data (we note that an optimal weight could in principle be computed). $\gamma(\Omega)$ is the true spin-2 shear, $\tilde\gamma(\Omega)$ is the measured spin-2 shear, and $\hat\gamma(\Omega)$ is the measured spin-2 shear in the absence of a mask. $m_0(\Omega) = m_0^R(\Omega) + {\rm i}\,m_0^I(\Omega)$ is a position-dependent multiplicative bias term that includes a possible systematic rotation (see Kitching et al. 2019, Appendix A), $m_4(\Omega) = m_4^R(\Omega) + {\rm i}\,m_4^I(\Omega)$ is a possible spin-4 multiplicative bias term, and $c(\Omega)$ is a spin-2 position-dependent additive bias. The superscript $*$ denotes a complex conjugate.
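The real-space relation between true and measured shear can be sketched as a simple forward model. The snippet below is a minimal illustration, assuming the measured field takes the form $W[(1+m_0)\gamma + m_4\gamma^* + c]$ on a set of pixels; the function name `apply_bias_and_mask` and the toy inputs are hypothetical and are not taken from the paper's pipeline.

```python
import numpy as np

def apply_bias_and_mask(gamma, m0, m4, c, mask):
    """Toy forward model for a measured spin-2 shear field (assumed form).

    gamma : complex array, true shear per pixel
    m0    : complex array, spin-0 multiplicative bias per pixel
    m4    : complex array, spin-4 multiplicative bias per pixel
    c     : complex array, additive bias per pixel
    mask  : real array, 1 where data exists and 0 otherwise
    """
    # Measured shear in the absence of a mask.
    gamma_hat = (1.0 + m0) * gamma + m4 * np.conj(gamma) + c
    # Measured shear including the mask.
    gamma_tilde = mask * gamma_hat
    return gamma_hat, gamma_tilde

# Toy inputs: 8 pixels, constant real bias of 2e-3, no spin-4 or additive bias.
rng = np.random.default_rng(0)
npix = 8
gamma = 0.01 * (rng.standard_normal(npix) + 1j * rng.standard_normal(npix))
m0 = np.full(npix, 2e-3 + 0j)
m4 = np.zeros(npix, dtype=complex)
c = np.zeros(npix, dtype=complex)
mask = np.array([1, 1, 0, 1, 0, 1, 1, 1], dtype=float)

gamma_hat, gamma_tilde = apply_bias_and_mask(gamma, m0, m4, c, mask)
```

With these toy inputs the unmasked measured shear is the true shear scaled by $1 + m_0$, and the masked pixels are set to zero.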
The spherical harmonic coefficients for the E-mode (curl-free) and B-mode (divergence-free) parts of the shear field can be determined via a spin-weighted spherical harmonic transform, where ${}_sY_{\ell m}(\Omega)$ are spin-weighted spherical harmonics (with spin $s = 2$ or $-2$), and $\ell$ and $m$ are angular wavenumbers; note that we use $m$ as one of the spherical harmonic wavenumbers and $m(\Omega)$ as multiplicative biases to follow convention, but these should not be confused.

Fig. 1.-The real part of the multiplicative bias field $m^R(\Omega)$, in the three simulated cases investigated. Cases 1, 2, 3 are shown left to right: simple galactic case, simple patch pattern and simple scanning pattern (see Section 3). Shown is a simulated celestial sphere, using a Mollweide projection with $\theta = \phi = 0$ at the North pole. The colour bar shows the amplitude of the biases, and the black regions show the regions where no data is present (i.e. the mask).
As shown in Kitching et al. (2019), and including the spin-4 terms, in the absence of any mask (i.e. $W(\Omega) = 1\ \forall\ \Omega$) the true and measured shear field's spherical harmonic coefficients can be related as in equation (3), where we expand the spin-2 quantities like $\gamma(\Omega) = \sum_{\ell m} \gamma_{\ell m}\,{}_2Y_{\ell m}(\Omega)$. Throughout we use superscript and subscript labels, but we note that the position of the labels relative to the main symbol is not significant i.e. they are just labels. We note that the multiplicative weight factors for the -mode parts only depend on the sum of the multiplicative bias terms. The weight functions are given by equation (4), for the $m_0$ or $m_4$ field.

In a similar way the impact of the mask can be related to the measurement in the absence of a mask using a standard pseudo-$C_\ell$ expression, equation (5) (Lewis et al. 2002; Zaldarriaga & Seljak 1997; Grain et al. 2012; Brown et al. 2005). The weight functions within the sums represent the mode-mixing caused by the mask and are given by equation (6); note that for the spin-0 mask the additional imaginary terms do not exist, but that all of these quantities are complex. These quantities in equations (4) and (6) are formed from combinations of integrals on the sphere over the mask or multiplicative bias field multiplied by spin-weighted spherical harmonic functions, where the function on the sphere in the integrand is labelled in the superscript.
By combining equations (3) and (5) we can find an expression that includes the effect of both biases and a mask, equation (8). By comparing equations (3) and (8) it can already be seen that the presence of a mask causes additional E- and B-mode terms to occur in the power spectra.
To simplify these expressions we note that the true B-mode field is expected to be much smaller than the true E-mode field. Schneider et al. (2002) show that source redshift clustering can cause a small B-mode component, approximately three orders of magnitude less than the E-mode component over scales with $\ell \leq 5000$. Therefore the terms that contain multiplicative bias terms combined with the B-mode should be small, i.e. we set terms $\mathcal{O}(m\gamma^B_{\ell m}) = 0$, but the unaffected B-mode component may be non-negligible. In this case these expressions simplify to equation (9). We combine the weight factors for the $m_0$ and $m_4$ terms and note that in this expression the real and imaginary parts of these fields propagate as sums, i.e. one can write a total multiplicative bias field like $m^R(\Omega) = m_0^R(\Omega) + m_4^R(\Omega)$ and $m^I(\Omega) = m_0^I(\Omega) + m_4^I(\Omega)$.
2.1. Power Spectra
The power spectra estimates for the measured shear can now be computed by taking the correlation of the spherical harmonic coefficients from equation (9), as in equation (10), where the two mode labels each run over (E, B).
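Taking the correlation of harmonic coefficients can be illustrated with the standard estimator $\frac{1}{2\ell+1}\sum_m a_{\ell m} b^*_{\ell m}$. The sketch below assumes this standard form rather than reproducing the paper's equation (10); the function name `cross_power` and the nested-list storage of coefficients are hypothetical choices made for clarity.

```python
import numpy as np

def cross_power(alm_x, alm_y):
    """Estimate a (cross-)power spectrum from two sets of harmonic coefficients.

    alm_x[ell] and alm_y[ell] are complex arrays holding the 2*ell + 1
    coefficients (m = -ell .. ell) for multipole ell.
    """
    return np.array([
        np.sum(a * np.conj(b)).real / (2 * ell + 1)
        for ell, (a, b) in enumerate(zip(alm_x, alm_y))
    ])

# Toy coefficients: every mode equals 1 + 1j, so |a_lm|^2 = 2 for all modes.
l_max = 3
alm_e = [np.full(2 * ell + 1, 1.0 + 1.0j) for ell in range(l_max + 1)]
cl_ee = cross_power(alm_e, alm_e)
```

Passing E-mode coefficients for both arguments gives an EE spectrum; passing E and B coefficients would give the EB cross-spectrum under the same assumed convention.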
We will assume that the true EB and BE power spectra are zero, $C^{EB}_\ell = C^{BE}_\ell = 0$, which should be the case in all but the most exotic dark energy models that cause parity-violating modes (Amendola et al. 2013). Given this assumption, the estimated power spectrum is given by equation (11), in which each of the four sign labels of the terms in the full expression takes the values $(+, -)$. Power spectra in equation (11) are labelled in their superscripts, including those for additive bias terms. On the left hand side of equation (11) we define the measured power spectrum and compare this to the power spectrum that would have been measured in the absence of systematic effects. We note however that terms in the window functions (such as the $m_0 + m_4$ terms) are derived via the ensemble-average of equation (10), and make use of the statistical rotational invariance of the ensemble-averaged harmonic modes. Therefore equation (11) is a hybrid of ensemble-averaged terms and terms that are not averaged, which may be non-zero only for a given realisation. This is tested numerically in Section 3. We do not present the EB and BB equivalents here since, as demonstrated later in the paper, a linear decoupled-field expression is sufficient to characterise the impact of biases.

Linear Decoupled expressions
Here we simplify the analysis by exploring two assumptions. The first is a linearity assumption that terms of order $m^2$, $c^2$ or $mc$ and higher are negligible. The second is a decoupled assumption that the spherical harmonic transform of the mask has no correlation with the spherical harmonic transform of the multiplicative bias field.
In comparison with Kitching et al. (2019) we can identify terms that are similar to the linear multiplicative bias terms in that paper, which were shown to only depend on the mean of the multiplicative bias field. In the case of masks these linear terms do not in general reduce to the mean of the multiplicative bias field since the mask may be coupled to the multiplicative bias field; this is something we numerically investigate in Section 3. However, if the multiplicative bias field is constant and/or not strongly coupled to the mask, then these terms depend only on $\langle m^R(\Omega)\rangle = \langle m_0^R(\Omega) + m_4^R(\Omega)\rangle$ and $\langle m^I(\Omega)\rangle = \langle m_0^I(\Omega) + m_4^I(\Omega)\rangle$, which are the mean of the real and imaginary parts of the sum of the multiplicative bias fields respectively.
Assuming no coupling and to linear order in biases we find equation (14), where we assume symmetry under exchange of labels in the cross-power spectra, and similarly for other terms (i.e. a non-tomographic case), and we note that $M^{+-}_{\ell\ell'} + M^{-+}_{\ell\ell'} = 0$.
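The size of the neglected second-order term can be gauged with a short calculation. The sketch below assumes the simplest case of a constant, real multiplicative bias $m$ and no mask, so that the power spectrum amplitude scales as $(1+m)^2$ and the linear approximation keeps $1 + 2m$; the function name `linear_approx_error` is hypothetical.

```python
def linear_approx_error(m):
    """Fractional error from dropping the m**2 term for a constant real bias.

    Assumes the measured shear is (1 + m) times the true shear, so the
    power spectrum scales as (1 + m)**2, while the linear approximation
    keeps only 1 + 2*m.
    """
    exact = (1.0 + m) ** 2
    linear = 1.0 + 2.0 * m
    return (exact - linear) / exact

# Bias amplitudes used elsewhere in the text.
err_small = linear_approx_error(2e-3)
err_large = linear_approx_error(0.05)
```

For $m = 2\times10^{-3}$ the neglected term is of order $4\times10^{-6}$ of the signal, and for $m = 0.05$ it is of order $2\times10^{-3}$.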

Unlensed random ellipticity contribution
To include the effect of the stochastic ellipticity field (i.e. the random uncorrelated and unlensed ellipticities of galaxies) in the expressions above we add a term to the true shear, $\gamma(\Omega) \rightarrow \gamma(\Omega) + e(\Omega)$. We note that the presence of a multiplicative bias in the measurement will affect the observed stochastic ellipticity component; here $e(\Omega)$ is the true underlying uncorrelated galaxy ellipticity. This contribution is the zero-lag intrinsic ellipticity field (Crittenden et al. 2001; Larsen & Challinor 2016; Blazek et al. 2015), which in the case of a finite number of galaxies is expressed as a shot noise term; see Blazek et al. (2019) for a discussion.
The shot noise component of the uncorrelated ellipticity term for a finite number of galaxies in a sample is set by $\sigma_e^2$, the observed variance of the ellipticities, and $N_{\rm gal}$, the effective number of galaxies in the observations (for a discussion of the effective number density see Blazek et al. 2019; Chang et al. 2013), with the two mode labels each running over (E, B). We note that any additional noise caused by the measurement process itself (e.g. sky noise, detector noise etc.) is already captured in a stochastic contribution to the $c(\Omega)$ term.
In this case we have a general expression, equation (17). We note that the noise term adding to the BB part is multiplied by $(1 + 2\langle m^R(\Omega)\rangle)$, but recall that we have assumed that terms that contain multiplicative bias terms combined with the true B-mode should be small, i.e. $\mathcal{O}(m\gamma^B_{\ell m}) = 0$.

2.4. Power spectrum combinations
Equation (17) is the most general case; however, to simplify further one can make several reasonable assumptions. The first is that there is no true BB field, $C^{BB}_\ell = 0$, which should be a good approximation; however we reiterate that Schneider et al. (2002) show that source redshift clustering can cause a small B-mode component. The second is that the correlation between the additive bias and the shear field is small, which, given that the majority of additive biases have a source in instrumental or optical effects, is a reasonable assumption.
We apply these approximations to the EE and BB cases, and we take some combinations of power spectra, chosen to highlight the inter-relationships between them clearly. We also note the special case in which there is no mask.

There are many combinations of cosmic shear power spectra that can be made, each of which will depend in a different way on the multiplicative bias and the stochastic variance of the ellipticity field, both of which are unknown quantities. The most commonly used approach is to use the EE only power spectrum and use the BB power as a consistency test of the level of systematic effects in the data. However, as we have shown, the BB power contains information on the multiplicative bias via the observed stochastic ellipticity component. Therefore one can construct a joint EE and BB likelihood where the likelihoods of the EE and BB cases would be summed to form a combined likelihood. A third approach is to subtract the BB from the EE power to form EE-BB, which will be dependent on $m$ but not $\sigma_e$. We summarise these in Table 1. These combinations are applicable even after deconvolving the mask (mask deconvolution is a separate point compared to the fact that BB power provides information on the multiplicative bias).
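The three statistics can be compared with a toy model. The sketch below assumes, for illustration only, a constant real bias $m$, no mask, no true B-mode and no additive bias, so that the observed EE power is $(1+2m)(A C_\ell + N)$ and the observed BB power is $(1+2m)N$. The noise normalisation $N = \sigma_e^2/N_{\rm gal}$ is also an assumption and may differ from the paper's shot noise expression by a constant factor. The function name `model_spectra` and the toy spectrum are hypothetical.

```python
import numpy as np

def model_spectra(cl_true, amp, m, sigma_e, n_gal):
    """Toy linear, decoupled model for the EE, BB and EE-BB statistics."""
    noise = sigma_e ** 2 / n_gal      # assumed shot noise normalisation
    bias = 1.0 + 2.0 * m              # linear multiplicative bias factor
    ee = bias * (amp * cl_true + noise)
    bb = bias * noise * np.ones_like(cl_true)
    return {"EE": ee, "BB": bb, "EE-BB": ee - bb}

ell = np.arange(2, 11)
cl_true = 1e-9 / ell                  # toy power spectrum, not a real cosmology
n_gal = 148510660 * 30                # full sky at 30 galaxies per square arcminute

spectra_a = model_spectra(cl_true, amp=1.0, m=0.002, sigma_e=0.3, n_gal=n_gal)
spectra_b = model_spectra(cl_true, amp=1.0, m=0.002, sigma_e=0.4, n_gal=n_gal)
```

In this toy model changing $\sigma_e$ changes EE and BB but leaves EE-BB unchanged, which is the sense in which EE-BB depends on $m$ but not on $\sigma_e$.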
Since these statistics depend on unknown parameters, these parameters need to be marginalised over, and the degeneracy with cosmological parameters will vary between the statistics. We note that marginalisation will always need to be performed in a final likelihood analysis since at best calibration simulations will provide calibration of $m$ with some uncertainty. To mitigate degeneracies, and as may be available from previous simulation/calibration data, one should apply a prior to these parameters. In Appendix A we show that the estimation of a cosmic shear amplitude will be biased by imposing a prior on $m$. We investigate these degeneracies and the impact of priors numerically in Section 3.2.

Fig. 2.-The fractional difference between the residual power spectrum calculated analytically and that found using a forward model, for the three cases considered. The fractional error is with respect to the input cosmic shear power spectrum. For the analytic case we compute the full expression (using equation 11), and using the linear decoupled approximation (equation 14). We plot the cosmic variance error on the cosmic shear power spectrum for comparison. In the upper panels we include both calculations for $\ell_{\rm max} = 64$ (limited due to the complexity of the calculations in the full expression); in the lower panels we include only the linear decoupled approximation for $\ell_{\rm max} = 2048$.

Table 1.-Summary of the power spectrum combinations; columns: Statistic, Observables.

Throughout we do not attempt to estimate the true power spectrum via inversion of the mixing matrices. This is because when a large fraction of the sky is masked some modes are not observable (i.e. they are in the mask), leading to singular mixing matrices. The standard approach to mitigating this effect is to use band-powers, but such an approach leads to a loss of information (Hivon et al. 2002).

SIMPLE SIMULATIONS
In this Section we use simple simulations to test whether a linear multiplicative bias assumption is applicable, i.e. that higher order terms $\mathcal{O}(m^2)$ can be ignored, and whether the linear decoupled assumptions are reasonable (equation 14); and also to investigate marginalisation over an unknown residual multiplicative bias and ellipticity variance.
We use the same extreme multiplicative shear fields used to test the full-sky formalism in Kitching et al. (2019). Note that we express these in terms of an arbitrary amplitude since these are all normalised to have $\langle m(\Omega)\rangle = 2 \times 10^{-3}$. The cases are a simple galactic case, a simple patch pattern and a simple scanning pattern (see Figure 1). We use a mask that removes data at less than $20^\circ$ from both the galactic and ecliptic planes, and also 20% of pixels at random, to represent an all-sky-like mask with random patches removed; this gives a total observed sky fraction of $f_{\rm sky} = 0.4$. We show the masked bias fields in Figure 1.

We compute the original $\gamma(\Omega)$ field as a Gaussian random field using a Planck ΛCDM cosmology (Planck Collaboration et al. 2018). The theoretical EE power spectrum, subject to the Limber (Limber 1953; Kitching et al. 2017; Lemos et al. 2017), flat-sky (Kamionkowski et al. 1998), flat-universe (Taylor et al. 2018), prefactor-unity (Kitching et al. 2017) and reduced shear (Deshpande & Kitching 2020) approximations, is given by equation (19), where $\chi$ is the comoving distance, $\chi_{\rm H}$ is the comoving distance to the horizon, $P$ is the matter power spectrum, and the lensing kernel depends on $\Omega_{\rm M}$, the present-day dimensionless total matter density of the Universe, $H_0$, the Hubble constant, $c$, the speed of light in a vacuum, $a$, the scale factor of the Universe, and $n(\chi)$, the galaxy distribution function of the survey.

In this work, we use the photometric DES Year 1 galaxy distribution (Abbott et al. 2018). The matter power spectrum is calculated using the publicly available CAMB cosmology package (Lewis et al. 2000), for the Planck ΛCDM cosmology (Planck Collaboration et al. 2018). We include the non-linear corrections to the matter power spectrum from Mead et al. (2015). In these calculations, the comoving distance at a given redshift is determined using the astropy package (Astropy Collaboration et al. 2018).

Linear decoupled approximation test
Here we use the simulations to test the linear decoupled approximation of equation (14) compared to the full expression in equation (12). In these simulated tests we use a maximum multipole of $\ell_{\rm max} = 64$ when calculating the full expression; this is limited by the complexity of computing the terms in equation (12), which scale like $\ell_{\rm max}^6$, and since we are testing the linear decoupled approximation we cannot use the numerical advantages described in Brown et al. (2005). We also compare the difference between the measured change in EE power spectrum computed using a forward model and the analytic predictions using the linear decoupled approximation alone, in which case we can use a higher maximum multipole of $\ell_{\rm max} = 2048$.
Fig. 4.-The bias on the inferred amplitude parameter $A$, where the true EE power is $A C_\ell$, as a function of the width of a prior distribution on the multiplicative bias (left for $m = 0$, centre for $m = 0.002$, and right for $m = 0.05$). We use the Gaussian random field simulations described in Section 3. We show results for the joint EE and BB analysis (blue), EE Only (red), and EE-BB (green). We show 1$\sigma$ error bars on $A$ for the joint EE and BB analysis only; these are similar for the EE-BB and EE Only points. The EE-BB is in some cases biased low by more than the plot axes, which we truncate to highlight the small remaining biases for the EE and BB, and EE Only analyses.

Notes: We use the massmappy code (Wallis et al. 2017) and SSHT (McEwen et al. 2013), and sample the sphere using the sampling scheme of McEwen & Wiaux (2011). Data available at http://desdr-server.ncsa.illinois.edu/despublic/y1a1_files/redshift_bins/

In Figure 2 we show the difference between the measured EE power spectrum (computed using a forward model) and the analytic predictions using the full calculation and the linear decoupled approximation. When forward modelling we create measured shear data using equation (1) and then compute measured power spectra via equations (3) and (10). We find that in all cases the difference between the analytic expressions and the forward model is at least three to four orders of magnitude smaller than the cosmic variance error (Weinberg 2008), which depends on $\Delta\ell$, any bandwidth in $\ell$-modes used, and on $f_{\rm sky}$, the fraction of the sky observed. We also find that the difference between the full expression (equation 11) and the linear decoupled approximation (equation 14) is negligible over the tested range compared to cosmic variance terms, and for most modes the predictions are indistinguishable.
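For orientation, the cosmic variance error used as the comparison scale above can be sketched with the standard Gaussian expression $\sigma(C_\ell) = \sqrt{2/[(2\ell+1) f_{\rm sky} \Delta\ell]}\,C_\ell$. This form is assumed here as the usual textbook result and is not copied from the paper's own equation; the function name `cosmic_variance_sigma` is hypothetical.

```python
import numpy as np

def cosmic_variance_sigma(ell, cl, f_sky=1.0, delta_ell=1.0):
    """Gaussian cosmic variance error on a power spectrum (assumed standard form).

    ell       : multipole(s)
    cl        : power spectrum value(s) at those multipoles
    f_sky     : fraction of the sky observed
    delta_ell : bandwidth in ell-modes
    """
    ell = np.asarray(ell, dtype=float)
    n_modes = (2.0 * ell + 1.0) * f_sky * delta_ell
    return np.sqrt(2.0 / n_modes) * np.asarray(cl, dtype=float)

# Example: the ell = 2 mode with unit power, for full sky and for f_sky = 0.4.
sigma_full = cosmic_variance_sigma(2, 1.0)
sigma_masked = cosmic_variance_sigma(2, 1.0, f_sky=0.4)
```

Reducing the observed sky fraction or narrowing the band increases the error, since fewer independent modes are averaged.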
3.2. Multiplicative bias tests
As described in Section 2.4 one can either use the EE only power, or combine the likelihood of the EE and BB power to gain additional information on the multiplicative bias.
Here we test and compare these approaches on Gaussian random field simulations. We use an all-sky survey with a maximum $\ell$-mode of $\ell_{\rm max} = 2048$ and use 20 logarithmically spaced bins between $[2, \ell_{\rm max}]$. For the shear field we use the Planck cosmology used in the previous section, and scale the input power spectrum with an amplitude, $A C_\ell$, with a fiducial value of $A = 1$. For the noise field we assume $\sigma_e = 0.3$ and $N_{\rm gal} = 148510660\,n_0$ with $n_0 = 30$ galaxies per square arcminute as a fiducial case. After creating a Gaussian random field we then include a constant multiplicative bias, which needs to be marginalised or removed from the inference. The free parameters are $(A, m, \sigma_e)$, and in all cases we assume a Gaussian likelihood. We will show results of estimating the parameters from the Gaussian random field simulations; we use emcee (Foreman-Mackey et al. 2013; Foreman-Mackey 2016) for the parameter estimation and use 20,000 samples in each test (removing the first 100 points and using 32 walkers). We assume uniform prior ranges of $-1 \leq m \leq 1$, $0 \leq A \leq 2$ and $0 \leq \sigma_e \leq 1$ except where otherwise stated.
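The survey numbers quoted above can be checked with a few lines. The sketch below assumes that 20 logarithmically spaced bins means 21 geometrically spaced bin edges, and that the prefactor 148510660 is the area of the full sky in square arcminutes; the variable names are hypothetical.

```python
import numpy as np

ell_max = 2048
n_bins = 20

# 20 logarithmically spaced bins need 21 edges between 2 and ell_max (assumed).
bin_edges = np.geomspace(2, ell_max, n_bins + 1)

# Area of the full sky in square arcminutes: 4*pi steradians converted to
# square degrees, then multiplied by 3600 arcmin^2 per deg^2.
full_sky_arcmin2 = 4.0 * np.pi * (180.0 / np.pi) ** 2 * 3600.0

n0 = 30.0                        # galaxies per square arcminute (fiducial)
n_gal = full_sky_arcmin2 * n0    # total number of galaxies on the full sky
```

The full-sky area evaluates to about 148510660 square arcminutes, so the quoted $N_{\rm gal}$ is consistent with an all-sky survey at the fiducial number density.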
In Figure 3 we show the constraints when only the flat priors are used on $m$ and $\sigma_e$. In this case we find that all three parameters are completely degenerate and no meaningful constraint on the amplitude is possible. Therefore a constraint on the cosmic shear amplitude is only possible with either a prior on $m$, a prior on $\sigma_e$, or both. If the prior on $m$ or $\sigma_e$ is too large, or centred on the incorrect value, then the constraints on $A$ will be biased and the error bar larger.
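The degeneracy described above can be made concrete with a toy model. The sketch below assumes a constant real bias, no mask, an observed EE power of $(1+2m)(A C_\ell + N)$ and an observed BB power of $(1+2m)N$ with $N = \sigma_e^2/N_{\rm gal}$; this normalisation and the function name `ee_bb_model` are assumptions for illustration, not the paper's expressions. Two different parameter triples are constructed that give identical EE and BB spectra.

```python
import numpy as np

def ee_bb_model(cl_true, amp, m, sigma_e, n_gal):
    """Toy EE and BB model spectra under the assumptions in the lead-in."""
    noise = sigma_e ** 2 / n_gal
    bias = 1.0 + 2.0 * m
    ee = bias * (amp * cl_true + noise)
    bb = bias * noise * np.ones_like(cl_true)
    return ee, bb

ell = np.arange(2, 2049)
cl_true = 1e-9 / ell              # toy spectrum, not a real cosmology
n_gal = 148510660 * 30

# Fiducial parameters: A = 1, m = 0, sigma_e = 0.3.
ee_1, bb_1 = ee_bb_model(cl_true, amp=1.0, m=0.0, sigma_e=0.3, n_gal=n_gal)

# A different triple chosen so that (1 + 2m) * A and (1 + 2m) * sigma_e**2
# are both unchanged.
m_2 = 0.05
bias_2 = 1.0 + 2.0 * m_2
ee_2, bb_2 = ee_bb_model(cl_true, amp=1.0 / bias_2, m=m_2,
                         sigma_e=0.3 / np.sqrt(bias_2), n_gal=n_gal)
```

Because the two spectra constrain only the two products, three free parameters cannot be separated without a prior on $m$ or $\sigma_e$.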
In Appendix A we show that in general the marginalisation over $m$ will result in biases on the inferred amplitude $A$, caused by a classical marginalisation paradox, but that the total amplitude of the power spectrum $(1 + 2\delta m)A$ should be unbiased, where $\delta m$ is a residual bias that is consistent with zero. In Figure 4 we demonstrate this by applying priors to $m$ for the three statistics we investigate (we only apply the uniform prior on $A$ and $\sigma_e$ in this case). We test this for the cases that the true value of $m = 0$, $m = 0.002$ and $m = 0.05$, and vary $\sigma(m)$ with the prior centred on the true value. This leads to asymptotic estimates of $A$ for small $\sigma(m)$; however these are biased low due to the marginal distribution of $A$ being biased, and this bias is larger for smaller $m$ values, as shown in Appendix A. We find that the EE-BB is affected more than the EE only and joint EE and BB likelihood, because in this case there is no additional information from the stochastic ellipticity term.
To avoid the biases in the marginal distribution of $A$ we can instead characterise the total amplitude using $(1 + 2\delta m)A$, where $\delta m$ is a residual bias. This estimator should be unbiased for $A$ if $\delta m$ is zero. To estimate the residual bias from two-point statistics one can do this in two ways:
• Measure this from simulations, such that $\delta m \rightarrow 0$ with some uncertainty $\sigma(m)$,
• Infer $m$ from the BB power spectrum with a sufficiently good prior on $\sigma_e$. In this case it is better to have no prior on $m$ (which may bias any inference of $(1 + 2\delta m)A$), and we only need to characterise a prior on $\sigma_e$,
or going beyond two-point statistics to include higher-order or point-estimate terms may also help to lift the degeneracies. We therefore construct an estimator for $A$ that is $\tilde A = (1 + 2[m - \langle m\rangle])A$, where $\langle m\rangle$ is the mean measured either from simulations or inferred from the BB power. This should be unbiased by the effect of marginalisation if there is a good estimate of $\langle m\rangle$.

In Figure 5 we show the bias on the amplitude as a function of the width and centre of the prior on $\sigma_e$ and find that indeed the bias is consistent with zero when $\sigma(\sigma_e) \leq 0.05$ and $\Delta\sigma_e \leq 0.01$. We also find that this is not dependent on the overall amplitude of the bias. In Figure 6 we show the bias on the amplitude as a function of the width and centre of the prior on $m$ and also find that the bias is consistent with zero when $\sigma(m) \leq 0.07$ and $\Delta m \leq 0.01$. We also find that this is not dependent on the overall amplitude of the bias.

Fig. 5.-The bias on the inferred amplitude as a function of the width of a prior distribution on the stochastic ellipticity variance, $\sigma(\sigma_e)$ (left for $m = 0.002$, and centre for $m = 0.05$), and for $\Delta\sigma_e = 0$ (right for $m = 0.05$). We use the Gaussian random field simulations described in Section 3. We show results for the joint EE and BB analysis (blue) and EE Only (red). The fainter error bars are the 1$\sigma$ error on $(1 + 2\delta m)A$ for the joint EE and BB analysis.

Fig. 6.-The bias on an inferred amplitude compared to the estimator $(1 + 2[m - \langle m\rangle])A$ as a function of the width of a prior distribution on the stochastic ellipticity variance, $\sigma(\sigma_e)$ (left for $m = 0.002$, and centre for $m = 0.05$), and for $\Delta\sigma_e = 0$ (right for $m = 0.05$). We use the Gaussian random field simulations described in Section 3. We show results for the joint EE and BB analysis (blue) and EE Only (red). The fainter error bars are the 1$\sigma$ error on $(1 + 2\delta m)A$ for the joint EE and BB analysis.

CONCLUSIONS
In this paper we have extended previous work to write down an expression for the propagation of multiplicative and additive weak lensing biases into cosmic shear power spectra. This expression includes terms that couple the multiplicative bias field and the survey mask, which in principle cause scale-dependent behaviour that is linear in multiplicative bias. By testing on simulations, which include some extreme cases of multiplicative bias fields, we find that the two assumptions of using only linear terms in multiplicative bias, and assuming no coupling between the bias field and the mask, are sufficient to capture any impact of multiplicative biases on cosmic shear power spectra for low-ℓ modes.
In deriving this result we find several combinations of power spectra that are dependent on biases to varying degrees, and we identify that the BB power is sensitive to the multiplicative bias via the stochastic ellipticity field. We find that, without prior information on either the multiplicative bias or the variance of the stochastic ellipticity, the measurement of the amplitude of the cosmic shear power spectrum is completely degenerate. When applying priors to the multiplicative bias we find that this biases any inference of the amplitude parameters. However we find that the combination $(1 + 2\delta m)A$ is unbiased for a joint EE and BB likelihood if the stochastic ellipticity variance is known to better than $\sigma(\sigma_e) \leq 0.05$ and $\Delta\sigma_e \leq 0.01$, or the multiplicative bias is known to better than $\sigma(m) \leq 0.07$ and $\Delta m \leq 0.01$. This will be generalised to a tomographic analysis and the assessment of the bias on cosmological parameters in future work.

where¯is the mean of the prior and the uncertainty. The marginalised distribution of and are then given by In the Gaussian case the marginalised distribution is given by where¯is the mean of with some error , and we have imposed a prior on with mean¯and error . In this case the marginalised distribution of the unconstrained parameter is where ( , , ) = 2 2 + 2 . We refer to equations (A5) and (A6) as the z-Gaussian and inverse z-Gaussian distributions respectively (although they are not technically related via an inverse relation). Both of these are described by four free parameters (¯, ,¯, ). In Figure 7 we show some examples of the z-Gaussian and inverse z-Gaussian distributions for (¯= 1, = 0.1, = 0.1). We also show in Figure 7 the difference between the median and the mode/maximum of the distributions, for various values of (keeping¯= 1 and = 0.1), where in general the mean and medians are skewed to larger values than the mode, which is much more pronounced for the inverse z-Gaussian distribution.
This shows that is close to a Gaussian distribution but that is non-Gaussian. We also show the mode of the distributions compared to the centre of the Gaussian prior¯and find that both the z-Gaussian and inverse z-Gaussian modes are always biased low, i.e. the z-Gaussian distribution is close to Gaussian but with a mode shifted away from the input Gaussian case. Therefore we conclude that if one performs parameter estimation on directly with no prior on any parameter this should be unbiased. However when jointly fitting the degenerate free parameters and to data and will be biased towards lower values, and the mode of can be biased.

A.1. Application to Cosmic Shear
To explore this formalism we consider the cosmic shear power spectrum (equation 19), where the overall amplitude of the cosmic shear power spectrum is $\propto (1 + 2\delta m)A$, and where we include the effect of a residual multiplicative bias (see equation 14). So in this case we have that $A$ is the fiducial (unbiased power spectrum) amplitude and the factor $(1 + 2\delta m)$ will be marginalised over. For an unbiased case $\delta m = 0$ and this factor has a mean of 1, and for a biased case $\delta m > 0$ its mean is greater than 1. Therefore we expect from the discussion in Section A that the amplitude of the power spectrum should be biased low when such marginalisation is performed, and the bias should decrease as the true value of the nuisance parameter increases.
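The asymmetry that drives this effect can be illustrated with a small Monte Carlo. The sketch below is a toy illustration only: it assumes that the total amplitude is constrained as a Gaussian with mean 1 and width 0.1, that the nuisance factor has an independent Gaussian prior with mean 1 and width 0.1, and that the inferred amplitude is their ratio. These choices and the variable names are assumptions and are not the paper's appendix derivation.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 400_000

# Total amplitude, taken here to be constrained by the data as a Gaussian.
total = rng.normal(1.0, 0.1, n_samples)

# Nuisance factor (1 + 2 * delta_m) drawn from a Gaussian prior centred on 1.
nuisance = rng.normal(1.0, 0.1, n_samples)

# Implied amplitude once the nuisance factor is divided out.
amplitude = total / nuisance

mean_amp = amplitude.mean()
median_amp = np.median(amplitude)
```

Although both inputs are symmetric about 1, the ratio is skewed: its mean lies above its median, so a point estimate of the amplitude depends on which summary of the marginal distribution is quoted.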
In Figure 8 we show how the mode of the amplitude will change as a function of the width of the prior on for various cases of ( ) and the true value of . We find that for reasonable values of and , and for a best case that = 0, the mode of the marginalised amplitude value can be biased low by up to ∼ 1-2%. In all cases we assume a best case in which the prior is centred on the true value of , which in reality may not be the case. However the use of simulations (Hoekstra et al. 2017a) and/or additional information from the noise (B-mode) power spectrum will enable calibration of the mean of , which we explore in Section 3.