Low-scatter galaxy cluster mass proxies for the eROSITA all-sky survey

The on-going X-ray all-sky survey with the eROSITA instrument will yield large galaxy cluster samples, which will bring strong constraints on cosmological parameters. In particular, the survey holds great promise to investigate the tension between CMB and low-redshift measurements. The current bottleneck preventing the full exploitation of the survey data is the systematics associated with the relation between survey observable and halo mass. Numerous recent studies have shown that gas mass and core-excised X-ray luminosity exhibit very low scatter at fixed mass. We propose a new method to reconstruct these quantities from low photon count data and validate the method using extensive eROSITA-like simulations. We find that even near the detection threshold of ~50 counts the core-excised luminosity and the gas mass can be recovered with 20-30% precision, which is substantially less than the scatter of the full integrated X-ray luminosity at fixed mass. When combined with an accurate calibration of the absolute mass scale (e.g. through weak gravitational lensing), our technique reduces the systematics on cosmological parameters induced by the mass calibration.


INTRODUCTION
Galaxy clusters are the endpoint of the structure formation process and they represent the high-mass tail of the halo mass function. As such, their abundance and mass distribution is a sensitive probe of cosmological parameters (see Allen et al. 2011, for a review). Measurements of the galaxy cluster mass function have been obtained using X-ray surveys (Reiprich & Böhringer 2002;Böhringer et al. 2014;Mantz et al. 2015;Schellenberger & Reiprich 2017;Pacaud et al. 2018), the Sunyaev-Zeldovich effect (SZ, Planck Collaboration XX 2013;Planck Collaboration XXIV 2016;Hasselfield et al. 2013;Benson et al. 2013;de Haan et al. 2016;Bocquet et al. 2019), and optical surveys (Rozo et al. 2010;McClintock et al. 2019;Costanzi et al. 2019;Finoguenov et al. 2019), yielding competitive results on cosmological parameters with samples of a few hundred clusters. Recently, some tension has emerged between cosmological parameters measured from the power spectrum of the cosmic microwave background and from various low-redshift cosmological probes, including, but not limited to, galaxy cluster counts (Heymans et al. 2013;Hildebrandt et al. 2017;Riess et al. 2016;Riess 2019;Bonvin et al. 2017;Addison et al. 2016). If confirmed at a higher level of significance, this tension is indicative either of new physics or of unknown systematics in the various methods.
In this respect, the upcoming eROSITA mission has an important role to play. eROSITA (Predehl et al. 2010) is a wide-field X-ray instrument on board the Spektrum Roentgen Gamma (SRG) spacecraft, which was successfully launched on July 13, 2019. eROSITA features a large collecting area at 1 keV of ∼ 2, 500 cm 2 (on axis) and an angular resolution of 28 arcsec high-energy width (HEW) in survey mode over a field of view (FOV) of 50 arcmin diameter. Combined, these properties yield a survey speed (grasp) of 1050 cm 2 deg 2 , which represents an order of magnitude increase compared to previous X-ray instruments. eROSITA will perform eight all-sky surveys over a period of 4 years (Merloni et al. 2012;Clerc et al. 2018). The sensitivity of the final eROSITA survey will be about 20 times deeper in the [0.5-2] keV band than its predecessor the ROSAT all-sky survey. Upon completion, eROSITA is expected to detect 50,000-100,000 individual clusters (Pillepich et al. 2012(Pillepich et al. , 2018Grandis et al. 2019), which represents an increase by two orders of magnitude compared to current X-ray and SZ cluster samples. The eROSITA survey thus holds great promise for the future of cosmology.
To maximize the cosmological constraints from the survey, several key points need to be properly understood and modeled, which requires an accurate understanding of the physics governing the intracluster medium (ICM). In particular, efficient mass proxies must be defined from survey observables. An optimal mass proxy should combine two fundamental properties: cheap, and efficient. The proxy should be cheap in the sense that it should be measurable with good precision from the survey data. It should be efficient in the sense that the scaling relation between mass and observable should have a low scatter. At first approximation, the state of the gas in galaxy clusters is determined by the properties of the gravitational potential and the merging history of the host halo, which implies the existence of tight scaling laws between ICM properties and cluster mass (the self-similar model , Kaiser 1986;Bryan & Norman 1998). Thus far, most X-ray surveys made use of the total, integrated X-ray luminosity L X,tot as a tracer of the total mass (e.g. Ebeling et al. 2007Ebeling et al. , 2010Böhringer et al. 2013Böhringer et al. , 2017Giles et al. 2016). While the total X-ray luminosity is easy to recover from the survey data, its use as a cosmological tool is hampered by a large scatter at fixed mass (∼ 50%, e.g. Arnaud & Evrard 1999;Maughan 2007;Pratt et al. 2009;Mantz et al. 2010;Giles et al. 2016).
The large scatter of the L X,tot − M relation is caused by the diversity of observed gas properties in cluster cores (Croston et al. 2008;Pratt et al. 2009;Leccardi et al. 2010;Eckert et al. 2012;Ghirardini et al. 2019). In the central regions (R 0.2R 500 ), the gas densities are such that baryonic physics strongly affects the gas properties. Deviations from self similarity occur in the presence of non-gravitational physics such as gas cooling and feedback from supernovae and active galactic nuclei (AGN) (Tozzi & Norman 2001;Borgani et al. 2004;Kravtsov et al. 2005;Nagai et al. 2007). As a result, the density profiles of relaxed, cool-core (CC) clusters show a prominent cusp in their central regions, which causes them to be overluminous at a fixed mass with respect to systems exhibiting flat cores (non-cool-core (NCC) clusters). In addition to the large scatter at fixed mass, recent studies have shown unequivocally that X-ray flux limited samples are biased in favor of CC systems because of their higher luminosity and peaked surface brightness profiles (Eckert et al. 2011;Rossetti et al. 2016Rossetti et al. , 2017Andrade-Santos et al. 2017). The CC bias greatly complicates the modeling of the selection function, as the exact fraction of CC clusters is still poorly known (Eckert et al. 2011).
Conversely, beyond the central regions (R 0.2R 500 ) the gas density profiles exhibit a high level of self similarity and the scatter of the scaled profiles reduces to ∼ 15% (Croston et al. 2008;Eckert et al. 2012;Ghirardini et al. 2019). In this regime, baryonic effects are subdominant with respect to gravitational collapse. The scatter of the scaling relations drastically decreases (Mantz et al. , 2018Maughan et al. 2012) and the redshift evolution follows the self-similar expectation (McDonald et al. 2017). On top of that, numerical simulations implementing different subgrid models for baryonic physics make very similar predictions beyond the central regions (Sembolini et al. 2016b,a). Thus, cosmological simulations can be used to create reliable mock catalogues to calibrate the selection function. For all these reasons, Xray cluster samples based on core-excised quantities are well suited for cosmological studies.
In this paper, we present a novel method for PSF deconvolution and deprojection. The method is based on the decomposition of the observed brightness profiles into a sparse linear combination of basis functions. The basis functions are convolved with the point spread function (PSF) of the instrument and the model is fitted to the observed count profiles using a Hamiltonian Monte Carlo sampler. Our method represents a substantial step forward over traditional methods in the sense that it allows a large freedom in the reconstructed shape, yet at the same time it imposes physically motivated priors on the parameters, makes full use of the statistical properties of the data, and does not depend on the adopted radial binning. We validate the method using extensive eROSITAlike simulations and demonstrate that it is able to reconstruct accurately the true core-excised X-ray luminosity down to the detection limit of 50 source counts, independently of the core state. Assuming that our method can be combined with an accurate external calibration of the absolute mass scale, we then discuss the implications of our results for the cosmological constraints to be obtained with eROSITA. We distribute the code as a freely available, easy-to-use Python package 1 to allow for an easy replication of our results. For the purpose of this paper we implement a 1D version of the deprojection and PSF-deconvolution algorithm developed by Diaz-Rodriguez et al. (2017). Namely, we describe the observed profile as a linear combination of basis functions with an 1 sparsity term to avoid amplifying statistical fluctuations and enforce smoothness of the reconstructed profile. Here we describe the various steps of the reconstruction. We assume that the position of the cluster center is known a priori and that a surface brightness profile SB(r) has been extracted with the finest possible radial binning, albeit ensuring a minimum of 1 count per annulus.
2.1. The model For a given 3D profile (r), we assume that the true profile can be described as a sparse representation of basis functions {Φ p } P p=1 : with P the total number of basis functions used, R the 3D distance to the cluster center, and α p the sparse model coefficients, i.e. α p = 0 for most p. Following Eckert et al. (2016), for {Φ p } we use a collection of King functions, (2) King functions have the good property that they are monotonously decreasing and the correspondence between projected (2D) and deprojected (3D) profiles can be written analytically. However, the method proposed here can be readily generalized to any suitable basis. We set up a grid of values of R c,p and β p adaptively to provide a range of shapes that is as general as possible. For a profile with N data points with radii between R 0 and R max we set N/4 values of R c logarithmically spaced between R 0 and R max /2, and we draw 10 values of β p linearly in the range 0.6 − 3. Every combination of r c and β is considered, thus our final dictionary contains 5N/2 functions.
In observed (projected) space, the basis of functions can be readily translated into the functions φ p (r) as with r the projected radius. The coefficients γ p of φ p (r) are analytically related to α p as (see Appendix B of Eckert et al. 2016) with Γ the Legendre gamma function. We thus decompose the observed projected profile onto the basis {φ p } P p=1 and fit for {γ p } P p=1 . 2.2. Likelihood function In a photon counting X-ray instrument such as eROSITA, the observed number of counts N c in a given annulus is a Poisson realization of the expectation value µ. The expectation value in annulus i can be written as with A region , t eff the area of the annulus and the effective (vignetting corrected) exposure time in the annulus, and B i the background expectation. Here ⊗ denotes the convolution. The best fitting brightness profile will be described by the set of coefficients {γ p } P p=1 that maximizes the likelihood function (Cash 1979) To enforce a sparse representation on the basis of functions, the likelihood is modified by the modulus of the sum of the coefficients (the lasso method, Tibshirani 1996), such that the estimator {γ p } P p=1 can be written aŝ The addition of the lasso penalty term penalizes against complicated linear combinations of basis functions, thus ensuring a smooth and simple description of the brightness profile. The value of the thresholding parameter λ is set once and for all using Monte Carlo simulations of a smooth curve and selecting the smallest value of λ that reproduces the true curve within a given fractional precision, which we choose to be 1% (see Diaz-Rodriguez et al. 2017, for details). Therefore, our model is tuned to reproduce generic smooth curves with a typical precision of 1%, which given the properties of the eROSITA all-sky survey should always be smaller than the statistical uncertainties.
In the 2D method presented in Diaz-Rodriguez et al. (2017) point sources are detected and modeled on-the-fly by adding an additional component to the model in the form of a list of sources following a sparse representation across the image. In the faster 1D method proposed here this approach cannot be implemented directly, thus we apply a standard masking of the point sources detected through an external tool.

PSF convolution
To convolve the 1D model with the PSF, for a pre-defined grid of radii r i (usually defined as the emission-weighted mean of each annulus) we construct a PSF convolution matrix following Croston et al. (2006); Eckert et al. (2016). Assuming that the surface brightness distribution within each bin is roughly constant, which is a good approximation if the radial binning is fine enough, we define PSF i, j as the fraction of photons originating from bin i and measured in bin j. To construct the matrix, we use the model for the eROSITA survey PSF presented in Clerc et al. (2018, see their Fig. 6). The total PSF mixing matrix is defined as the mean of all the pixel integrals, i.e.
where PSF(r) denotes the normalized functional form for the PSF, the sums are performed over all the pixels in annuli i and j, |k − l| is the distance between pixels k and l, and N pix,i is the number of pixels in bin i.
In practice, performing such a double integral can be time consuming when the considered image is large. To speed up the computation, we perform the calculation in Fourier space using fast Fourier transforms (FFT). For each radial bin, we create an image with a flat brightness distribution within the corresponding annulus and filled with zeros elsewhere. We convolve the resulting image with the PSF kernel by multiplying the Fourier transformed images and then transforming back to real space. The corresponding row of the PSF matrix is then filled by counting the (normalized) fraction of the flux in the convolved image that falls within each annulus.
To verify the accuracy of the computation, we generated an image of a central point source and convolved it with the PSF. We then calculated the radial profile of the convolved point source. On the other hand, we generated the true count profile for the source (corresponding to a delta function) and multiplied it with the PSF matrix in 1D. In Fig. 1 we compare the 1D convolution with the 2D one. The 1D convolution method, albeit much simpler, reproduces the results expected from the 2D convolution within less than 1% out to 8 arcmin from the core of the PSF, where the flux of the wings is over 3 orders of magnitude smaller than in the core. Thus, we conclude that for our purposes performing the PSF convolution in 1D is largely sufficient.
Another advantage of our approach is that once the PSF matrix is computed, the calculation of the expectation value µ i (see Eq. 5) can be written as a linear function of the coefficients {γ p }. Once the radial binning is set, the basis functions can be evaluated once and for all on the pre-defined grid, and we can construct a N × P convolution matrix K including all the terms in Eq. 5 such that µ becomes the matrix product of the vector of coefficients γ with the matrix K. Thus, the problem reduces to solving a generalized linear model.

Optimization
To solve the lasso optimization problem (Eq. 7), we implement the model (Eq. 5) as a generalized linear model within the probabilistic programming framework in the Python package PyMC3 (Salvatier et al. 2016). We use the No-U-Turn Sampler (NUTS, Hoffman & Gelman 2014) as implemented in PyMC3 to sample the posterior distributions of the parameters. NUTS is a Hamiltonian Monte Carlo sampler that makes use of gradient information for fast convergence. This is required to sample efficiently high-dimensional problems (P > N) such as the case presented here. In addition, NUTS includes several automatic ways of tuning the Hamiltonian Monte Carlo sampler to improve the sampling efficiency without requiring any user intervention, which makes it suitable for use on large datasets.
We first run a traditional maximum likelihood optimization and use the resulting parameter set as a starting point for NUTS. We set broad log-normal priors on the parameter values around the maximum likelihood results to enforce positivity and set parameter boundaries, whilst still leaving a high level of freedom to the sampler. We then draw 1,000 posterior samples using NUTS.
In case the expectation value of the background B i is known, the corresponding parameter in Eq. 5 can be fixed. Generally speaking, the background value is only known within some range. The uncertainty in the background can be propagated to the posterior distributions by adding it as one or more additional model parameters, on which priors can be set according to the level of accuracy on the background expectation. For the purpose of this paper, we assume that the background is constant across the field of interest and that its true value is known with an accuracy of 5%, which is similar to what can be achieved with XMM-Newton (e.g. Ghirardini et al. 2018). Thus, we include one additional model parameter to describe the background, and we set a Gaussian prior on the background parameter with a mean centered on the expected value and a standard deviation of 5%. The method can be easily generalized in the case of spatially varying background by supplying a background map, whose value will be used as a prior for the expectation values of B i in Eq. 5.
In Fig. 2 we show a test of profile reconstruction on a simulated profile of a compact source with known background and high statistics (10,000 input source counts). From the true profile, we created a spherically symmetric image of the source, which was then convolved in 2D with the eROSITA PSF. A Poisson realization of the convolved image was generated, and our reconstruction and optimization technique was applied to the simulated data. The comparison between the reconstructed and true profiles (bottom panel) shows that our technique is able to reproduce the true profile shape all the way to the center with differences of at most 5% in case the true PSF and background is known.

Density profile and mass proxy estimation
Once the optimization has been performed and the output samples of the parameters γ have been estimated, the projected and 3D profiles can be estimated using Eq. 1, given the analytical conversion Eq. 4. The conversion between count rate and emissivity can be estimated in a standard way by computing the cooling function Λ(T, Z) in the energy band of interest with a plasma emission code such as APEC (Smith et al. 2001) or SPEX (Kaastra et al. 2013). For soft X-ray bands (e.g. [0.5 − 2] keV), the cooling function is nearly independent of temperature and metallicity as long as the gas temperature exceeds ∼ 3 keV. See for instance Eckert et al. (2016) for a detailed description of the process.
The core-excised luminosity in the radial range [r 1 , r 2 ] can then be evaluated by integrating the output profile,  Rec/True with n e (r), n p (r) the number densities of electrons and protons, respectively. The emission measure is proportional to the surface brightness with a proportionality constant C that depends on the telescope's effective area. Given that EM(r) is written as a linear combination of basis functions (Eq. 1), L X,ce can itself be written as a linear combination with coefficients L p proportional to γ p . The integral of the basis functions {φ p } can be written analytically as The gas density profile can be reconstructed by converting the 2D emissivity profile into 3D (Eq. 1), with α p , Φ p as defined in Sect. 2.1 and µ ep = n e /n p ≈ 1.17 the ratio of the number density of electrons to that of protons in a fully ionized plasma. The posterior distribution of M gas can also be obtained by integrating Eq. 12 over the volume (for details see Eckert et al. 2019).

MONTE CARLO SIMULATIONS
To validate the procedure, we performed two sets of Monte Carlo simulations and quantified the ability of our algorithm to recover the core-excised luminosity, gas mass and gas density profile of galaxy clusters even in the low-count and resolution-limited regime, as will be the case for most sources detected in the course of the eROSITA all-sky survey. First, we start from the simple case of spherical clusters following a beta-model (Cavaliere & Fusco-Femiano 1976) and a uniform background. Second, we simulate real morphologies determined from deep X-ray images of a large set of clusters, and include a realistic AGN distribution. In both cases, we use SIXTE (Dauser et al. 2019) to generate realistic eROSITA simulations including a wide variety of instrumental effects.
3.1. Beta-model simulations As has been done in numerous previous papers (e.g. Pacaud et al. 2006;Clerc et al. 2012Clerc et al. , 2018Käfer et al. 2019a), we start by running synthetic simulations of individual systems with a gas distribution following a single spherically symmetric beta-model (Cavaliere & Fusco-Femiano 1976), We fix β to the value of 2/3 usually measured in local clusters (e.g. Mohr et al. 1999;Chen et al. 2007;Käfer et al. 2019b). We then vary the core radius r c in the range 10-60 arcsec and the central normalization S 0 , assuming R 500 = 5r c . We simulate eROSITA observations using SIXTE for a single, uniform exposure time of 3 ks similar to the median exposure time of the final all-sky survey and the field-of-view averaged PSF and a uniform background (Clerc et al. 2018), assuming a constant gas temperature. From the generated event files we extract photon images in the [0.5-2] keV band with a pixel size of 4 × 4 arcsec and the corresponding exposure maps including correction for vignetting.

Core-excised luminosity estimation
We accumulate the photon counts from the simulated maps in circular annuli at maximum resolution (4 arcsec). We reconstruct the profile following the procedure devised in Sect. 2. We extract the profiles out to 10 arcmin from the cluster center, i.e. always well beyond R 500 . This allows a robust determination of the background level outside of the cluster. The parameters describing the source and the background are fitted jointly, such that the output profile is marginalized over the uncertainty in the local background level. We model the background as a constant in Eq. 5 and set a Gaussian prior centered on the true background value and a standard deviation σ = 0.05, i.e. we assume that the local background value can be determined a priori with an accuracy of 5%. We build a PSF mixing matrix using the method described in Sect. 2.3 and then optimize the problem using the NUTS algorithm as described in Sect. 2.4. We then compute the integrated coreexcised count rate for each step in the output chains, and determine the median and 68% confidence interval from the posterior distribution of values.
In Fig. 3 we show the comparison between the true and the reconstructed core-excised count rates in the [0.2 − 0.8]R 500 radial range for 1,000 beta-model simulations. Each point is the median of the posterior distribution for a single realization. For clarity, the results are color coded by the number of total source counts in the simulated profiles. We can see that the reconstructed count rates closely follow the input parameters, even in the low count rate regime. The dispersion of the points around the one-to-one line decreases with increasing number of counts, as expected for purely statistical uncertainties.

Bias and uncertainties
The idealized beta-model simulations can also be used to calibrate the level of bias introduced by the reconstruction technique and to test the accuracy of the reconstructed error bars. In the left-hand panel of Fig. 4 we show the deviations from the one-to-one relation, defined as the log of the ratio of true to reconstructed count rates. We group the data in bins of true count rate, and measure the median and dispersion of the points in each subset. Our procedure is able to measure the core-excised count rate in the idealized simulations with no measurable bias (always better than 5%) and a dispersion that closely follows the input count rate.
In the right-hand panel we compare the error bars determined by our algorithm (split into upper and lower 1-σ uncertainties) with the dispersion of the points around the median, as determined in the left-hand panel of the figure. The data points are plotted against the number of true core-excised counts N ce ([0.2 − 0.8]R 500 ) of each realization. We find that the error bars calculated from the posterior distributions closely match the dispersion of the data points, indicating that our error bars are accurate. For comparison, we also show the expectation from pure Poisson statistics, in which case the variance should scale as N ce . We can see that the error bars in the reconstructed count rate always exceed the Poisson expectation, which is unsurprising given that deconvolution from the PSF introduces an additional level of complexity. The error bars obtained with our method are increased by ∼ 30% with respect to the case of an ideal PSF. Therefore, we conclude that our deprojection and PSF deconvolution method is able to measure accurately the core-excised flux of eROSITA clusters even in the low-count regime, albeit with a moderate increase in the statistical errors.
In practice, eROSITA is not expected to be able to distin- guish extended sources from point-like sources for less than ∼ 40 source counts (Pillepich et al. 2018;Grandis et al. 2019;Käfer et al. 2019a). The threshold of 40 counts corresponds to a relative uncertainty of ∼ 30% on the core-excised count rate. This number should be put in perspective of the expected scatter in the L X − M relation. For instance, Maughan et al. (2012) measure a scatter 67% for the L X −T relation, which decreases to 29% when excising the central regions (R < 0.15R 500 ). The scatter may be decreased even further when excising the core out to 0.2R 500 , given that the median self-similar scaled profiles of the CC and NCC populations are indistinguishable beyond that radius (Käfer et al. 2019b;Ghirardini et al. 2019). Mantz et al. (2018) suggest that the scatter of the relation between core-excised L X and M 500 could be as low as 10% for massive clusters. Therefore, the lower scatter of the L X,ce − M relation more than makes up for the modest increase in statistical uncertainties compared to the total integrated L X .
3.2. Mock eROSITA field To extend our tests beyond the idealized case of a single beta-model, we apply our technique to a simulated field tuned to reproduce the expected properties of the eROSITA Final Equatorial Depth (eFEDS) survey, which was observed by eROSITA during the performance verification (PV) phase. eFEDS targets a 120 deg 2 field at the final depth of the eROSITA survey and serves as a test bed for the expected performance of the instrument. Most of the survey area lies within the footprint of the Subaru HSC-SSP survey (Miyazaki et al. 2018), which will readily provide an optical identification of the detected clusters, precise photometric redshift estimates thanks to the multi-color nature of HSC-SSP, and weak lensing mass calibration. The clusters in the simulated eFEDS field are drawn from real X-ray images of known clusters, thus the simulation encompasses a wide variety of radial shapes and morphologies. Here we assess the performance of our algorithm on the simulated eFEDS field.

The eFEDS mock
The creation of the mock cluster sample follows the procedure outlined in Grandis et al. (2019). We draw a cluster catalogue with halo masses M 500c and redshifts from the Tinker et al. (2008) halo mass function and distribute the halos uniformly over the eFEDS footprint. For each cluster, we then assign luminosities in the rest frame [0.5 − 2] keV band and temperatures using the scaling relation and the scatter around that relation reported by Bulbul et al. (2019) for 59 XMM-Newton follow up observations of SPT-SZ selected clusters. We assume that the scatter in luminosity and the scatter in temperature are uncorrelated. The metallicities of the clusters are set to Z = 0.3 Z (McDonald et al. 2016). We refer to Grandis et al. (2019) for more details on the mock catalogue creation.
The cluster spectrum is predicted with an APEC model, using the aforementioned temperature and metallicity, as well as the neutral hydrogen column density at the cluster position. The cluster 2D surface brightness is simulated by randomly selecting one of 83 Chandra images of SPT-SZ selected clusters (Sanders et al. 2018) or 37 XMM-Newton images of nearby lower mass (0.7 to 7 × 10 14 M ) objects drawn from the eeHIFLUGCS sample (Ramos-Ceja et al. 2019) which fit within the XMM-Newton field of view. Masses for these input systems were assumed to be the value from the SPT catalog or using XMM-Newton (Piffaretti et al. 2011), as appropriate. The images of these clusters were adaptively smoothed to have a signal to noise ratio of 8-10 within the smoothing kernel, also subtracting background. As the input images for the simulation had to be zero or positive, negative values in the images were replaced by zeros and a tapering was applied to the cluster image at larger radii to help remove sharp edges and counteract the additional flux redistribution towards the outskirts that the zeroing would create.
For a mock cluster with a mass of greater than 10 14.55 M , a random input image was taken from those real clusters with masses greater than that threshold in an appropriate redshift bin (< 0.4, 0.4-0.6, 0.6-0.8, 0.8-1.0 and > 1.0). For mock cluster masses lower than this, a random image of a cluster in mass bins of 10 14.40 -10 14.55 , 10 14.22 -10 14.40 and < 10 14.22 M was chosen. The image was scaled in size by the ratio of R 500c of the catalog cluster in arcmin compared to the real cluster and randomly rotated. The flux in the simulation was set to be that in the mock catalog. Correlations between the morphological properties of the clusters and the scatter in luminosity and temperature are neglected.
Active galactic nuclei (AGN) are added to the mock catalog following the approach presented in Comparat et al. (2019). The position and brightness of simulated AGN follows the expected luminosity function and halo occupation distribution of the AGN population, such that the simulated point-like sources follow realistic brightness and clustering properties.

Reconstruction
The full eFEDS field was simulated with SIXTE using the same attitude file as for the actual survey. The simulated data were processed using the standard eROSITA software package (eSASS). From the simulated event files, we extracted mock eROSITA images in the [0.5-2] keV band combining all 7 telescope modules. A merged exposure map was computed using the eSASS task expmap. We used the standard eSASS detection algorithm (Clerc et al. 2018) to detect pointlike sources in the field and prepared a point-source mask to excise circular regions around each point source, with an exclusion radius that is proportional to the logarithm of the detected count rate and a minimum radius of 15 arcsec.
From the original eFEDS mock catalogue, we performed a simple cut on the true, total flux and selected all the halos with f X > 3 × 10 −14 erg/s/cm 2 , which yields a total of 445 simulated sources. In each case, we masked the surrounding point sources and extracted the count profile in circular annuli, as was done for the idealized beta-model simulations. We fixed the redshift of each cluster to the catalog value and ran the reconstruction of the surface brightness profiles, Xray luminosity (total and core excised), gas density profile, and integrated gas mass.
In Fig. 5 we show the result of the reconstruction of the X-ray luminosity within R 500 . As a first step, we assume that the value of R 500 is known and fixed to the catalog value. We show the results both for full, integrated luminosity (left-hand panel) and for the core-excised luminosity in the [0.2 − 1]R 500 range (right-hand panel). The data points are color coded by the number of reconstructed net source photons. We can see that the both the total and the core-excised luminosities are recovered very accurately by our method, down to the detection threshold of ∼ 50 counts. Given that the sample selection was performed on the true simulated flux, a number of systems actually exhibit a very low number of counts (< 50) because their exposure time is low. These systems would not be detected in the real survey. While their recovered fluxes are not necessarily biased (see Fig. 4), they exhibit a large statistical scatter. Thus, in Fig. 5 we display only the clusters for which the true number of source counts is at least 30, which still encompasses all the detected systems. Fitting the relations with a power law with free intrinsic scatter, we find that the luminosities are recovered with no measurable bias and an intrinsic scatter σ ln L X,tot = 0.12 ± 0.02 and σ ln L X,ce = 0.21 ± 0.04. Therefore, excising the central regions moderately increases the scatter in the reconstruction of the true cluster L X .
Around the detection limit the relative uncertainty in L X,ce is about 30% and it decreases for brighter sources. For the entire population of 445 sources, the relative uncertainty can be approximately described by a power law with an index of −0.5 as expected in the case of Poisson statistics, From the reconstructed profiles we can readily reconstruct the gas density profiles (Eq. 12) and the gas mass by integrating the gas density over the volume. Since the gas density is a slow function of the X-ray surface brightness, the posterior uncertainties in M gas are smaller than on L X,ce . Again approximating the error budget for M gas as a power law of the number of source counts, we obtain the relation i.e. around the detection limit M gas can be reconstructed with an uncertainty of about 20%. The relative uncertainty in the gas mass is lower than in the core-excised luminosity since the gas density is proportional to the square root of the emissivity.

Mass estimation
Assuming an externally calibrated scaling relation between core-excised L X and total mass, we can use the core-excised luminosity as a low-scatter proxy to reconstruct the mass of the detected galaxy clusters. To do so, we developed an iterative procedure based on the reconstructed brightness profile deconvolved from PSF effects. Our input catalogue for the eFEDS mock encodes an L X,ce − M 500 relation which can be described as L X,ce = 3.6 × 10 42 M 500 10 14 M 2 E(z) 7/3 ergs/s with E(z) = Ω m (1 + z) 3 + Ω Λ used to correct for the selfsimilar evolution of the L X − M relation. The input log-normal intrinsic scatter of the relation is σ ln M = 0.18 at fixed L X,ce . Given the lower statistical uncertainty on M gas compared to L X,ce , a similar procedure can be put together with the gas mass as a mass proxy, potentially leading to more accurate results. However, for low cluster temperatures (T 3 keV) the cooling function Λ(T, Z) is a strong function of metal abundance, which is impossible to measure from survey data in most cases. This introduces an additional source of systematic uncertainty in the reconstruction of M gas , whereas L X,ce is much less affected by this potential bias. For hightemperature clusters (T > 3 keV) M gas can potentially be used as a highly effective proxy for the cluster mass.
Assuming that the scaling relation and the cluster redshift are known, we set up an iterative procedure to determine the aperture within which L X,ce should be integrated (i.e. R 500 in angular coordinates) and eventually the cluster mass. We start by making a guess for the value of R 500 , which we take to be 700 kpc. We integrate the count rate and then the luminosity within the corresponding aperture, and go through the scaling relation (Eq. 16) to get a new estimate of M 500 and hence of R 500 . We repeat the procedure until it converges, i.e. until the difference of R 500 values between two iterations is less than 1 kpc. Convergence usually occurs after ∼ 5 iterations.
To validate this procedure, we compared the values of L X,ce computed through this technique with the values recovered inside the true R 500 aperture as in Sect. 3.2.1. We found that our iterative procedure recovers the correct aperture very accurately, with typical differences in L X,ce of less than 5% and a median ratio L iter /L 500 = 1.01. Thus, the uncertainty induced by the unknown aperture of the detected systems is much smaller than the statistical uncertainty and the L X − M scatter.
Finally, we compared the values of M 500 determined through our iterative procedure with the mass given in the input catalogue. The results are presented in Fig. 6, where we show the reconstructed mass as a function of the true mass color coded by the number of source counts for the 374 clusters of the eFEDS mock with N c > 30. We omit the systems with N c < 30 as they have very large uncertainty and are substantially below the detection limit of the survey. We can immediately see that the reconstructed mass traces well the true mass, with a slope and a normalization close to unity. Fitting the relation with a linear function (i.e. fixing the slope to unity), we measure M 500,rec /M 500,T rue = 1.04 ± 0.03 and an intrinsic scatter σ ln M 500,rec = 0.21 ± 0.02. This value is only mildly larger than the input intrinsic scatter of the simulation (σ ln M = 0.18). Therefore, we can conclude that even in the case of a realistic population of simulated clusters spanning a wide variety of shapes and redshifts, the scatter of the output relation at fixed mass is dominated by the intrinsic scatter of the L X,ce − M 500 relation.
We note that slope of the L X,ce − M relation is relatively steep. Therefore, deviations from the true value of L X,ce imply smaller deviations from the true M 500 , and the scatter introduced by the reconstruction of L X,ce is subdominant with respect to the intrinsic scatter of the L X,ce − M 500 relation.

Implications on mass estimation
Our results on the observational scatter in the reconstruction of the cluster's gas mass (Eq. 14 and core-excised luminosities (Eq. 15) indicate that for all object in an eROSITA selected cluster sample, low scatter mass proxies can be derived. More specifically, assuming the characteristic limiting photon count of N c > 50 for selection by eROSITA (Pillepich et al. 2012(Pillepich et al. , 2018Clerc et al. 2018;Grandis et al. 2019), the observational scatter on the gas mass will be bound to be < 20% for all selected clusters. Such a precision on the derived mass is comparable to what is achieved for the SZ effect. For instance, in the particular case of the SPT survey, the default mass proxy is the detection significance ζ (for instance Vanderlinde et al. 2010;Bleem et al. 2015;Bocquet et al. 2019), which is related to, but not strictly equivalent with, the integrated Compton parameter Y. Its observational uncertainty is given by a Gaussian with variance 1, which for the typical selection criterion > 5 yields a fractional observational uncertainty < 20%. The intrinsic scatter in mass from SZ derived masses is typically ∼ 0.11 (see, e.g. Bocquet et al. 2019), while the typical mass scatter from gas masses is ∼ 0.08 (see, e.g. Bulbul et al. 2019). The method presented in this paper thus provides a mass proxy for all eROSITA selected clusters with a total scatter (observational + intrinsic) at fixed mass which is comparable to the total mass scatter for the mass proxies of SZ selected cluster samples. X-ray derived masses for eROSITA selected clusters will thus be at least as precise as SZ derived masses for SPT selected clusters.

CONCLUSION
In this paper, we presented a novel method to reconstruct the morphological and photometric properties of galaxy clus-ters from X-ray survey data in the low count rate regime. Our method is based on the linear decomposition of the observed profile onto a basis of functions and makes use of the linear nature of the projection kernel to recover the 3D profiles. We correct for PSF effects by convolving the 2D model with a mixing matrix that can be computed numerically for any PSF shape. Our method is readily applicable to the upcoming eROSITA survey, which is now ongoing.
We validated our method using large standardized sets of simulations, both in the idealized case (beta-model) and in a more realistic case based on real observed clusters. In both cases, we find that our method can recover the input properties with no evidence for biases down to the survey detection threshold of ∼ 50 counts. This allows us to derive in a robust manner core-excised X-ray luminosity and the gas mass, which are known to be low-scatter proxies of the total mass. Unlike the full integrated X-ray luminosity, they are relatively unaffected by the properties of cluster cores, which are strongly affected by baryonic physics.
Assuming that an accurate external scaling relation between cluster mass and mass proxies is available, we set up an iterative procedure to determine the total mass from our lowscatter mass proxies. In the specific case of the core-excised luminosity as a mass proxy, our method can recover the true cluster mass with a very modest increase in intrinsic scatter (from 0.18 to 0.21 in our simulation). This is to be contrasted with the scatter of the full integrated L X at fix mass, which is known to be ∼ 0.6. Therefore, excising cluster cores clearly improves the accuracy of the mass estimates compared to the total integrated flux and renders X-ray mass proxies comparable in precision to SZ mass proxies. For easy replication of our results, we distribute our code in the form of the public Python package pyproffit 2 , which is available on Github and PyPI.