Bridging the Gap Between Simply Parametrized and Free-Form Pixelated Models of Galaxy Lenses: The Case of WFI 2033-4723 Quad

We study the radial and azimuthal mass distribution of the lensing galaxy in WFI2033-4723. Mindful of the fact that modeling results depend on modeling assumptions, we examine two very different recent models: simply parametrized (SP) models from the H0LiCOW collaboration, and pixelated free-form (FF) GLASS models. In addition, we fit our own models which are a compromise between the astrophysical grounding of SP, and the flexibility of FF approaches. Our models consist of two offset parametric mass components, and generate many solutions, all fitting the quasar point image data. Among other results, we show that to reproduce point image properties the lensing mass must be lopsided, but the origin of this asymmetry can reside in the main lens plane or along the line of sight. We also show that there is a degeneracy between the slope of the density profile and the magnitude of external shear, and that the models from various modeling approaches are connected not by the mass sheet degeneracy, but by a more generalized transformation. Finally, we discuss interpretation degeneracy which afflicts all mass modeling: inability to correctly assign mass to the main lensing galaxy vs. nearby galaxies or line of sight structures. While this may not be a problem for the determination of $H_0$, interpretation degeneracy may become a major issue for the detailed study of galaxy structure.


INTRODUCTION
Gravitational lensing is the best way to obtain the detailed projected mass distribution of massive galaxies at intermediate redshifts, acting as lenses for background sources. Two main types of lens mass models are used in the literature, simply parametrized (SP) and free-form (FF).
The motivation for using SP models is that the central-few percent of the virial radius-regions of galaxies are mostly in equilibrium, as evidenced by observations and simulations, and so can be well approximated by power law density profiles, or other simple parametric functions. The influence of outside mass is parametrized by external shear, added to the lensing potential. (Shear is similar to tides and has no associated mass.) The motivation for using FF modeling approach is that SP models are likely too idealized to represent real galaxies, and may not capture the full range of mass features influencing image properties. FF methods do not use simple parametric profiles and instead allow lensed image properties considerable freedom in determining the mass distribution of the lensing galaxy. The drawback bern.barr@gmail.com llrw@umn.edu is that these can allow unphysical models to be generated and become part of the ensemble average. Pix-eLens (Saha & Williams 2004) and glass (Coles et al. 2014(Coles et al. , 2018 pixellate the lens plane and let each mass pixel vary independently, with a few weak superimposed constraints to discourage unphysical mass models. Most published works to date assume SP forms as the galaxy lensing model. The most common models are elliptical power law density profiles with slopes near 'isothermal', or two co-centered mass components, for example, Sersic (Sersic 1968) to represent baryons, and NFW (Navarro et al. 1997) to represent dark matter.
Some years ago it became clear that SP mass models cannot account for the observed image flux ratios in many multiply imaged quasar systems (Falco et al. 1997). At optical wavelengths, image fluxes are most likely affected by microlensing by stars in the main lensing galaxy. Since microlensing operates on much smaller spatial and mass scales, it can be studied separately, and is usually disregarded when the focus is on the galaxy-scale macromodel. Flux ratios at longer wavelengths are not subject to microlensing, but can be affected by dark matter substructure (Mao & Schneider 1998;Metcalf & Madau 2001;Chiba 2002), or other mass features in the macro-model. To explore the latter, attempts were made to use smoothly perturbed macromodels (Trotter et al. 2000;Evans & Witt 2003). While in some quads these models could account for flux anomalies, they often resulted in unphysical projected iso-density contour shapes (Biggs et al. 2004). The substructure predicted by the Cold Dark Matter Universe model with the Cosmological Constant (ΛCDM), or line of sight (LoS) structure (Gilman et al. 2019;Nierenberg et al. 2020), and, in some cases, edge-on disks (Hsueh et al. 2016) are probably responsible for some, but not all flux anomalies, and there remains a tension between predictions and observations (Xu et al. 2015;Hsueh et al. 2018).
Given the shortfalls of SP and FF modeling strategies, a different approach may be warranted. (Even though no approach is perfect!) In this paper we use macromodels which are, in a sense, a compromise between SP and FF strategies. Similar to SP models, our models consist of simply parametrized mass components, each with its own independently varying normalization, core, density slope, ellipticity and position angle. Unlike the SP models, the centers of our two components are allowed to be offset. A superposition of such components usually results in lopsided mass distributions, but the center of mass of our models is required to coincide with the center of light to within astrometric precision, which is not the case for SP models.
Because the number of our free parameters exceeds the number of constraints, our method, like FF, produces a large number of solutions, instead of just one macromodel with statistical uncertainties, as is the case with standard SP. The reason for multiple solutions is the existence of lensing degeneracies, i.e., there is no guarantee of uniqueness. In fact, as was pointed out by Wagner (2019), lensing observables provide us with local lens properties only, the rest we obtain by filling in the information gaps with our assumptions, or non-lensing observables. So to obtain robust model uncertainties, one needs a wider range of models, which implies a larger number of adjustable parameters.
Thus, we borrow from both the SP and FF approaches. From SP we take the simple shape of the building blocks of galaxy structure, which are physically motivated. From FF we take the idea of multitude of solutions, all reproducing image observables within observational error, and therefore connected by lensing degeneracies.
The motivation for the two offset component model comes from recent work. Based on analysis of a population of galaxy quads, Gomer & Williams (2018) conclude that some fraction of lens galaxies needs to be lopsided. Nightingale et al. (2019) perform modeling of a handful of individual quad lenses, and conclude that off-centered components are required. Williams & Zegeye (2020) find that lopsided mass maps can bring quadruply imaged Supernova Type Ia iPTF16geu into agreement with microlensing constraints, while still allowing the galaxy profile to be steep at large radii, and have coincident light and mass centroids.
In this paper we model WFI 2033-4723 (hereafter referred to as WFI2033), a quad lens first reported by Morgan et al. (2004). Our motivation for choosing it is that it is one of six H0LiCOW systems, and its quasar point images and lens light center are poorly fit by power law and composite models (Rusu et al. 2020).
The recent galaxy-lens modeling literature is heavily dominated by the goal of measuring H 0 . In this paper we are interested in placing constraints on the galaxy mass distribution. We elaborate on our choice of constraints in Section 2, and in Section 3 we describe our fitting procedure, which generates models that fit quasar image data to observational precision. The paper's goal is to determine what can currently be said about the structure of the lens galaxy in WFI2033. Two different aspects of this goal are addressed in Sections 4 and 5. In Section 4 we present 3 sets of models, built with 3 different sets of constraints, which allows us to evaluate their respective model constraining power. In Section 5 we compare our models with published models, identifying similarities and differences. Section 6 summarises our findings.

WFI 2033-4723 AND DATA CONSTRAINTS
WFI2033 is a quadruply imaged quasar with source and lens redshifts of z src = 1.662 and z lens = 0.6575 (Sluse et al. 2019). We use image-positional data from COSMOGRAIL X (Sluse et al. 2012), and time delay data from COSMOGRAIL XVIII (Bonvin et al. 2019). While most current papers on galaxy scale lenses focus on H 0 , our interest is in the lens plane mass distribution. Because we do not want to bias our results with a fixed value of H 0 , we do not use the actual time delays, but only their ratios, which eliminates the dependence on H 0 . Thus, we define the two independent ratios r 1 = ∆tB−A2 ∆tB−A1 = 1.0304 ± 0.1329 and r 2 = ∆tB−C ∆tB−A1 = 1.6409 ± 0.1243, which are used to fit our model parameters to time delay data. The dependence on other cosmological parameters, mostly Ω's, remains, but it is much weaker.
One of the sets of models we present uses three image flux ratios, obtained from narrow line region of the source quasar, which is unaffected by stellar microlensing (Nierenberg et al. 2020;Gilman et al. 2020b), and hence probes either ΛCDM substructure, or macro-scale potential of the galaxy, or a combination of the two. The three flux ratios are, f 1 = F B /F A1 = 0.50 ± 0.050, f 2 = F A2 /F A1 = 0.65 ± 0.055, and f 3 = F C /F A1 = 0.53 ± 0.048.
In addition to constraints from the quasar point images, many papers use the lensed image of the extended galaxy hosting the source quasar, i.e., the extended ring. Also, to constrain the very central region, well within the image ring, analyses often use velocity dispersion of the lensing galaxy. In the following subsections we argue that not all data is of equal value for modeling, since some have intrinsically lower quality observations, and others are more reliant on prior assumptions.

Constraints from point images
Quasar image positions require no additional assumptions or modeling priors: point images arise from the same very small, practically point-like location in the source plane. With HST observations, these have astrometric accuracy of ∼ 0.003". In the absence of microlensing, quasar image time delay also require no priors. Fortunately, the effect of microlensing is small, similar or smaller in magnitude to the time delay measurement uncertainties (Rusu et al. 2020). So the use of time delays Physical quasar sizes in the mid-IR to radio wavelengths, or in the narrow line regions are too big to be influenced by stellar microlensing. Instead, they are affected by anything from ΛCDM substructure and LoS halos, to macro-scale mass features in the lens galaxy. From the data alone it is not possible to differentiate between these possibilities. The method adopted in the literature is to assume a form for the lensing galaxy, either analytical or extracted from simulations, and ascribe deviations from model predictions to small-scale substructure. The reliability of this approach depends on how well the model describes the real lens on length and mass scales that are a fraction of the Einstein radius and typical κ values of the main lens.
In this paper we do three levels of mass reconstruction, each incorporating successively more constraints. Our Group A models use image positions only, Group B models use both image positions and time delay ratios, and finally Group C further adds image flux ratios to image positions and time delay ratios; see Table 1. In other words, Groups A and B reconstructions assume that flux ratios are due to small-scale substructure and therefore cannot be used to constrain the macro models. Group C reconstructions assume that flux ratios are the result of perturbations in the macro-model, and so are included as a constraint.

Constraints from extended rings
The galaxies hosting the source quasars are lensed into extended rings that cover hundreds or even thousands of HST CCD pixels, and sample the lensing potential over a considerable portion of the lens plane. However: (i) each pixel is 0.03"-0.05", which is worse than the astrometric precision of point images by a factor of 10 in linear scale, or a factor of 100 in area, and (ii) their surface brightness is usually low, and has to be separated from the light of the lensing galaxy, leading to low signal-to-noise ratio. Furthermore, (iii) a lensing system comprises a lens and a source, and the two are somewhat interdependent in modeling, leading to lens-source degeneracy. Since we do not know the true appearance of the extended source (unlike in the case of a point source), complexity in the structure of the lens can to some degree be attributed to the source, or vise versa.
Usefulness of rings in modeling has been debated in the literature. Kochanek et al. (2001) was one of the first papers to analyse the information contained in the extended rings, and argue that these could be very useful, even "revolutionary" in constraining lens mass models and breaking degeneracies. However, their modeling made simplifying assumptions about the shape of the lens and the source. In particular, the isophotes of the central region of the source were taken to be ellipsoidal, while the actual lens-reconstructed sources have non-negligible deviations from ellipticity (see images of reconstructed sources in Vegetti et al. 2010aVegetti et al. ,b, 2012. Saha & Williams (2001) showed that the 4 point images of a quad lens already determine the overall structure of the extended host galaxy image. Many other papers, for example, the H0LiCOW collaboration  argue that the details of the ring structure help constrain the fit. Walls & Williams (2018), using a somewhat optimistic set up to represent the ring in synthetic lenses-an array of point images-conclude that the ring provides useful constraints in about half of the cases. Gomer & Williams (2021), also using an array of point images in 10 different synthetic lens types, showed that including rings in the estimation of H 0 improved it for 2 out of 9 simpler lens types, but did not improve it for the set of lenses with most realistic mass distributions. The utility of rings can vary between systems, or even analyses. Mörtsell et al. (2020) found that in iPTF16geu the extended ring did not help in modeling, while Williams & Zegeye (2020) found that some mass models did differ in their predicted rings, if sufficient resolution and high S/N were available. Denzel et al. (2021) used a different method to incorporate extended ring information. After generating 1000 individual glass models, they selected a subset that reproduced the extended rings reasonably well. However, the predicted H 0 distribution from this subset did not differ from that of the whole sample.

Constraints from lens galaxy kinematics
Kinematics of the lens galaxy, namely the central velocity dispersion has been used extensively in H0LiCOW papers. However, the use of velocity dispersion relies on assumptions. The commonly used Jeans analysis assumes geometrical sphericity, with a velocity anisotropy profile form parametrized by a scale length. The effect of the assumptions on kinematics can be gauged from the results of Birrer et al. (2016) in the case of the quad RXJ 1131-1231. They find that the choice of priors on velocity anisotropy impacts their cosmological inference; they derive H 0 = 86.6 +6.8 −6.9 km s −1 vs. H 0 = 74.5 +8.0 −7.8 km s −1 depending on the prior. While this does not say what it would imply for the study of the lens galaxy mass distribution, it is reasonable to assume that the results will be affected. Gomer & Williams (2020) study the effect of including kinematics in their synthetic two-offsetcomponent mass models, which are then fitted with single power laws to emulate common practice. They find that Jeans modeling pulls the best fit lens models in the direction of steeper density profiles than those of the true lens. They attribute this bias to the fact that the fitted models can never be an exact match to the projected mass distribution of the lens. Even though this study focused on H 0 , its results suggest that the inference on galaxy structure depends on the model used for kinematics.

Constraints from extended rings and kinematics in simple lenses
The studies discussed above use either real lens systems, or synthetic ones with two mass components. The cleanest test of the usefulness of extended rings and kinematics is to use very simple mock lens mass distributions, and fit them with exactly the same type of mass models. That way, one is eliminating the possibility that unknown complexities in the lens mass distribution are  Ding et al. 2020) is very useful for that purpose. The goal of TDLMC was to estimate H 0 , not to study the galaxy structure, but their conclusions are still relevant. The analysis was completely blinded.
Of particular interest are Rung 1 and 2 of TDLMC, where the challenge lenses, created by the "Evil" Team, and unknown to the modelers, or "Good" Teams, were elliptical power laws with density slope within 0.1 of isothermal, i.e., ρ 3D ∝ r −2±0.1 , magnitude of the external shear was γ ≤ 0.05, and external constant κ ≤ |0.025|.
As it turned out, three of the Good Teams used the same parametric assumptions to model the lenses as the Evil Team's design of the true ones: elliptical power laws with external shear. Furthermore, the method of calculating the velocity dispersion using the spherical Jeans equation, and the parametrization of the velocity anisotropy were exactly the same for the Evil Team as well the Good Teams that used kinematics.
The Evil Team set the goals for the recovered accuracy and precision of H 0 at 2% and 6%, respectively. The results are summarized in Table 2, where the first three columns give the names of the Good Teams and the data they used in their modeling, in addition to that from the quasar point images. The table shows the magnitude of the accuracy for Rung 1 and 2 reconstructions, and highlights in bold values that were within the stated goal of the Challenge. Columns labeled "tally" give n/m, where m is the total number of reconstructions submitted by that Good Team, and n is the number of those that satisfied the accuracy and precision goal.
Overall, Team EPFL, which used both extended ring and lens velocity dispersion information performed the best, but they failed to meet the goal of the simplest Rung 1. The other two Good Teams, one not using kinematics, the other not using extended ring, did not do much worse. (In Rung 3, which used lenses extracted from Illustris simulation, and suffered from some issues, neither EPFL not Student-T met the goal of the Challenge.) Based on Rungs 1 and 2 we conclude that even if the true and the fitted lens density profiles have the same simple functional form, knowledge of the lens velocity dispersion and/or the extended ring does not always guarantee accuracy or precision, or give a team a definitive advantage. In the real universe, the true mass distribution is not known, complicating the situation, and possibly rendering kinematics and rings even less useful. However, in the future, spatially well resolved kinematic data may prove to be valuable.
Finally, in the case of WFI2033 there is possible evidence that the assumptions going into the kinematics modeling and/or the extended ring are in tension with the astrometric data from quasar point sources. The best fitting models of Rusu et al. (2020), who use kinematics and extended ring, have image rms of 0.01" − 0.025", compared to HST astrometric precision of ∼ 0.003". At the same time their modeled lens mass center is offset from the observed light center by 0.02" − 0.03", while the astrometric precision is ten times better, 0.002".
Therefore we use image positions, time delays, and in separate modeling also include flux ratios that are immune to microlensing. We do not use the extended ring or stellar kinematics.

LENS PLANE MASS MODELS
3.1. Galaxy model As discussed above, the galaxy model presented in Rusu et al. (2020) does not account well for the positions of the four quasar images or the center of the lens galaxy. Our motivation for using a more complex model is to fit the images better, and to do so with models whose mass and light centers coincide.
As argued in the Introduction, there is evidence that lensing galaxies' mass distribution may be lopsided, which leads us to use two superimposed, but off-centered mass components. Both are individually modelled by alphapot analytical lensing potential (Keeton 2011), The five free parameters in eq. (1) capture the shape, size, orientation, and slope of each component: b fixes the mass normalization, s is the core radius size, q and K together define orientation and eccentricity, and α is the potential profile slope. The position angle θ of each alphapot component is given by and the axis ratio, Q is which depends on both q and K, through eq. (2). Though the models we use are simply parametrized, the inclusion of non-zero offset between the two mass components makes these considerably more flexible than the standard parametric models, especially in the azimuthal direction. In addition, HST data reveals that the lens is part of a larger galaxy group at z grp = 0.6588, with velocity dispersion σ = 500 ± 80 km s −1 , and with at least 6 member galaxies within 2 arcseconds of the main lens (Wilson et al. 2016). Of the nearby perturbers, other authors have concluded that G2, a bright galaxy at z = 0.7450 with log M * ∼ 11.15, is the only galaxy with significant impact on the lens modeling (Rusu et al. 2020). Our reconstructions assume a single lens plane, which contains the main lensing galaxy, as well as G2. Had we placed G2 at its actual, somewhat higher redshift, its effective location on the plane of the sky would be slightly different, and its surface mass density would be increased by about 7%. Galaxy G2 looks round, so we model it as a circularly symmetric alphapot potential with an isothermal density slope, a negligibly small core, and Einstein radius of 1", very similar to the estimate of Rusu et al. (2020). These authors allow their modeling procedure to optimize the ellipticity and position angle of G2. In Section 5.5 we will discuss what effect the differences in how to model G2 have on the conclusions.
The main lensing galaxy in WFI2033 lives in a rather complex local environment: there are two nearby galaxy groups (Sluse et al. 2019), so G2 is not the only nearby perturber. In this paper we account for the combined influence of other galaxies with a single external shear. The magnitude of the shear is quite large, γ ∼ 0.1 − 0.2, indicating that the perturbers have a significant influence on the lens. A single shear is only an approximation because the actual sky-projected distribution of mass due to nearby galaxies cannot be represented by a single shear axis. Quantifying the limitations of the single shear approach will be an interesting exercise, but is beyond the scope of this paper. We note that the other two models we discuss, Rusu et al. (2020) and Denzel et al. (2020), also use single shear to model the lens environment and can be directly compared to ours in that regard.

Model fitting
Having established the general form of our lens models, we proceed to find good parametric fits to observational data.
Each of the two alphapot components described by eq. (1) is specified by a set of 5 parameters. Thus, a complete mass model consists of 10 free variables and an extra coordinate pair (x off , y off ) describing the offset of the secondary with respect to the primary mass component. In addition, each model has an associated external shear magnitude and position angle γ ext and θ γ , and a coordinate pair for the source position. In total, these constitute 16 free parameters per model. To obtain good fits to observational data, we proceed in two steps. First, the model parameters are chosen randomly from reasonable ranges that restrict the ellipticity parameters from deviating far from circular and delimit the maximum core radius sizes of the compact (centered on the light center) and extended (allowed non-zero offset) mass components to at around 20% and 90%, respectively, of the smallest image radius. The limits on the offset coordinates, (x off , y off ) were chosen to be small, ±0.02", similar to the offset between mass and light centroids in the Rusu et al.
(2020) models. These ranges are shown as dashed boxes in Figure 7 in Appendix A. Although these initialization values are to some extent arbitrary, we echo the conclusion of Williams & Zegeye (2020) in that parameters are able to deviate well beyond these initial regions after further optimization in the second step (see below). Using each of these randomly generated models we then ray trace the observed images back to the source plane; models are kept or discarded depending on the degree to which their backprojected images converge on the source plane. In this first step we chose to backproject images to the source plane instead of forward lensing sources because the former is computationally cheaper than the latter. Working in the source plane at this stage is sufficient, since we are looking for approximate solutions only. Their lens plane positional rms, defined below by eq. 4, is not very good, typically χ pos 50. It is relatively easy to find many distinct fits. We generate 100 of these models. The average source plane (x s , y s ) position of the four backprojected images per model, together with the 14 lens model parameters are then used in the second step.
Next, we use the Downhill Simplex method (Nelder & Mead 1965) to find good solutions, defined by the merit function, χ 2 , which is calculated as a sum of the following contributions.
where subscripts o and m stand for observed and model predicted values. Image coordinates are (x i , y i ) for images i=1,... 4, time delay ratios are r i , for image pairs i=1, 2, flux ratios are f i , for image pairs i=1, 2, 3, and σ's are their associated uncertainties. The image-positional σ x = σ y = 0.002". The values and uncertainties on time delay ratios and flux ratios are given in Section 2.
The location of the model center of mass is denoted by (x c,m , y c,m ), with respect to the center of light of the lens, and σ c = 0.002" is the separation between the lens galaxy light centers in HST F814W and F160W filters (Rusu et al. 2020). χ 2 cen penalizes the separation between the model center of mass, (x c,m , y c,m ), and the center of light, which is set at the origin. Note that the coordinate pair (x c,m , y c,m ) is not the same as the center of the secondary mass component, but is a function of all lens model parameters.
As described in Section 2.1 and Table 1 we carry out three levels, or Groups of reconstructions. All three of our Groups use χ 2 pos and χ 2 cen . In addition, Group B uses χ 2 td , and Group C uses all four. These amount to n =10, 12, and 15 independent data constraints, for the three groups, respectively. For each case, the total χ 2 is given by the sum of the corresponding contributions divided by the number of independent data constraints n.
The Downhill Simplex search works as follows. Each of the 100 models found in the first step is thought of as a point in the 16-dimensional space spanned by our parameters. In turn, a pair of points define opposite vertices of a 17-sided simplex in the parameter space. By selecting many random pairings among the 100 models, we construct simplices which shrink and move across the  Table 1), with κ = 0.6, 0.75, 1(thick), 1.5, 2 contours shown in black. Images are ordered with respect to arrival time. Galaxy G2 is located at 235 • , measured counterclockwise from the positive x−axis. In standard ΛCDM, 1" at the redshift of the lens corresponds to 6.96 kpc. space until a certain degree of convergence is reached around a local minimum. Though not every minimum represents a good solution, this optimization allows us to find many local minima with χ ≤ 1. These correspond to models that reproduce all observables, on average, to within experimental precision. We do not find solutions with negligibly small χ. This is probably a limitation imposed by our parameterization.
We obtain 50 models with χ ≤ 1 in each of Groups A, B and C, the average convergences for which are shown in Figure 1.

DELAY RATIOS AND FLUX RATIOS
One of our goals is to determine how additional data from the point images contributes towards constraining the mass models of the lens galaxy. Figure 1 shows the mass maps for the 3 Groups of models, summarized in Table 1. The maps include the contribution from galaxy G2, located about 4" from the center, at 235 • measured counterclockwise from the positive x−axis. Including the two time delay ratios as well as image positions, i.e., comparing Groups A and B, does not appear to change the resulting maps. Measuring time delays for multiply imaged quasars is a resource-intensive task. In the case of WFI2033 it took 14 years on more than one 1-3 meter class telescopes to extract precision of 2 − 6% (Bonvin et al. 2019). These results show that at least within the context of our models increased precision may not help in further constraining the mass distribution of this galaxy. 1 Another way to compare the mass distributions is presented in Figure 2, which shows density profiles, averaged over 30 • circle arcs (or wedges, or 'pie slices') and over 50 models. These circle arcs have their apex at the galaxy center, and open up in the directions to the 4 images. In other words, we are taking the part of the density distribution that lies along the line connecting the lens center and each image. The Group A (black) and Group B (red) profiles are nearly identical, and well within 1σ of each other.
The last panel of Figure 1, Group C, shows that adding three image flux ratios does significantly impact the solutions. The most notable change is the steeper density profile slope, which is also evident in Figure 2, where the 1σ uncertainties show that Group C reconstructions are different from those of A and B. It is not entirely surprising that image magnifications can constrain profile slope. Magnification is image stretching in the radial and tangential directions. Stretching in the tangential direction depends mostly on the image distance from the lens center, while that in the radial direction depends on the profile slope. This result is also consistent with the findings of O'Riordan et al. (2019), who use circular power law lenses to show that image flux ratio is critical in determining the lens' density slope.
The 4 panels of Figure 2 also capture the elongation of the isodensity contours in each of the image directions, and so encode information about the ellipticity of the models. Because there is no significant difference between the black, red, and blue lines in each of the panels, the ellipticity is not affected by the inclusion of flux ratios.
Therefore, at least for the family of models considered here, image flux ratios, if due to the macromodel instead of small ΛCDM substructure, are significantly more informative than time delay ratios, especially in determining the density profile slope.

COMPARISON WITH OTHER PUBLISHED MODELS
We now compare our models to other published results: the free-form glass models presented in Denzel et al. (2021), and the singular power-law elliptical mass distribution (SPEMD) parametric models presented in H0LiCOW XII (Rusu et al. 2020). 2 The SPEMD mass distribution is not explicitly plotted in that paper, but we reconstruct it using the parameters in their Table 1. The recovered glass and SPEMD mass distributions are shown in Figure 3, alongside our own Group B results. We chose Group B here because it uses the same point image data as the glass and SPEMD models. The κ = 1 contour is shown in black in all three panels.  Figure 4 shows glass (blue), SPEMD (red), and our Groups B (green) and C (black). Group A is not shown here because it is very similar to B. The four panels show density profiles averaged over 30 • circle arcs and 50 models, with the apex at the lens center and opening towards the four images. Most of these profiles do not look the same. For a given lens system, like WFI2033, mass distributions from independent models can look different due to differences in input data and lensing degeneracies. Various degeneracies can be present or suppressed depending on model priors used in the reconstruction.
We already noted that the difference in Group B vs. C is entirely due to the inclusion of additional data in the latter; the modeling assumptions are the same for both. SPEMD differ from other models most likely because they are constrained by the model priors to be power laws, thereby suppressing degeneracies that would result in other profile shapes.
The most interesting comparison is between glass and Group B models, both of which have nearly identical data inputs. Despite having quite different modeling approaches, pixelated free-form vs. two component parametric model, their density profiles interior to the image radius are very similar in all 4 panels. This is encouraging because it means that if the same data is used in combination with flexible mass models then the radial lensing degeneracies can be largely broken.
For the models that differ, it is interesting to consider what type of lensing transformation connects their density profiles. The shapes of the profiles indicate that the transformation that would map any one of these profiles into another is not the mass sheet degeneracy (MSD). For example, applying a MSD transformation to the power law H0LiCOW model, one obtains a profile with lower central normalization, and profile shape which is convex, especially at larger radii. But the glass and our models, whose central normalization is lower than that of H0LiCOW, have concave outer profile shapes. Therefore transformations mapping these model profiles onto each other belong to a different family of source position transformations (Schneider & Sluse 2014, SPT), not MSD. This means that MSD transformation is not the only radial degeneracy one should consider when modeling lenses. Our models, as well as those from glass, do not have that level of demagnification. For our Group C models, it is typically 0.013, while the smallest is 1.8 × 10 −3 . However, all models presented here can be made compatible with a highly demagnified central image by reshaping the central mass distribution in a monopole-like fashion (Saha 2000;Liesenborgs & De Rijcke 2012), without altering model predictions for the four observed images. Starting with the average profile of Group C, and then shaving off 0.13% (0.6%) of the surface density from the annulus interior to the images (0.1"-1"), and using it to form a flat density core in the inner 100 pc (0.0144") will demagnify the central image to 10 −3 (10 −4 ) compared to the brightest, second arriving image.
To demagnify to the level of 10 −3 would make the central slope outside of the 100 pc core somewhat steeper, but still shallower than in the outer regions, preserving the concave nature of the profile. Adding a supermassive back hole (SMBH), which must be present, would further demagnify the image, often rendering it invisible (Quinn et al. 2016).

Azimuthal mass distribution and lopsidedness
Given the results of Rusu et al. (2020), it appears to be impossible to fit the quad point images with an elliptical galaxy mass distribution, whether it consists of one a) b) c) Fig. 3.-Upper panels: Average convergence plots for three mass models, with κ = 1 contours shown in black. a) Free-form glass, b) our Group B models, c) H0LiCOW SPEMD model. Images are numbered in order of arrival time. Galaxy G2 is located at 235 • , measured counterclockwise from the positive x−axis. Lower panels: The density profiles for the 3 models. For each model we plot profiles in 30 • circle arcs centered on the lens mass center and opening towards the 4 images. Our Group B and glass profiles are the mean of the logarithm of the density profiles of 50 and 1000 models, respectively, and the error-bars represent the rms scatter between models. For SPEMD the profiles and error-bars are from the Rusu et al. (2020) best fit model from their Table 1. or two co-centered mass components, even if the mass center is allowed to be displaced from the light center, and G2 is allowed to have flexible ellipticity and position angle. Some additional modeling freedom in the azimuthal direction appears to be required. For example, the glass models, which reproduce point image properties exactly, deviate significantly from azimuthal elliptical symmetry.
Our models allow extra azimuthal freedom in the form of offset centers. However, unlike the glass models, the deviation from ellipticity in all three sets of our models are small (Figure 1). Without the contribution from G2, fractional deviations from inversion symmetry inside the Einstein radius are of order few percent. Apparently, in the case of WFI2033 even such small deviations are sufficient for the purpose of forcing the mass and light centroids of the main lens to coincide, and the image positions, time delay ratios and flux ratios to be fit within observational errors.
Since the degree of lopsidedness is small, it is possible that LoS can be partly responsible, as it is not expected to be distributed elliptically symmetrically with respect to the lens center. In fact, LoS structure can result in lopsided mass distribution, with fractional mass excess of 1% to a few percent (Vegetti et al. 2014;Gilman et al. 2019Gilman et al. , 2020b. However, the fraction of lensing lines of sight that contain massive LoS substructure ( 10 9 M ⊙ ) is about 10% (Graus et al. 2018;Vegetti et al. 2014). Unfortunately, there is no robust way to know if a small fraction of lensing mass is part of the LoS or the main lens galaxy. The former would result in non-diagonal magnification matrix, but applying that test to observations would be difficult (ÇaǧanŞengül et al. 2020).
While we do not know the origin of lopsidedness in WFI2033, for lenses with low lens and source redshifts, like iPTF16geu, with z l = 0.22 and z s = 0.41 (Goobar et al. 2017), LoS is unlikely to be responsible, and the lopsidedness found in Williams & Zegeye (2020) is most likely intrinsic to the lens plane.

Lensing degeneracies involving external shear
A qualitative comparison between the three models in Figure 3 reveals that the orientations of the mass distributions are approximately the same, elongated towards the last arriving saddle image. Small differences in the position angle can be attributed to the degeneracy between ellipticity and external shear. This well known degeneracy is easy to understand because both ellipticity and shear result in similar effects on the image positions around the lens center.
The lesser appreciated degeneracy is between the magnitude of external shear and the slope of the lens galaxy density profile. Figure 5 shows that for our Group C models (black points) shallower density profiles require less external shear to reproduce the images. The potential slope α ′ in that Figure corresponds to the superposition of the two mass components. The same relation applies to our Group A and B models (green and magenta best-fit lines, respectively), and to the SPEMD models To avoid overcrowding only Group C has its error-bar plotted; the error-bars for other models are shown in Figure 2 and the lower panels of Figure 3.
(red and blue points), but with a different normalization. The reasons for the difference in amplitude of the relation is not clear; it could be due to a combination of different modeling priors and different data sets. Here, we will concentrate on the slope of the relation.
The anti-correlation between α and γ can be understood using a simplified model consisting of a circular power law lensing potential with external shear, Ψ = br α + 1 2 γr 2 cos 2(θ − θ 0 ), with θ 0 = 0, and source which is on the same axis as the shear, at distance x β from the center of the lens. Let the two images form at x 1 and x 2 . The lens equation, x − x β = ∂Ψ/∂x can be written for the two images as, − γx 2 . The radial separation between the two images in this simplified case is where we have assumed that the dependence on α in the exponent in the lens equation can be neglected because α is close to 1. One of the main effects of shear is to increase this separation between images. Assuming that the separation in doubles and quads behaves similarly, we can obtain the dependence on α and γ from ∆|x|. Since x β is always small compared to the Einstein radius, b, and (1 − γ 2 ) ≈ 1, we obtain that ∆|x| ∝ αγ. (Recall that projected mass density scales as ρ 2D ∝ r α−2 .) Because all viable models have to maintain the same radial separation between images, we expect that the radial slope of ′ Fig. 5.-The magnitude of external shear vs. the mean lensing potential slope α ′ (not the same as α in eq. 1) of the main lensing galaxy in the annulus centered on the average image radius, 1.16", and of width ±25%. Recall that ρ 2D ∝ r α ′ −2 . The black points represent 50 models of our Group C. The red rhombus is the main SPEMD model from Rusu et al. (2020), while the blue triangles are the other SPEMD models from their Table C1. glass model is off the limits of the plot at |γ| = 0.225 and α ′ = 0.37. Best-fit lines for our models from Group A and B are shown in green and magenta, respectively. There is considerable scatter around the lines, so we do not show the individual points to avoid overcrowding. Overall, shallower density profiles require less external shear. the potential, α, will be anti-correlated with the magnitude of external shear, γ. This is what is seen in Figure 5, where each set of solutions using the same data and the same priors are described by this anti-correlation.

Comparison in model space
In the previous subsections we compared specific physical properties of galaxy models reconstructed by different techniques. Here we ask a somewhat different question, aimed at techniques capable of exploring large regions of model space by virtue of having more parameters than observables: is there any overlap between solutions of methods that have very different assumptions but use the same data?
Traditional parametric methods, like those in Rusu et al. (2020), are not useful for this purpose, because their models have more observables than parameters, and hence obtain only one solution, with statistical uncertainties. This analysis is confined to our models and glass. (An additional reason to restrict our comparison to these two methods is that these reproduce all the properties of the quasar point images to observational precision.) For convenience, we do the comparison only at the locations corresponding to glass pixel centers. We discretized our mass distributions into a grid with the same number of data locations as the glass grid. This amounts to about n ∼ 150 grid locations over the whole lens plane of radius 1.478 ′′ , which contains the largest image radius, 1.472 ′′ .
To compare any two models, we arrange their κ values into two n-dimensional vectors x and y, and evaluate their "similarness" by computing χ = | x − y|/| x + y|, which compares the lengths of the difference and sum of vectors x and y. This was done for each of the 50 × 1000 pairs of our models B, and C vs. glass, the same number of random pairings of glass vs. glass models, and 50 × 50 pairs of B vs. B, and C vs. C models. The distributions resulting from each of these five tests were very close to Gaussians (probably because of the central limit theorem), so in Fig. 6 we just plot the five fitted Gaussians. (Results for A models are not shown because they are very similar to those of B.) Let us first consider B vs. B distribution, represented by the cyan curve. The distribution has some overlap with 0, implying that most B models are quite similar to each other. A pair of identical models has χ = 0. The most dissimilar ones differ from each other by a fractional difference of only ∼ 5 − 7%, therefore in n−dimensional space these form a tight cloud of models. Models C (red) show similar traits, but these form a somewhat larger cloud. glass (green) forms a more dispersed cloud, with models typically separated by a fractional difference of ∼ 11%. The fact that the mean of the glass distribution is larger that that of the B distribution is not surprizing, since the former a free-form method, while the latter is parametric. The narrowness of glass distribution means that the models are approximately equidistant from each other.
The main question posed in this subsection is answered by the blue and gold curves. B and glass models (blue) are typically separated by the same distance as two models from the glass ensemble. Combined with the tightness of the B vs. B model cloud, this suggests that B models occupy a region of model space close to the glass cloud, and that there is likely a small overlap between the two. However, a more in-depth analysis, for example using Principle Components (PCA) would need to be carried out for a more thorough understanding of the relative distributions of the models in these highdimensional spaces. This is consistent with our earlier finding that B and glass models have a very similar radial density profile (see Section 5.1.1 and Figure 4).
We conclude that methods using very different parametrizations of the lensing mass-like ours vs. glass-if they allow for degenerate mass models and use the same data, can converge on similar solutions, i.e. their solutions overlap in model space.
If the data is different, as is the case with our models C vs. glass (gold curve), then the resulting solutions are different, with little or no overlap. 5.5. Assigning lensing mass to astrophysical entities: interpretation degeneracy Lensing responds to all the mass between the source and the observer, whether it belongs to the main lensing galaxy, other nearby galaxies, or LoS structures. Freeform modeling, like glass, takes advantage of that, and is completely agnostic to the physical origin of the deflecting mass. The downside is that it makes it harder to draw conclusions about the mass distribution in the  Figure 3 shows two main deviations from ellipticity: the extension of density contours towards the lower left, which is most apparent around the image radius, and the extension towards the upper left, within 1-2 pixels from the center. The former must be, at least partially due to the mass of G2, which is located in that direction. The latter could be due to the asymmetry in the main lensing galaxy, or due to G2, or both. The contribution from the LoS structures is also probably present at some level.
While FF modeling provides the clearest example of our inability to uniquely and correctly interpret how to assign mass to various astrophysical entities, this 'interpretation degeneracy' is not limited to FF modeling. In the SP approach, even if the modeler intended that a certain fraction of the total projected mass belongs, say, to the secondary galaxy, it maybe well be part of the main galaxy, or vise versa. For example, in SPEMD models the orientation of the ellipticity of G2 is such that it contributes more mass to the lower portion of the quad (in the orientation shown in Figure 3), around the second arriving image, than a circular G2 would. It is possible, however, that this mass asymmetry due to the model ellipticity of G2 should be attributed to the main lens, or even LoS mass.
In our own models, as we already mentioned, the physical origin of the offset mass component, or part thereof, is unclear. It could be intrinsic to the main lens, or, since WFI2033's lens and source are high, some of it could arise from LoS structures.
While the most problematic consequence of misassigning mass to the wrong entity is for the physical understanding of the lensing galaxy, other aspects of lensing related analysis may be affected as well. When singleaperture stellar kinematics are used, interpretation degeneracy can make the analysis less self-consistent, because the Jeans equation combines the radial mass distribution, some of which may be incorrectly attributed to the main lensing galaxy, with the velocity dispersion, which is measured from the main galaxy only.
It is possible that some of these issues will be alleviated with the use of spatially resolved stellar kinematics (Yıldırım et al. 2020).

CONCLUSIONS
Most recent papers modeling galaxy-scale strong lensing are focused on estimating the Hubble parameter, H 0 , because of unresolved and tantalizing discrepancy between the values derived from various independent probes (e.g., Verde et al. 2019;Riess 2020). This puzzle will ultimately be resolved, and the goal of galaxy-scale lens modeling will shift back to the detailed structure and evolution of the lensing galaxies.
Our motivation for choosing WFI2033 is that it is one of six H0LiCOW systems, and its quasar point images and lens light center are poorly fit by their power law and composite models. This can be interpreted as evidence that the combination of quasar image data, extended ring data and galaxy velocity dispersion are not compatible with the power law, or composite profiles. In the interest of working around some of the shortfalls of this modelling strategy, we introduce more flexible models, which offer more freedom while still being built out of astrophysically motivated components. We do not include extended ring image and stellar velocity dispersion as constraints.
Our goal is to determine what can currently be said about the structure of the lens galaxy in WFI2033. Aware of the fact that conclusions about mass distribution can depend on the lens modeling assumptions, we examine models derived from three independent modeling techniques, simply parametrized (SP), pixelated freeform (FF), and our own models that bridge the gap between the SP and FF modeling philosophies. Our use of two superimposed simply parametrized lensing potentials is similar to the SP approach, but the freedom afforded by our use of offset centers, and, as importantly, by the fact that we generate many models, instead of one, is reminiscent of the FF approach. (We note that our way of parametrizing the lens mass is not unique, and other parametrizations could give rise to somewhat different models.) We present 3 groups of models, sequentially including constraints due to quasar image positions (Group A), time delay ratios (Group B), and flux ratios (Group C), assuming they are due to macro model (their microlensing origin is already ruled out). We find that time delay ratios in WFI2033 do not constrain models any more than image positions already do, while flux ratios make the central density slope significantly steeper, but still shallower than isothermal.
Despite the emphasis placed in the literature on the mass sheet degeneracy and the corresponding transformation, we show that the transformations connecting SP, FF and our models are not MSD, but more generalized source position transformations, SPT (Schneider & Sluse 2014). In the absence of lensing constraints in the central ∼ 100 pc, the current models have a wide range of central densities. This is not a real disagreement because the mass in the central regions can be reshaped using monopole transformations, without affecting the fit to the four images of the quad. In the future, the detection of, or an upper limit on the flux of the central odd image will help constrain the central density and hence the density profile shape.
In addition to the well known ellipticity-shear degeneracy, the models we examine manifest a degeneracy between the magnitude of external shear and the lens profile slope. Each set of models, for example, SPEMD, or our Group C, show an anti-correlation between these quantities, but the normalization of the anti-correlation differs between SPEMD, Group C, etc., for reasons that are not yet clear but could depend on the data sets used, or model priors.
Having examined various aspects of the three different sets of models of WFI2033-power laws, pixelated free-form, and our own flexible parametrized modelswe conclude that there are as many similarities as there are differences between them. The similarities are mostly in the azimuthal structure of the lensing mass: though details differ, all models agree that the mass distribution is elliptical, their position angles are within ∼ 30 • of each other, and that there has to be significant external shear, γ 0.1. There also has to be mass extension beyond ellipticity in the direction between the 2nd and 4th arriving images. There is less agreement between models on the density profile slope, both at the center and around the image annulus. The only exception is that of glass and our Group B models, whose radial profiles look similar, possibly because they share the same data constraints and considerable modeling freedom. Future observations, like spatially well resolved line of sight velocity dispersion of the lens galaxy, and limits on the flux of the central odd image will help better constrain the lensing mass distribution and determine its astrophysical origin.
The three sets of our models, Groups A, B and C require the lens mass distribution to be lopsided. The excess mass arising from the lopsidedness (compared to inversion symmetric, or, SP models) is only a few percent. Therefore we find that in WFI2033 it does not take much lopsidedness to generate χ 2 ≤ 1 models, i.e., to make predicted image positions, time delay ratios and flux ratios to be consistent with observational uncertainties, and to make galaxy light center agree with the mass center. The astrophysical origin of lopsidedness in the case of WFI2033 is not clear: it could arise in the main lensing galaxy, or be contributed by nearby galaxies, or the line of sight structures. This inability to correctly assign mass to its astrophysical origin may not affect the estimation of H 0 , but it does affect the interpretation of galaxy structure. For example, if the lopsidedness does arise in the main lens then this z = 0.66 galaxy is not well relaxed in the innermost few percent of its virial radius. This is an example of what we call interpretation degeneracy. As opposed to the lensing degeneracies, which allow different mass models to reproduce the same lensing observables, the interpretation degeneracy, which stems from the fact that lensing responds to all mass, applies to any given mass model. It makes it difficult to determine how much of the model's mass is due to the main lensing galaxy vs. other nearby galaxies vs. line of sight structure. Interpretation degeneracy is not a new notion in lens modeling (e.g., Trotter et al. 2000), and will likely become more important when the main goal of modeling becomes the detailed structure of intermediate redshift galaxies.
The authors would like to thank Prasenjit Saha for suggesting tests that extended the paper's analysis into new directions and the REU program at the University of Minnesota, supported by grant NSF-1757388. Figure 7 shows parameter ranges for 50 models from Group C. The dashed black boxes correspond to the initialization ranges, which were the same for all three Groups A, B and C. Even though these ranges are somewhat arbitrary, the dependence on them is partly erased in the final models because parameter optimization using downhill simplex can wander outside of the initial ranges. In fact, in the case of the offset between the center of the two components (upper left panel), the bulk of the final models are outside of the initial box. That box was made purposefully small around (0, 0) to force the two mass components of the model to have coincident centers as much as possible. However, models appear to prefer r off 0.025", or 0.174 kpc. Source positions are also shown in the last panel, along quasar image positions, depicted by red circles. -Final parameters for the 50 models of Group C; see eq. 1-3. The dashed boxes show the limits within which initial parameters were randomly scattered for all three Groups. The dashed box in the upper left panel is rotated because the Cartesian coordinate system where calculations were carried out was rotated with respect to the orientation of figures in this paper.