Two-component mass models of the lensing galaxy in the quadruply imaged supernova iPTF16geu

The first resolved, multiply imaged supernova Type Ia, iPTF16geu, was observed 4 years ago, five decades after such systems were first envisioned. Because of the unique properties of the source, these systems hold a lot of promise for the study of galaxy structure and cosmological parameters. However, this very first example presented modelers with a few puzzles. It was expected that to explain image fluxes a contribution from microlensing by stars would be required, but to accommodate the magnitude of microlensing, the density slope of the elliptical power law lens model had to be quite shallow, $\rho_{2D} \propto r^{-0.7}$. Furthermore, the center of mass had to be displaced from that of observed light by ~0.1 kpc, and the position angle of light distribution was misaligned with that of mass by ~40 degrees. In this paper we present mass models that resolve the first two problems, and suggest a resolution of the third. Motivated by observations of local ellipticals, and some recent analysis of galaxy-scale lenses, our mass models consist of two offset (baryonic) mass components. The resulting mass distributions have a single centroid, but are lopsided, and have isodensity contours that are not purely elliptical and not self-similar with radius. For many of our models the microlensing requirements are modest, and the ring formed by the extended supernova host galaxy resembles the observed one.


INTRODUCTION
The pioneering work of Refsdal (1964) described a new method to measure the Hubble constant, using a phenomenon-gravitational lensing-that would not be observed for another 15 years (Walsh et al. 1979). Though the lensed source was envisioned to be a supernova, multiply imaged supernova were not detected until only a few years ago (Quimby et al. 2014;Kelly et al. 2015;Goobar et al. 2017).
Supernova iPTF16geu was the first Type Ia supernova with resolved multiple images (Goobar et al. 2017). The source and lens are at redshifts z = 0.409 and z = 0.2163, respectively. Using supernova lightcurve fitting, Dhawan et al. (2020) estimate that the total magnification is µ = 67.8 ± 2.7 at 68% confidence level. Microlensing by stars in the lensing galaxy was recognized to be im-llrw@umn.edu (LLRW) portant in this system since it was first detected (Foxley-Marrable et al. 2018;More et al. 2017). However, it was soon realized that the combination of proposed macrolensing (i.e., due to the main lensing galaxy) and microlensing models are not able to fully explain the system. Specifically, assuming that the galaxy lens is described by a power law projected density profile with an "isothermal" slope, Σ ∝ r −1 , would require a considerable, even unrealistic amount of microlensing (de)magnification of the images (Yahalomi et al. 2017). Varying the fraction of dark matter compared to total mass, or including external shear does not help (Yahalomi et al. 2017).
This conclusion suggests that an isothermal power law lens is unlikely. Mörtsell et al. (2019) considered a range of density profile slopes, Σ ∝ r α−2 , and conclude that a shallower galaxy lens will result in predicted macromagnifications that will not put too much strain on microlensing contribution. The best fitting slope from microlensing considerations is α ≈ 1.3, significantly shallower than isothermal (α = 1). Such shallow slopes are far from SLACS findings (Koopmans et al. 2006), of α = 0.99 ± 0.12, but in the range of profile slopes found in Read et al. (2007) for lensing galaxies.
While the central regions of some galaxies may have shallow density profiles, the profiles must steepen at larger radii. In addition to this, a further feature of their model suggests that it might be a simplification of the actual lens: the center of observed light is displaced from the center of recovered mass by ∼ 0.03 , or about 0.1kpc, much larger than the astrometric uncertainties of 0.002 . Finally, the position angle of the fitted elliptical mass distribution is misaligned with the observed light orientation by ∼ 40 • .
Taken together, these properties of the lens model lead us to conclude that the actual mass distribution is more complicated than an elliptical density power law. What form could these complications take? Addition of dark matter substructure, as predicted by ΛCDM cosmological model (Klypin et al. 1999;Moore et al. 1999) is unlikely to reconcile the simple mass model with observations because substructure cannot change profile slope, or affect image positions significantly enough to reconcile light and mass centroids. Microlensing by stars can significantly affect image fluxes, but not image positions. The additional complications must be due to the macroscale mass model of the galaxy lens. Wagner (2018) has shown that images provide only local information about the lensing potential; the rest is usually filled in by model assumptions. This motivates us to explore other possibilities for what projected lens looks like in the case of iPTF16geu. Specifically, we extend the published lens models beyond a single power law; our models consist of two superimposed, cored power law mass components, each with its own set of parameters ( § 2). Our recovered mass distributions, generated using the method described in § 3, account for image positions, and produce coincident light and mass centroids within astrometric error ( § 4). We do not use time delays, because of the large uncertainties, or the extended image of the supernova host galaxy, which was found by Mörtsell et al. (2019) not to be very useful. We do compare extended rings and time delays generated by our models with observations, in § 4.3 and § 4.4.

GALAXY MODEL
To motivate our mass model we assume that the lens galaxy at z ≈ 0.2 must be analogous to nearby ellipticals of similar dynamical mass and effective radius of the light distribution, R e . Its velocity dispersion is 129 km/s (Johansson et al. 2020) 1 , and Dhawan et al. (2020) measure R e ≈ 0.55 (depending on the filter), or 1.83 kpc, at the redshift of the lens.
Virgo cluster ellipticals are well studied, and some are analogues to the iPTF16geu lensing galaxy. For example, NGC 4494, has σ ∼ 150 km/s (Romanowsky et al. 2003), and R e ∼ 3.8 kpc (Foster et al. 2011), and NGC 4434 has σ ∼ 100 km/s (Dressler 1984), and R e ∼ 1.5 kpc (Kormendy et al. 2009;Krajnović et al. 2018). Central regions of galaxies inside the light's effective radius are dominated by stars. This is especially true of lower mass ellipticals (Ferreras et al. 2005). In iPTF16geu, quad images are formed at ∼ 1kpc, or ∼ 0.55R e , where the bulk of the mass is in the form of stars. Therefore, to motivate our mass model we need to look at the stellar mass distribution in low mass ellipticals. Dhar & Williams (2012) used data from Kormendy et al. (2009) to fit sky projected radial light distribution of Virgo ellipticals over 4 − 5 decades in radius. A single component, either a power law or a Sersic profile does not provide a good fit, while fits using 2 or 3 projected Einasto components have very small residuals, with rms of 0.025 − 0.065 magnitudes for NGC 4494 and 4434. When fitted with 2 components, the ratio of the two scale-lengths is about 10 − 25, for both of these galaxies. If these Virgo galaxies were the lensing galaxies in iPTF16geu, images would form somewhat interior to the scale-length of the larger component, at the radius where the total light distribution undergoes the most rapid change in (projected) slope.
Our mass models represent the lensing galaxy as a superposition of two mass components, but instead of using projected Einasto profiles, which do not have analytical expressions for lensing potential and deflection angles, we use alphapot analytical lensing potential (Keeton 2011), whose expressions for deflection angles and projected density, or convergence are presented, for example, in Ghosh et al. (2020). Each alphapot is characterized by a center, normalization b, potential profile slope α, core radius s, and ellipticity parameters q, and K: If K = 0, the elongation of the iso-potential contours is along the x or y axis, and q gives the axis ratio. If K = 0, the ellipticity position angle is no longer aligned with the principal axes. Because of adjustable core size and power law density slope, this profile captures the most important feature of Einasto profile over the radial range relevant to lensing, namely the ability of have constant or changing density slope. We allow the centers of the two mass components to be offset from each other, which amounts to having a dipole moment in the mass distribution around the center of light. Gomer & Williams (2018) showed that to reproduce the statistical properties of image distribution around the lens center of the sample of ∼ 40 galaxy-scale quad lenses, one needs to introduce small, dipolar-like azimuthal mass perturbations around the lens center. The authors implemented these by offsetting the centers of the two mass components, while the total mass distribution still had a single centroid. Further evidence for such complexity in mass distribution comes from the modeling presented in Nightingale et al. (2019): all three of their massive elliptical lenses consist of two baryonic components, and show lopsidedness in their mass distribution.
The more compact of the two mass components in our models is designated as the main one, while the secondary is more extended. The two need not have equal mass, or brightness. In fact, the central brightness of the secondary component tends to be fainter by about a magnitude (Dhar & Williams 2012). In principle, the second component in our models could also represent dark matter, with considerable or minimal stellar contribution. For lensing, all that matters is their mass content.

MODEL FITTING
The positions of the 4 images of the quad give us 8 data constraints. We do not use the ring formed by the extended light from the supernova host galaxy, as Mörtsell et al. (2019) found it to be not very useful; the ring did not improve the fit or change if from the one found using point images alone. We do not use time delays either, because they are expected to be short in this cross-type quad, and because of their rather large uncertainties. Later in this Section we present χ 2 values calculated without the time delay contribution, and in § 4.4 we include time delay information, and show that in this system time delays do not help discrimanate between models. We present images of extended host galaxy for some of our final models in § 4.3 and time delays in § 4.4. Our two mass components, A and B, have 5 free parameters each, b, s, q, K, and α, defined in eq. (1). The offset between the two, ∆x, ∆y, adds 2 more free parameters, and 2 additional ones are due to the unknown location of the source. From these 14, we eliminate 2 that correspond to the mass normalization of the lens and its orientation (rotation) on the plane of the sky. In practice that means we set K A = 0, and only the ratio b B /b A matters, not their individual values. In other words, we find fits to the quad characterized by 3 image distance ratios with respect to the lens center, and 3 relative polar image angles. Having found a fit, the rotation of the whole system with respect to the center of light, and the mass normalization are adjusted to correspond to the images of iPTF16geu. Specifically, the model quad system is shrunk or expanded, and rotated, such that the model's first arriving image is exactly at the distance and position angle of the observed first arriving image.
This means that our merit function is where d i is the ratio of the distance of image i from the lens center to the distance of the first arriving image, and similarly for image angle θ i . Subscripts m and o refer to the model and observed image properties, respectively. Uncertainties σ's were obtained from the astrometric uncertainties in Mörtsell et al. (2019). Subscript p on χ 2 means that it is based on positional data only. In § 4.4 we consider time delays as well, which modifies the merit function to where the time delays given in Mörtsell et al. (2019) are with respect to the third arriving image. We search over a ranges of parameters, given in Table 1. Ellipticity parameters were not allowed to be too far from circular, in keeping with the appearance of the light of the lensing galaxy. Additionally, the core radii of the two components, s A and s B , were constrained to be smaller than the image radius. The ratio of normalization parameters, b B /b A was picked from a wide range, while the offset of the second component was confined to about a quarter of the Einstein radius. While these starting parameter ranges are somewhat arbitrary, we note that the second stage of our two-step modeling procedure (see below) allows parameters to wonder well outside of the ranges set in this first step.
We use the following method to search the 12 dimensional parameter space for solutions. Since the model parameters outnumber lensing constraints, we do not expect a unique best fit solution. Instead, many solutions will satisfy the constraints of iPTF16geu. Our modeling consists of two stages. First, we randomly search a wide range of parameters, looking for approximate fits to the lens system. It is not hard to find many fits with χ 2 p < ∼ 9 using a random search. We generated 300 of these. As expected, the distribution of χ 2 p values in this sample is heavily biased towards high values. Then, we use randomly selected pairs of these bad solutions to define pairs of points in the 12 dimensional parameter space. The two solutions in a pair serve as "opposite" vertices of a simplex. In 2D such a simplex will be a square, in 3D, a cube, etc. Each simplex is input into downhill simplex ±0.075" 1-100 0.6-1.4 0.7-1.0 0 0.6-1.4 0.7-1.3 -0.4-0.4 method to find better solutions. Of the 2000 downhill simplex runs, 458, or about 23% had χ 2 p < 1. We checked that these solutions are in fact separate local minima in the 12 dimensional parameter space, because a straight line between any two contains worse fits. Our method, which combines a random search followed by downhill simplex search, is an easily implemented and effective way of finding solutions in multidimensional parameter space that has many local minima. The two steps of our search method resemble those of the particle swarm optimization, though the latter looks for a single solution, while we look for many. Note that there is no need to find the global minimum; all local minima have χ 2 p < 1, and many have χ 2 p < 0.1. Because our mass models contain more parameters than data constraints, we need to check that these extra parameters are statistically justified. To do that, we compare our models to commonly used, single mass component models. To allow for better fits we include external shear, which results in k = 5 total free parameters. In our two-component models k = 12 (see above in this section). When the center of the one-component lens is fixed at the center of the observed light, as it is in the two-component models, a typical χ 2 p ∼ 27. Both models have n = 6 data points (three relative polar image angles and distance ratios, with respect to the lens center). Using Bayesian Information Criterion, BIC = χ 2 + k ln n, we get ≈ 36 for one-component models, and ≈ 22.5 for our two-component models. BIC differences of > 10 are considered very strong evidence for models with smaller BIC values. Thus, two-component models are well justified in this system.
Each solution is a lens model of iPTF16geu, and has associated image magnifications. The two panels of Figure 1 show magnifications of the two images formed at the minima of the arrival time surface (left panel), and two images formed at the saddle points (right panel). We designate the images in two ways: the published papers label them as #1, ...#4, and we also add designation based on the arrival time, as 1st, ..., 4th (Saha & Williams 2003).
We show these 458 solutions as small orange filled points in these plots. (We checked that the distribution of solutions does not depend on their χ 2 p values, so a somewhat different value for the χ 2 p cut would have generated a statistically similar distribution of solutions.) Most of them lie near a straight line, running approximately diagonally from low to high magnifications in each of the panels. The models from published literature are denoted by empty black triangles (Mörtsell et al. 2019, Table 9) and circles (More et al. 2017 , Table 1), and also lie approximately along the same diagonal lines. Solutions along these two lines share similar geometry of their mass distributions, specifically, they are connected by approximate mass sheet degeneracy (MSD), which is more aptly called the steepness degeneracy (Gorenstein et al. 1988;Saha 2000); flatter density profiles correspond to higher image magnifications. Note that MSD here is not exact: rescaling of κ of one solution by some λ and adding a flat mass sheet of density 1 − λ (in units of critical lensing density) will generate a solution which is only approximately similar to the ones plotted here along these diagonals. The actual relation between these solutions (both orange and empty black symbols) are probably better described by non-trivial source plane transformations (SPT; Schneider & Sluse 2014).
Not all of our 458 solutions are equally good representations of galaxies. Because models consist of two offset mass components, many total mass models will have two separated centroids. These are probably unastrophysical, and we do not consider them further. In the remaining ∼ 30% of the cases the offset centers merely result in lopsidedness, or dipole-like feature in the mass distribution. The total mass distribution in these has a single centroid; in other words, the center of light and mass coincide. Aiming to isolate models that are closer to single component ones, we further select solutions where the main (compact) component contains at least half of the total mass within the image ring. We end up with a subset of 57 solutions, plotted as green dots in Figure 1. By selecting these 57, we do not preferentially select better or worse fits: we checked that χ 2 p does not correlate with the fraction of mass in the main component, or how far the second centroid is from the light center.
Though our imposed cuts do not correlate with the goodness of fit, they do correlate with other lens properties. Most of these 57 solutions deviate from the diagonal lines, especially in the right panel, and many of these lie closer to iPTF16geu, than further from it, meaning that these solutions require less microlensing (de)magnification.
We cannot examine all 57 models in detail. Instead, we pick a small handful of examples, based on their magnification properties, as these are important for microlensing. Models M1 and M3 are on opposite sides of the main diagonal in Figure 1, and are closer to the iPTF16geu than other green points within their respective islands of solutions. Additionally, we selected M3 because it is close to iPTF16geu in both panels of Figure 1. Thus M3 would require minimal contribution from microlensing. M2 was selected because it is on the main diagonal. We note that we could have picked a different set of models, or models based on physical criteria other than magnification. But the wide range of models allowed by the data requires us to make choices.

Mass distributions
In this section we examine the three solutions selected in the previous section, and depicted as large green filled symbols in Figure 1. They span a range of magnifications, and have coincident mass and light centroids. We present their mass distributions in Figure 2. The magenta is the κ = 1 contour of the main (compact) mass component, while the green curves represent the isodensity contours of the total mass distribution, logarithmically spaced with an interval of 0.05. The thick green contour denotes κ = 1. The green isodensity contours are not exactly elliptical because they represent a sum of two offset mass components. The fact that the offset results in a dipole-like mass distribution around the lens center is evident in these plots.
Published works agree on the position angle of the recovered mass distribution, depicted with the thin blue line going through the lens center, pointing at image #3 to within a few degrees. This elongation direction can be understood even without modeling, from the consideration of image distribution: image #3 is the 4th arriving, and is usually expected to be closest to the lens center. Since this is not the case here, there has to be mass extention in that direction, to "push" the last arriving saddle further from center. Not surprisingly, our models also show elongation along that axis, though because the total mass distribution is not elliptical, the position angle depends somewhat on the radius.
The axis of the observed light distribution is within the cone outlined by two blue thicker lines, going through the center of the lens in Figure 2. It is not aligned with the elongation of our mass distributions, echoing the conclusion of Mörtsell et al. (2019). We interpret this misalignment as further indication that the mass, and light, distribution in the lensing galaxy cannot be represented by simple mass models. Given that the reason for the orientation of the mass ellipticity (see the preceding paragraph) is largely model-independent, reconciling it with the position angle of the light could require radial variations in the stellar mass-to-light ratio, or local elongations in the light distribution, such as the ones in the green contours in the southern direction of model M3 in Figure 2.
The density profiles along three radial spokes emanating from the lens center are shown in Figure 3. The solid and short-dashed blue lines are profiles taken along the positive and negative directions of the major axis, while the long-dashed line is along the positive direction of the axis perpendicular to the major axis. The major axis was determined as the axis of the largest first moment of the total mass distribution. Two reference slopes, of α = 1.0 and 1.4 are shown for comparison. Image locations are depicted by vertical magenta line segments. At the location of the images the profile slopes range between α ∼ 1.0 and ∼ 1.4, with the average being somewhat steeper than α = 1.3 slope obtained in Mörtsell et al. (2019).
The fact that our mass maps show a variety of isodensity contour shapes means that they are not related by MSD, but instead, by more complex degeneracies that involve transformations of shape (Saha & Williams 2006;Read et al. 2007), like source plane transformations (Schneider & Sluse 2014).

Microlensing
Microlensing (de)magnifications implied by our mass models can be deduced from Figure 1, as the displacement of the magenta point, labeled iPTF16geu, from the three large green dots, labeled M1-M3. One of the fea-tures of our models that make them different from the published ones is that we predict solutions whose macro magnifications do not lie on the same diagonal relation on that figure, as those of ellipsoidal power law density profiles. Most interestingly, there is an island of solutions in the right panel of Figure 1 that is closer to iPTF16geu than ellipsoidal power laws; our model M3 is an example. Of the 3 models we consider, changes in flux due to microlensing are smallest for model M3: < ∼ 0.4 magnitudes for 3 images, and ∼ 0.75 for the brightest, 1st arriving image. Model M1 would require largest microlensing contribution, with the 2nd arriving image requiring a demagnification of ∼ 1.1 magnitudes.
Though our models are different from the elliptical power law models, we agree with the latter that image fluxes cannot be explained without microlensing. Microlensing is almost definitely present in this system, and most likely affects the fluxes of at least three, or all four images. In that sense we agree with the conclusion of Yahalomi et al. (2017) that microlensing makes it hard to accurately determine the standard candle brightness of the supernova, and therefore use it to help break the mass sheet degeneracy in iPTF16geu.

Ring image of the extended galaxy host
Extended images of host galaxies, i.e., rings, have been used extensively in the literature for nearly three decades (e.g., Kochanek et al. 1989Kochanek et al. , 2001; most recently to help constrain H 0 , and to uncover mass substructure in the lens galaxy or along the line of sight (e.g., Vegetti et al. 2012). However, just like point images, they are subject to degeneracies (Saha & Williams 2001;Walls & Williams 2018), and so may be of limited value in some circumstances. The example in Figure 1 of Denzel et al. (2020), which shows two completely different lens mass distributions producing indistinguishable rings, is a sobering reminder that we do not yet have a full understanding of how lensing degeneracies operate.
Rings generated from our M1-M3 mass models are shown in Figure 4. The host galaxy source in all 3 cases was assumed to be circular, with a Gaussian light profile. The center of the host galaxy is offset from the supernova by various amounts, < 0.014 , in these 3 models, comparable to the estimate presented in Mörtsell et al. (2019), ∼ 0.02 . The offset was chosen to better resemble the observed ring, but we did not fine-tune the offset (or other parameters of the source) to produce as a close a match to the ring as possible. Given the appearance of the rings presented in Dhawan et al. (2020), the model that appears closest to observations is M3, because its brightest ring segment is between images #1 and #4. In M1 and M2, on the other hand, the segment between #3 and #4 is equally bright, in contrast to the observations. This suggests that of the 3 models, M3 is the best representation of the lensing galaxy.

Time Delays
We did not consider time delay information when generating our mass models in § 3. Had we included observed time delays the model selection would not have been affected. To check that, we calculate χ 2 for all our 458 models, but now using both versions, eq. 2 and eq. 3. The results are shown in Figure 5. Because of their large  uncertainties, time delays add an approximately constant contribution to each model, and therefore have little discriminating power compared to image positions. Figure 6 shows the distributions of time delays predicted by our lens mass models, orange points in Figure 1. The three histograms are labeled by the corresponding time delay, using the same image naming convention as in Mörtsell et al. (2019). The three sets of three cross symbols show the values from models M1, M2 and M3. The horizontal bars at the top of the plot show Mörtsell et al. (2019) values, for lens potential profile slopes α = 1 and α = 1.4, as labeled. The thin lines show the range of values resulting from the continuous range of α's. While the time delay ranges predicted by our models and these models agree, both are somewhat at odds with observations for τ #1#4 and τ #1#3 , but only at < ∼ 1.5σ, given the uncertainties of ∼ 1 day.

CONCLUSIONS
Multiply imaged supernova, and especially Type Ia's, hold a lot of promise for the study of galaxies and cosmology. They are superior to quasars as sources because their time delays are relatively easy to measure, their absolute fluxes are known, and their light eventually fades, allowing observations of the lens galaxy without the glare of the bright images. All these reasons make the very first multiply imaged supernova, iPTF16geu, very exciting.
As is often the case, the first of its kind object had some surprizes for modelers. If the mass model is assumed to be an elliptical power law, the mass and light centers are displaced by ∼ 0.1kpc, which is unlikely to be the case in an equilibrium galaxy. Furthermore, to accommodate microlensing constraints, the power law density slope has to be rather shallow, and so cannot hold at larger radii.
The main motivation for the present paper was to generate lens models whose mass distribution resembles that  Table 11) for models with potential slopes α, as labeled. The thin horizontal lines connect time delays for the whole range of slopes α = 1 → 1.4. The 1σ uncertainties in the observed time delays are represented by dashed horizontal lines. Note that the central values for τ #1#4 and τ #1#3 are outside the limits of the plot.
of local ellipticals, and where the center of observed light coincides with the center of recovered mass. Similar to local ellipticals whose light is represented by more than one component, our mass models consist of two mass components. The two components are offset, reflecting conclusions of some recent lensing reconstructions. Our final selected models have a single mass peak, and hence coincident mass and light centers. The consequence of using two offset components was to produce dipole-like lopsidedness in the total mass distribution. Our modeling assumptions, namely two offset elliptical potentials, result in a range of mass models-not a single modelall satisfying positional and time delay image information. For illustration purposes we concentrated on three of these models, based on their magnification properties, but could have chosen a different set.
Though our mass distributions are not elliptical power laws, we do agree with the published models on some key features of the lens: (1) To explain the fluxes of the iPTF16geu, microlensing by stars in the lens is required for at least 3, or all 4 images of the quad; (2) The total mass density profile at the location of the images is significantly shallower than isothermal; (3) The position angle of galaxy mass elongation is misaligned with the light, which is probably an indication that the mass, and light, distribution in the lens is not simple.
We note that our two-component models may not be the only way to account for the observations of iPTF16geu. It is possible that the main lensing galaxy has a simple one-component structure, but there are ad-ditional significant secondary galaxies, either at the same redshift or along the line of sight. However, there are no observed nearby galaxies, making this scenario less likely. The line-of-sight structure (LoS) is also unlikely because the source is very nearby, z = 0.409, so the line of sight dimension is short. Despali et al. (2018) estimate that the typical number of LoS subhalos along a given direction is much less than 1, and their mass is a small fraction of the mass enclosed by the images.