Calculation of distances in cosmological models with small-scale inhomogeneities and their use in observational cosmology: a review

The Universe is not completely homogeneous. Even if it is sufficiently so on large scales, it is very inhomogeneous at small scales, and this has an effect on light propagation, so that the distance as a function of redshift, which in many cases is defined via light propagation, can differ from the homogeneous case. Simple models can take this into account. I review the history of this idea, its generalization to a wide variety of cosmological models, analytic solutions of simple models, comparison of such solutions with exact solutions and numerical simulations, applications, simpler analytic approximations to the distance equations, and (for all of these aspects) the related concept of a "Swiss-cheese" universe.


INTRODUCTION
The Universe is not completely homogeneous; if it were, there would be no observers and no objects to be observed. Nevertheless, distances are often calculated as a function of redshift as if that were the case, at least as far as light propagation is concerned. Whether this is a good approximation depends at least on the angular scale involved. The simplest more refined model retains the background geometry and expansion history of a Friedmann-Robertson-Walker (FRW) model but separates matter into two components, one smoothly distributed comprising the fraction η of the total density and the other (1 − η) consisting of clumps, and considers the case where light from a distant object propagates far from all clumps (this is equivalent to the case of negligible shear). Over a period of more than 50 years, various authors have described more-general versions of this approximation with regard to cosmology, found analytic solutions, discussed similar approximations, compared it with exact solutions and with brute-force numerical integration based on the gravitational deflection by matter along and near the line of sight, examined the assumptions involved, applied it to various cosmological and astrophysical problems, and developed simple analytic approximations both for more-exact solutions (the latter based on more-complicated analytic formulae or on numerical integration) and for numerical simulations. While there is no doubt that such an approximation is valid for a universe with the corresponding mass distribution, recent work indicates that our Universe is not such a universe, but rather one in which the 'standard distance' (i.e. calculated under the assumption of strict homogeneity) is valid, even for small angular scales, at least to a good approximation.
Further refinements of this approximation are not discussed here, e.g. weak gravitational lensing with non-negligible shear, strong gravitational lensing 1, or inhomogeneities so appreciable that they influence the large-scale geometry and/or expansion history of the universe ('back reaction'). Similarly, extensions to the FRW models, such as some sort of 'dark energy' other than the cosmological constant, are not considered; neither are those which violate the Cosmological Principle, e.g. ones in which we are within a large void, Lemaître-Tolman-Bondi models, etc. I also omit wrong results or misleading conclusions unless they have been often cited without all of the community noticing the mistake (either there was no correction or the correction has been ignored). The order is chronological in the sense that I discuss all papers on the first topic to appear, then all on the second topic, and so on.
I refer to the distance calculated based on the above approximation as the ZKDR distance, a term introduced by Santos & Lima (2006) and referring to Zel'dovich, Kantowski, Dyer, and Roeder, though I take the 'D' to refer to Dashevskii as well, my criteria for being part of the acronym being having (co-)authored at least two papers on this topic, at least one of which was published within ten years of the first paper on this topic (Zel'dovich 1964a,b).
In gravitational lensing, it is clear that the approximation of a completely homogeneous universe with regard to light propagation cannot be valid, since otherwise there would be no gravitational lensing. Perhaps for this reason, the ZKDR distance has been used more in gravitational lensing than in other fields. Since α is almost universally used to denote the gravitational-lensing bending angle, Kayser et al. (1997), hereafter KHS, adopted η instead of the more confusing α or α̃ used by some other authors; since then, some authors other than KHS have also used η instead of α or α̃ for the inhomogeneity parameter. In the following, I will use the notation of KHS except occasionally when explicitly referring to equations in the works of other authors, who use various and sometimes confusing notation schemes; in particular, using z for anything other than redshift in a paper on cosmology is very confusing (see Tab. 1).
ZEL'DOVICH (1964)
Zel'dovich (1964b, hereafter Z64) 2 started the tradition; many readers, and not only today, might find his paper somewhat idiosyncratic, difficult to follow, and wrong in parts, but he introduced a simple and useful basic idea: local inhomogeneities in the distribution of matter can lead to significantly different angular-size and luminosity distances from those derived under the assumption of a perfect FRW model.

Summary
The first attempt to calculate distances in a universe with small-scale inhomogeneities is, as far as I know, that of Z64. This begins a tradition of calculating distances in a more realistic universe, namely one with small-scale inhomogeneities, but where the large-scale dynamics is given by an FRW model. In other words, it is a perturbed FRW model: The zeroth-order approximation for cosmology, which is actually quite good (Green & Wald 2014), is that the Universe is described by a Robertson-Walker metric (Robertson 1935, 1936; Walker 1935, 1937; the latter paper by Walker is very often incorrectly cited as having been published in 1936), which is a purely descriptive kinematic idea with no physics content, merely the characterization of a homogeneous and isotropic universe, and that the expansion history is given by one of the models explored by Friedmann (1922, 1924) (hence FRW), which are in turn based on relativistic cosmology as introduced by Einstein (1917). Occasionally, the term Friedmann-Lemaître-Robertson-Walker (FLRW) is used to include a reference to Lemaître (1927); while he made important contributions to cosmology, none of them went beyond the work of Friedmann (1922, 1924) with respect to the metric.
'It is assumed that. . . the amount of matter removed is small and the general motion is not affected.' The main model considered is 'a flat Friedman [sic] model with pressure equal to zero'. In modern notation, Ω 0 = 1 and λ 0 = 0. The physical model assumes that all matter exists in galaxies 3 and that distant objects are seen between galaxies, i.e. such distant objects 'do not have galaxies within the cone subtended by them at the observer'. (The cone is often referred to as the beam.)
After the standard angular-size distance 4 is derived via a differential equation for the separation between two light rays, the deflection due to a point mass, equation (12), is used to calculate the deviation from the completely homogeneous case when the beam is devoid of matter. This equation is generalized to a uniform density distribution to calculate the total deflection, which is towards the outside since the removal of matter in the beam formally corresponds to negative mass. This leads to a differential equation which in turn leads to the expression for the angular-size distance in the Einstein-de Sitter model in the empty-beam case, denoted by f 1 in Z64. It is noted that this function 'increases monotonically right up to the [particle] horizon (∆ = 1) where it reaches the value 2/5'. The value 2/5 is exact, but the right-hand side of the unnumbered equation between equations (21) and (22), 1600, is too precise (though the correct value rounded to four digits is 1599, much closer than in the case discussed in Sect. 2.2).

Remarks
There are several strange things about this paper. First, the 'remarkable feature' that the angular-size distance has a maximum is noted. Second, it is claimed that this 'is caused by the curvature of space due to the matter filling the universe', which is strange because later in the paper the main model considered is a flat universe, i.e. one with no spatial curvature. Third, it is pointed out that the maximum 'occurs only when there is matter within the cone subtended by the object at the point of observation'. Fourth, for a modern reader, the notation is extremely bizarre; Tab. 1 shows the equivalents in modern notation of the quantities used. I now discuss each of these in turn.
Tab. 1. Note that t, t 0 , c, ρ are the same in the Z64 and modern notations. Z64 distinguishes between Θ and Θ 1 for the cases η = 1 and η = 0, respectively, though in both cases the quantity is the observed angular size. Similarly, Θ 2 is the observed angular size in the case of strong gravitational lensing. Except in the case of f 0 , quantities dependent on the cosmological model assume the Einstein-de Sitter model.
Z64 notation modern notation

What is remarkable about the fact that the angular-size distance has a maximum at some redshift? In modern notation, the angular-size distance D A is, by definition, l/θ, where l is the physical projected length of the object and θ the angle which it subtends, i.e. the angle at the observer formed by light rays from both ends of the object. 5 The triangle made by the object and the light rays retains its shape as the universe expands. Thus, ignoring curvature effects for the moment, the angular-size distance is the proper distance to the object at the time the light was emitted: The proper distance D P (sometimes written D p or D P ) is the distance which one could, in a gedankenexperiment, measure with a rigid ruler instantaneously (such that the distance does not change during the measurement due to the expansion of the universe). As such, it changes with time due to the expansion of the universe. Often, the co-moving distance is defined as the proper distance at the present time. Thus, the proper distance at a different time is simply the current proper distance divided by (1 + z), the time being that when the light of an object with redshift z was emitted. This agrees with the definition used by many authors, such as Berry (1986), who defines it as 'the distance measured with a standard rod or tape, in a reference frame where the events occur simultaneously'. Beware that sometimes the same distance is denoted by different symbols, e.g. d prop by Weinberg (1972), L by Harrison (1993), d by Sandage (1995), D by Davis & Lineweaver (2004), d p by Heacox (2015), d p by Ryden (2017), and sometimes also by different names, though it is clear from the discussion that the same distance as that called the proper distance by Weinberg (1972) is being discussed, e.g. 'distance between two fundamental particles at time t' (D 1 ) by Bondi (1961), 'tape-measure distance' (L) by Harrison (2000), 'instantaneous physical distance' by Carroll (2019). The term 'line-of-sight comoving distance' is also sometimes used, as opposed to the 'transverse comoving distance', which is very confusingly called the 'angular size distance' by Peebles (1993), who uses the term 'angular diameter distance' for what is called the angular-size distance by almost everyone else; indeed, the two terms are usually considered to be equivalent. The transverse comoving distance is the same as the proper-motion distance; see KHS.
Fig. 1. Although the corresponding definitions are valid for models with k of 0 and −1 as well, easiest to visualize are distance definitions for the case k = +1. The universe can be thought of as a curved three-dimensional space, corresponding to the circle. Two dimensions are hence suppressed, so that the two dimensions in the plane of the figure can show the universe and its spatial curvature. R is the scale factor of the universe, as usual chosen to correspond to the radius of curvature. The observer is located at the top of the circle at O and observes an object located at x. D P , the length of the arc, is the proper distance to that object. For η = 1, the angular-size and luminosity distances (as well as other distances not discussed here such as the proper-motion distance and parallax distance) depend on r = R sin(χ) in a relatively simple manner (see KHS). Note that χ is constant in time; one can use it or σ = r/R, which is also constant in time, as the basis for a so-called co-moving distance.

At small redshifts, as the redshift increases, the object was farther away (in proper distance) when the light was emitted, thus the angular-size distance increases with redshift. However, at large redshifts, light was emitted when the proper distance was small, long ago, but, due to the more rapid expansion of the universe in the past, is reaching the observer just now. Thus, at large redshifts, the angular-size distance, being the proper distance when the light was emitted, is small. This explains the 'remarkable' maximum. Another way of thinking of this is that the angular-size distance approaches zero as z approaches 0, but also as z approaches ∞, because the scale factor R (see Fig. 1) approaches 0 in such cases; in other words, the maximum in the angular-size distance depends on a finite particle horizon. Of course, not all cosmological models have a finite particle horizon and those that don't also have no maximum in the angular-size distance. This applies only to the standard distance, i.e. assuming complete homogeneity. For the ZKDR distance, it is of course possible that there is no maximum in the angular-size distance even though the universe has a particle horizon.
The above explanation is exact in a spatially flat universe, thus contradicting the claim that the maximum is somehow caused by the curvature of space. With spatial curvature, the angular-size distance corresponds not to the proper distance when the light was emitted, but rather to the coordinate distance r, defined as the product of the scale factor R and sin(χ), χ, or sinh(χ) for k equal to +1, 0, or −1, i.e. positive, zero, or negative spatial curvature, respectively; χ = D P /R (see Fig. 1). This is analogous to the correction applied due to the curvature of the surface of the Earth when calculating the length along a parallel of latitude from the difference in longitude between the ends; the length (in the limit of small θ) is not D P θ but rather R sin(χ)θ, where χ is D P /R, D P being the distance measured along the surface of the Earth ('as the crow flies') and R the radius of the Earth (assumed to be perfectly spherical). (Note that χ = π/2 − φ, where φ is the geographic latitude, if we think of the observer as being at the north pole.) Thus, this distance at first increases with increasing D P , though more slowly than in the flat case, reaches a maximum at the equator, then decreases to zero at the opposite pole. Fig. 1 illustrates various distances. One can see that for small χ, D P and r are approximately the same (exactly so in the limit D P = r = 0). When χ reaches 90°, r (and hence the angular-size distance) reaches its maximum. For larger χ, the angular-size distance decreases, reaching 0 for χ = 180°. It then increases again, reaching its maximum again at χ = 270°, then decreases again, reaching 0 at χ = 360°. The maximum value of χ depends on the cosmological model. Light travels along the circle from x to O. In an expanding universe, R was smaller when the light was emitted, hence, the distance defined via light-travel time is smaller than D P , while they coincide in a static universe.
The ratio R 0 /R e , the scale factor now compared to the scale factor when the light was emitted, is equal to 1 + z. Distances related to r depend on η, while D P and the distance defined via light-travel time do not (the latter at least to a very good approximation). (More precisely, the angular-size distance and other distances can be calculated relatively easily from r for η = 1. For η ≠ 1, r still exists, but the relation between r and the angle defining the distance is changed, so the distance can no longer be simply calculated from r.) For η = 1, since D A = r/(1 + z), it is clear that for z = ∞ the angular-size distance must be zero.
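The relation between χ, the coordinate distance r, and the angular-size distance for η = 1 can be sketched in a few lines. This is only an illustrative sketch: the function names are hypothetical, R is set to 1 by default, and r is assumed to be evaluated with the present-day scale factor R 0 so that D A = r/(1 + z) applies as stated above.

```python
import math

def coordinate_distance(chi, k, R=1.0):
    """r = R sin(chi), R chi, or R sinh(chi) for k = +1, 0, -1."""
    if k == 1:
        return R * math.sin(chi)
    if k == 0:
        return R * chi
    if k == -1:
        return R * math.sinh(chi)
    raise ValueError("k must be +1, 0, or -1")

def angular_size_distance(chi, k, z, R0=1.0):
    """D_A = r/(1 + z) for eta = 1 (assumption: r uses the present scale factor R0)."""
    return coordinate_distance(chi, k, R0) / (1.0 + z)

# For small chi the three geometries agree; for k = +1, r peaks at chi = 90 degrees.
for deg in (1, 45, 90, 135, 179):
    chi = math.radians(deg)
    print(deg, [round(coordinate_distance(chi, k), 4) for k in (1, 0, -1)])
```

The loop makes the statement in the text concrete: for k = +1 the printed r rises up to 90° and falls again towards 180°, while for k = 0 and k = −1 it grows without a maximum.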
There is another mechanism for the presence of a maximum in the angular-size distance. Consider first a static universe with positive spatial curvature (the Einstein model) and an observer at the 'pole'. For increasing proper distance, the angular size of a standard rod first decreases (i.e. the angular-size distance increases) up to a minimum at the 'equator', then increases again, becoming infinite at the opposite 'pole'. (This can continue indefinitely, with the angular size decreasing again as the proper distance further increases until the 'equator' is reached (but at the 'opposite side'), then increasing again until the object returns back to the observer, then decreasing again during the second loop around the universe, and so on.) Of course, in a static model there is no redshift, but there are quasi-static models where the universe expands very slowly. Large differences in proper distance correspond to small differences in redshift, and hence small differences in the scale factor at the time the light was emitted. If light is received from an object near the opposite 'pole', it will obviously have a much smaller angular-size distance than one near the 'equator', even though the scale factor was only slightly smaller when the light was emitted in the former case (thus it will have a slightly larger redshift). (Our Universe never went through such a quasi-static phase, so the first effect is more important in practice.) As noted above, the claim that the maximum is due to the curvature of space is strange, as it can exist in a flat universe; in particular, it exists in the first model considered in the paper, the Einstein-de Sitter model, which is spatially flat. (Perhaps he meant 'spacetime' rather than 'space'; it is not a wrong translation, since the original also has the Russian word for 'space' and not 'spacetime'.) The third point is more interesting: a maximum exists only if the beam is not empty.
Since Z64 seemed surprised that the maximum exists, while I have shown above that it is perfectly natural to expect it, perhaps a better formulation is that the maximum disappears in the empty-beam case. I return to this in Sect. 3.1.
We are so used to the redshift z as the principal observable quantity and proxy for distance that the use of ∆ = 1 − 1/(1 + z) is rather confusing. It does have the interesting property, though, that it ranges from 0 at the observer to a maximum of 1 for light emitted at the big bang. It follows from the simple definition given by equation (8) that ∆ = (ω 0 − ω 1 )/ω 0 , but note that 'ω 1 is the frequency of light received by the observer at time t 0 and ω 0 the frequency of light emitted by the object at time t'. Usually, '0' refers to the time of reception, but in any case various quantities almost always have the same indices to refer to the same times. 6 This is not a misprint, though, since other formulae which follow from this definition can be shown to be equivalent to more-familiar formulae; e.g. equation (9) corresponds to equation (B24) in the paper by KHS for Ω 0 = 1, i.e. the angular-size distance in the (completely homogeneous) Einstein-de Sitter model, denoted by f in Z64; in modern notation, this is $D_A = \frac{2c}{H_0}\left[(1+z)^{-1} - (1+z)^{-3/2}\right]$.
There is something of a misprint in equation (2) of Z64: the character before the exponent '2' in the denominator, which looks like a dagger in a slanted font, should be t, and it is not part of the exponent, i.e. correct is $\rho = \frac{1}{6\pi\kappa t^{2}}$; perhaps the lower part of the t has not been printed; this is correct in the Russian original (Zel'dovich 1964a). (Equation (2) in Z64 follows from the standard definition Ω = (8πGρ)/(3H 2 ) for Ω = 1 and using equation (3) to express H in terms of t.)
The bulk of the paper is the appendix (two and a half pages), which contains all the equations. The first one-and-one-half pages are essentially a non-mathematical summary, but also include several interesting points.
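For readers who prefer to see the ∆ variable in action, the following sketch converts between ∆ and z and evaluates the standard (η = 1) Einstein-de Sitter angular-size distance as a function of ∆. The function names are hypothetical, distances are in units of c/H 0 , and the closed form used is the textbook Einstein-de Sitter expression written with 1 − ∆ = 1/(1 + z); it is an assumption here rather than a transcription of Z64's equation (9).

```python
def delta_from_z(z):
    """Z64's variable: Delta = 1 - 1/(1 + z)."""
    return 1.0 - 1.0 / (1.0 + z)

def z_from_delta(delta):
    """Inverse of delta_from_z; diverges as Delta -> 1 (the particle horizon)."""
    return 1.0 / (1.0 - delta) - 1.0

def eds_standard_distance(delta):
    """D_A H0/c for eta = 1 in the Einstein-de Sitter model, in terms of Delta."""
    a = 1.0 - delta                  # a = 1/(1 + z)
    return 2.0 * (a - a ** 1.5)

# Coarse grid search for the maximum of the standard distance.
deltas = [i / 100000 for i in range(100000)]
delta_max = max(deltas, key=eds_standard_distance)
print(delta_max, z_from_delta(delta_max), eds_standard_distance(delta_max))
```

Analytically, the maximum lies at 1 − ∆ = 4/9, i.e. ∆ = 5/9 and z = 5/4, where the distance is 8/27 in units of c/H 0 ; the grid search should land close to these values.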
Figures 3 and 4 are never referred to in the text (neither in the translation nor in the original). Figure 3 seems to show the case in which the two ends of an object are multiply imaged, while figure 4 seems just to show the definition of an angle. Note that figure 6 is incorrect in that it appears that f 1 has a maximum for ∆ < 1; the angular-size distance never has a maximum for η = 0 (see Sect. 3).
The intergalactic medium is said to contain neutrinos and gravitons. Interestingly, gravitons have a rest-mass of 0 and neutrinos were believed to as well when the paper was written. Such particles thus correspond to a different equation of state (w = 1/3), though in this case that is irrelevant since the density is assumed to be negligible. If the density of such matter is not negligible, then matters become more complicated. In general, the term ρ above is ρ+ p, where p is the pressure. In the case of ordinary matter ('dust'), p = 0, hence ρ is sufficient. In the case of the cosmological constant, which can be thought of as a perfect fluid with ρ = −p, the two terms cancel; the only effect of the cosmological constant on the ZKDR distance is due to its effect on the expansion history of the universe. Other equations of state can in principle be taken into account in the ZKDR ansatz by including the corresponding ρ+p terms, but in such cases the concept of a single parameter η would be inappropriate, since one would not expect the various components to clump in the same manner.
Gravitational lensing is mentioned for the case when there is a galaxy within the cone, confusingly citing Fritz Zwicky (see below in this section). In this case, it is noted that no general expression can be derived, but that (the equivalent of) the angular-size distance as a function of z is given by a weighted mean, though this is not defined, much less derived. Though worded somewhat confusingly, it is pointed out that a mass outside the cone acts, to first order, as a pure-shear gravitational lens, distorting though not changing the area subtended by the object and (due to the conservation of surface brightness in gravitational lensing 7 , implicitly assumed here) thus also not changing the apparent magnitude.
His equation (11) looks suspicious because the right-hand side of 1200 appears to be too round a number. To the same precision, the correct value is 1184. (A more precise value is 1184.365. Of course, this much precision is not needed, but usually all quoted figures are correct. If two significant figures are sufficient, then 1.2 × 10 3 would make more sense.) Units, not explicitly mentioned, are Mpc. (Note that the units of H are 'km/sec · Mps', normally written 'km/s/Mpc' or 'km/(s·Mpc)' or 'km s −1 Mpc −1 '.) As mentioned above, it is noted that 'the function f 1 increases monotonically right up to the [particle] horizon (∆ = 1) where it reaches the value 2/5'. However, the plot of this function in figure 6 clearly shows a maximum for ∆ < 1, after which the value decreases somewhat.
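The numbers 1184 and 1599 can be reproduced with a few lines of arithmetic. The sketch below rests on two assumptions that are not stated in the text above: that the quantity in equation (11) is the maximum of the standard Einstein-de Sitter angular-size distance, 8/27 in units of c/H 0 , and that the Hubble constant is 75 km/s/Mpc. Both are inferred only from the fact that they reproduce the quoted values.

```python
C_KM_S = 299792.458   # speed of light in km/s
H0 = 75.0             # km/s/Mpc; inferred assumption, not stated in the text

hubble_length = C_KM_S / H0                       # c/H0 in Mpc

# Maximum of the standard (eta = 1) Einstein-de Sitter distance: (8/27) c/H0.
max_standard = (8.0 / 27.0) * hubble_length

# Empty-beam (eta = 0) distance at the particle horizon: (2/5) c/H0.
empty_beam_horizon = (2.0 / 5.0) * hubble_length

print(round(max_standard, 3), round(empty_beam_horizon, 1))
```

With these assumptions the first value rounds to 1184 and the second to 1599, matching the corrected figures given in the text.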
The reference Zwicky (1937c) is wrongly assigned the year 1927. That reference contains only a very short and general discussion on 'nebulae as gravitational lenses' and does not address the phenomena mentioned in the text. It does say that a more detailed description will be provided in Helv. Phys. Act., but that is not the paper in that journal mentioned in reference 4 (Zwicky 1933), a paper in German on various aspects of the redshift of extragalactic nebulae, which doesn't mention gravitational lensing at all; among other things, Zwicky points out that the dispersion of velocities of galaxies in the Coma cluster indicates that the density of dark matter must be at least 400 times that of luminous matter, and, of course, this was written before Zwicky (1937c). Zwicky (1937a,b) are the (two short) papers which discuss nebulae as gravitational lenses, both cited by Zwicky (1937c).

7 Since gravitational lensing conserves surface brightness, magnification (increase in area) implies amplification (increase in apparent brightness, i.e. energy per time from the source received at the observer). In this sense, the terms are interchangeable. However, one or the other term can be more appropriate depending on the phenomenon discussed, e.g. 'amplification' when discussing the change in apparent magnitude of a lensed source and 'magnification' when discussing the size of an extended source. In the case of the number of sources in a certain range of apparent magnitude in a given area of sky, both effects play a role, and whether there is an increase or decrease depends on the luminosity function.
Discussion
Z64 presented analytic formulae for the angular-size distance for three cosmological models: Ω 0 = 1 and λ 0 = 0 (Einstein-de Sitter) for the values η = 1 (standard distance) and η = 0 (the main result of that work), as well as for Ω 0 = 0 and λ 0 = 0 (the general-relativistic equivalent of the Milne model; since the density is 0 in this case, the value of η doesn't matter).
Z64 alerted people to the fact that the standard distances, which assume complete homogeneity, are perhaps not appropriate, and demonstrated that effects due to a universe with small-scale inhomogeneities can be appreciable. He also introduced the idea of calculating the effect as a negative gravitational-lens effect, based on simplifying assumptions rather than calculating it for an analytically soluble (but perhaps less realistic) case.

DASHEVSKII & ZEL'DOVICH (1965)
Dashevskii & Zel'dovich (1965, hereafter DZ65) 8 derived an expression for the angular-size distance for the case of a completely empty beam for arbitrary values of Ω 0 (λ 0 = 0 is still assumed). Compared to Z64, it is more general with respect to the large-scale cosmological model. They noted that the expression does not have a maximum.

Summary
As noted above, Z64 claimed that the maximum in the angular-size distance (in the case of the Einstein-de Sitter model studied) 'is caused by the curvature of space due to the matter filling the universe'. This is somewhat dubious, since the Einstein-de Sitter model is spatially flat. DZ65 have a perhaps somewhat better formulation, claiming that the 'effect depends on the bending of light rays by matter present within the light cone' and assert that 'it follows from this that for objects in whose light cone there is by chance no matter there should be no minimum angular diameter right up to the [particle] horizon'. The claims are true, but one can ask whether their explanation is the best one. (As will be discussed in Sect. 4.1, there is always a maximum as long as the beam is not completely empty, though the emptier the beam, the higher the redshift of the maximum.) In addition to the wider range of cosmological models considered, DZ65 derive the expression via a different, though equivalent, route. No analytic solutions are presented, but f and f 1 are plotted as functions of ∆ for a few values of Ω 0 , and for a few more values of Ω 0 , ∆ max (the value of ∆ at which the maximum in the angular-size distance for η = 1 occurs) and the values of f at ∆ max and f 1 at ∆ = 1 are tabulated. In addition, there is a column for Ω 0 = ∞, where ∆ max = 0.25, f (∆ max ) = 0.65/ √ Ω and f 1 (∆ = 1) = 1.18/ √ Ω. This is not mentioned in the text, but is apparently an approximation for Ω 0 ≫ 1. I have checked this numerically and found that their approximation answers pretty nearly. (Of course, ∆ max also depends on Ω 0 , though less sensitively than f (∆ max ) and f 1 (∆ = 1).) Note that an analytic solution, though a rather complicated one, for the case λ 0 = 0 and η = 0 does exist, first derived by Dyer & Roeder (1972); cf. KHS, equation (B15). For λ 0 = 0 and η = 1, the formulae derived by Mattig (1958) apply. 
Several interesting features are pointed out in the text and/or are obvious from the figure (if Ω 0 is not mentioned, then the effect is independent of the value of Ω 0 ):
• The angular-size distance for η = 0 increases monotonically with redshift.
• The angular-size distance for η = 0 is less than the light-travel-time distance c(t 0 − t) and larger than the angular-size distance for η = 1 (at least for λ 0 = 0).
• The angular-size distance for η = 0 has its maximum value at z = ∞.
• The angular-size distance for η = 1 has a maximum at z < ∞.
• The value of the maximum of the angular-size distance for η = 1 increases with decreasing Ω 0 .
• The redshift of the maximum of the angular-size distance for η = 1 increases with decreasing Ω 0 .
• Both for η = 0 and η = 1, the value of D A at any redshift increases with decreasing Ω 0 .
• For given values of Ω 0 and z, D A for η = 0 is always larger than D A for η = 1.
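Several of the features listed above can be checked numerically for λ 0 = 0. The sketch below is not taken from DZ65: it assumes Mattig's closed form for η = 1, and for η = 0 it integrates the empty-beam case written in terms of a = 1/(1 + z), in which the distance is the integral of a^(3/2)/sqrt(Ω 0 + (1 − Ω 0 )a) from the emission value of a to 1. The light-travel-time distance uses the same integrand with a^(1/2). All function names are hypothetical and distances are in units of c/H 0 .

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def d_a_filled(z, omega0):
    """eta = 1, lambda0 = 0: Mattig's closed form (assumed standard result)."""
    root = math.sqrt(1.0 + omega0 * z)
    numerator = omega0 * z + (omega0 - 2.0) * (root - 1.0)
    return 2.0 * numerator / (omega0 ** 2 * (1.0 + z) ** 2)

def d_a_empty(z, omega0):
    """eta = 0, lambda0 = 0: empty-beam angular-size distance."""
    f = lambda a: a ** 1.5 / math.sqrt(omega0 + (1.0 - omega0) * a)
    return simpson(f, 1.0 / (1.0 + z), 1.0)

def light_travel_distance(z, omega0):
    """c (t0 - t) for lambda0 = 0."""
    f = lambda a: math.sqrt(a) / math.sqrt(omega0 + (1.0 - omega0) * a)
    return simpson(f, 1.0 / (1.0 + z), 1.0)

for omega0 in (0.1, 1.0, 10.0):
    for z in (0.5, 1.0, 3.0, 10.0):
        print(omega0, z, d_a_filled(z, omega0), d_a_empty(z, omega0),
              light_travel_distance(z, omega0))
```

For each printed row, the η = 1 distance should be the smallest and the light-travel-time distance the largest, and at fixed z all three should grow as Ω 0 decreases, in line with the list above.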
DZ65 end with remarks on the 'validity of the method proposed in the paper', the validity being guaranteed by the fact that they 'are adding small effects in the linear region'.

Remarks
The title is also confusing, since there is no paper with a similar title but with 'I' instead of 'II'. It is clear from the first sentence, though, that Paper I is Z64. The theme of confusing notation continues. What Z64 called r, DZ65 call z. While r is often used for a length of some sort, this is less common for z. Of course, the fact that z is normally used for the redshift adds to the confusion. What Z64 called Θ, DZ65 call φ. DZ65 adopt the usual convention of using the suffix 0 to denote the present time, in this case the time of observation and the time the radiation reaches the observer. Hence, what Z64 called ω 1 , DZ65 call ω 0 , and what Z64 called ω 0 , DZ65 call ω t .
Criticizing Wheeler (1958), DZ65 note that the claim that the maximum occurs only in the case of a spatially closed universe is wrong.
I have calculated the values in their table 1, but in two cases find different values, namely 0.42 (0.421) instead of 0.40 for f (∆ max ) for Ω 0 = 1/10, and 0.24 (0.237) instead of 0.23 for f 1 (∆ = 1) for Ω 0 = 10. I suspect that the former is a misprint while the latter could be as well, or possibly due to roundoff error in a less accurate numerical calculation.
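The two recalculated table entries can be reproduced independently with a short script. This is a sketch under stated assumptions, not the calculation actually used for the check described above: f is taken to be Mattig's closed form for η = 1 and λ 0 = 0, f 1 (∆ = 1) is taken to be the integral of a^(3/2)/sqrt(Ω 0 + (1 − Ω 0 )a) over 0 < a < 1, and the function names and grid resolutions are arbitrary choices.

```python
import math

def f_standard(z, omega0):
    """eta = 1, lambda0 = 0 angular-size distance in units of c/H0 (Mattig form)."""
    root = math.sqrt(1.0 + omega0 * z)
    numerator = omega0 * z + (omega0 - 2.0) * (root - 1.0)
    return 2.0 * numerator / (omega0 ** 2 * (1.0 + z) ** 2)

def f1_at_horizon(omega0, n=20000):
    """eta = 0 distance at Delta = 1, via the midpoint rule on 0 < a < 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * h
        total += a ** 1.5 / math.sqrt(omega0 + (1.0 - omega0) * a)
    return total * h

# Maximum of f over a redshift grid 0 < z <= 20 for Omega_0 = 1/10.
f_max = max(f_standard(0.001 * i, 0.1) for i in range(1, 20001))
f1_horizon = f1_at_horizon(10.0)
print(round(f_max, 3), round(f1_horizon, 3))
```

Under these assumptions the script should give values consistent with 0.421 and 0.237 rather than with the tabulated 0.40 and 0.23.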
Discussion
DZ65 presented an integral for the angular-size distance for cosmological models with λ 0 = 0 but arbitrary Ω 0 for η = 0 and compared the corresponding distances to those with η = 1. Although no analytic solution was presented, DZ65 extended to η = 0 the idea of calculating distances for various values of Ω 0 (though still setting λ 0 = 0). Around the same time, much more extensive numerical calculations were done by Refsdal et al. (1967), only for η = 1 but for several values of Ω 0 and λ 0 .

DASHEVSKII & SLYSH (1966)
Dashevskii & Slysh (1966, hereafter DS66) 10 generalized the method of Z64 and DZ65 to the more realistic case that the beam is not completely empty, but only for the Einstein-de Sitter model.

Summary
The empty-beam case is criticized as being too unrealistic, as there will always be some intergalactic matter; this will mean that there will always be a maximum in the angular-size distance. DS66 derive, in their equation (2), the second-order differential equation which is the basis for all further work in this field 'which determines the linear distance z(t) between rays', with ρ g = αρ (the subscript g refers to the smooth component, considered as a 'gas at zero pressure that fills all space uniformly' [my emphasis], the rest of the 'matter being concentrated in discrete galaxies'); a is the scale factor and G the gravitational constant. Compared to Z64 and DZ65, they allow α (in the notation of KHS, η) to take an arbitrary value 0 ≤ α ≤ 1; η is thus completely general. The cosmological model is implicit in the term ȧ/a, in principle allowing one to study any cosmological model in which ȧ/a can be calculated, but DS66 then restrict themselves to the Einstein-de Sitter model for the subsequent discussion, presenting a completely analytic solution for the angular-size distance for this cosmological model, namely the first unnumbered equation in DS66, which is a generalization of equation (10) in Z64. DS66 point out that, for arbitrary 0 < η ≤ 1, the angular-size distance has a maximum at finite z and the angular-size distance goes to 0 for z = ∞. Also, the smaller the fraction of homogeneously distributed matter, i.e. the smaller η, the higher the redshift of this maximum. Without proof, it is stated that this result also holds in the case of non-zero pressure.
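The structure of this approach can be illustrated with a short numerical experiment. The differential equation itself is not reproduced in the text above; the sketch below assumes the form l'' − (ȧ/a) l' + 4πG ρ g l = 0 for the ray separation l(t), which for the Einstein-de Sitter model with t 0 = 1 becomes l'' − (2/(3t)) l' + η (2/(3t^2)) l = 0. It also assumes the closed form D A H 0 /c = (2/β)(1 + z)^((β − 5)/4) [1 − (1 + z)^(−β/2)] with β = sqrt(25 − 24η), as given in the later literature. Both should be checked against DS66; the function names are hypothetical.

```python
import math

def zkdr_eds_analytic(z, eta):
    """Assumed closed form for the Einstein-de Sitter model, units of c/H0."""
    beta = math.sqrt(25.0 - 24.0 * eta)
    x = 1.0 + z
    return (2.0 / beta) * x ** ((beta - 5.0) / 4.0) * (1.0 - x ** (-beta / 2.0))

def zkdr_eds_numeric(z, eta, steps=4000):
    """RK4 integration of the assumed ray-separation equation, backwards in time.

    Units: c = t0 = 1 and unit angle, so l = 0 and dl/dt = -1 at the observer.
    """
    def rhs(t, l, v):
        return v, (2.0 / (3.0 * t)) * v - eta * (2.0 / (3.0 * t * t)) * l

    t_emit = (1.0 + z) ** -1.5          # a = t^(2/3) in Einstein-de Sitter
    h = (t_emit - 1.0) / steps          # negative step: from t0 back to t_emit
    t, l, v = 1.0, 0.0, -1.0
    for _ in range(steps):
        k1l, k1v = rhs(t, l, v)
        k2l, k2v = rhs(t + h / 2, l + h / 2 * k1l, v + h / 2 * k1v)
        k3l, k3v = rhs(t + h / 2, l + h / 2 * k2l, v + h / 2 * k2v)
        k4l, k4v = rhs(t + h, l + h * k3l, v + h * k3v)
        l += h / 6 * (k1l + 2 * k2l + 2 * k3l + k4l)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return (2.0 / 3.0) * l              # c t0 = (2/3) c/H0 in Einstein-de Sitter

for eta in (0.0, 0.5, 1.0):
    print(eta, zkdr_eds_numeric(3.0, eta), zkdr_eds_analytic(3.0, eta))
```

For η = 1 the assumed closed form reduces to the standard Einstein-de Sitter distance and for η = 0 to (2/5)[1 − (1 + z)^(−5/2)], so the two limiting cases discussed earlier serve as a consistency check on both the equation and the formula.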

Remarks
It is not clear why equation (3) is the last numbered equation; perhaps because the following equations are not referred to in the text (but, like the others, are of course part of the text). Also confusing is the expression 0 ≤ α ≤ 1.1 ≤ k ≤ 5, which should be 0 ≤ α ≤ 1, 1 ≤ k ≤ 5. As in Z64, f 1 ,¹¹ i.e. the angular-size distance for η = 0, is incorrectly shown as having a maximum at finite z (a mistake also made by DZ65, though barely perceptibly; in all cases, this is probably due to the figures having been drawn by hand). Also, there should be no inflection in the dashed curve.

Discussion
The generalization to an arbitrary value of η is obvious; less obvious is the relatively simple analytic solution for arbitrary η for the Einstein-de Sitter model.

OTHER PAPERS I
(Being discussed in an 'other papers' section does not imply that the paper lacks quality or influence; quite the opposite, in fact. Rather, these sections discuss papers which are not directly relevant to the main theme of this review, but nevertheless played some role in it.)
Kristian & Sachs (1966) discuss what I like to call 'theoretical observational cosmology' for very general (i.e. anisotropic, inhomogeneous) cosmological models, not necessarily based on general relativity (GR) (of which the FRW models, i.e. homogeneous and isotropic models based on GR, are special cases), mainly for inhomogeneities on the scale of 10^9 light-years or more (with small-scale inhomogeneities considered to be smoothed out, i.e. in some sense the reverse of the assumptions above). Many results, after 'straightforward, though somewhat tedious' calculations, are given in terms of series expansions. A key result is that the relation dA = r^2 dΩ (in their notation), where 'dA is the intrinsic cross-sectional area of a distant object; r is a measured quantity, the "corrected luminosity distance," defined by equation (19); and dΩ is the measured solid angle subtended by the distant object', is very general and holds in all cosmological models, whether or not they are based on GR. At the time, observations were not good enough for one to be sure that the Universe is actually very well described by an FRW model, hence the emphasis on generality and the discussion of possible observations which could be used to determine the many more parameters than those needed to specify an FRW model.
Bertotti (1966) cites Z64 and DZ65 (erroneously making Dashevskii an author of Z64 as well), but considers not just the increase in the angular-size distance as compared with the standard FRW case, but also the decrease (corresponding to amplification) due to the gravitational-lens effect, both strong lensing and weak lensing, i.e. 'the small, but distance-dependent, brightening caused by near galaxies' which leads to a 'statistical spread in luminosity', shown to be proportional to (D A)^3 for small distances. The main result is an expression for apparent luminosity as a function of redshift, noting that, in the inhomogeneous case, the first correction is quadratic in redshift and produces a dimming, but for higher values of z the brightening due to gravitational lensing becomes more important. That expression is for arbitrary Ω 0 ¹² and arbitrary η (called f), i.e. the case considered by DS66,¹³ but expressed as a series expansion. It is also noted that, to first order, the correction to the Euclidean expression for the number of sources brighter than a given apparent luminosity does not depend on η.
¹² As was the custom at the time, this was written in terms of q 0, i.e. q 0 = Ω 0 /2 under the assumption λ 0 = 0. The reason that q 0 (in general, q 0 = Ω 0 /2 − λ 0 or, as was common at the time, q 0 = σ 0 − λ 0, where σ 0 = Ω 0 /2) was used is that q 0 is, after H 0, the next-higher term in series expansions of observational quantities as a function of redshift (e.g. Hoyle & Sandage 1956).
¹³ Since Bertotti (1966) was submitted around the time that DS66 appeared, presumably the former was derived independently of the latter and vice versa.
Gunn (1967a) also examined statistical fluctuations due to gravitational lensing, but in position, not apparent magnitude. This was done in more detail by Fukushige & Makino (1994), who pointed out that 'the distance between nearby photons grows exponentially because the two rays suffer coherent scatterings by the same scattering object'. Gunn (1967b) extended the discussion to fluctuations in apparent magnitude. Feynman, in a colloquium at Caltech, had discussed a scenario similar to that discussed by Z64, concentrating on the effects on angular diameters, apparently not realizing that apparent magnitude would also be affected. For the topic of this review, the most important result is the realization that, for large-enough redshifts, average luminosities and angular sizes will be the same as in the strictly homogeneous case, because not all lines of sight can be underdense, though there will be a scatter in their values compared to those in a strictly homogeneous universe.
Babul & Lee (1991) discussed Gunn's formalism in more modern notation, adopting some simplification and deriving some new analytic results. Although only the Einstein-de Sitter model was considered (with, as extreme positions, a spectrum of mass fluctuations derived from CDM and a white-noise spectrum), their conclusions probably apply more generally, namely that the dispersion in amplification due to large-scale structure is negligible, while that on small scales depends strongly on the nature of the distribution.
Refsdal (1970) also discussed changes in the apparent luminosity and shape of distant light sources due to intervening inhomogeneities, but using a numerical raytracing approach rather than the more analytic methods of the works discussed above. (As would become clear later, this allows the effect of very concentrated masses, e.g. stars, to be taken into account, as well as general fluctuations due to galaxies and large-scale structure. In other words, it can handle strong lensing as well.) Raytracing simulations were done for a static flat universe (with all the mass in point masses, i.e. η = 0), but the results were generalized to an interesting collection of cosmological models: Einstein's static universe,¹⁴ two models with λ 0 = 0 (Ω 0 = 0.3 and Ω 0 = 2), and a model with Ω 0 = 0.4 and λ 0 = 1.7 (a spatially closed model which will expand forever with an antipode at z ≈ 4).
In retrospect, one conclusion was very prescient: 'An interesting aspect of the problem is the possibility of using the effect to obtain information on the mass distribution in the Universe. Even if the effect is not observable after some systematic efforts to detect it, one should be able to determine upper limits on the number of condensed and massive objects in the Universe.'
Press & Gunn (1973) pointed out that (at least for λ 0 = 0) if Ω 0 is due mainly to compact objects, then the probability is high that a distant source will be multiply imaged, independently of the mass of the objects (which does, of course, set the scale of the image separation). (At the time, it was not clear that most of Ω 0 consists of non-baryonic matter, and, since arguments against a substantial density of intergalactic gas had been presented, it seemed natural to look for the missing matter in compact objects.) A more detailed analysis shows that the lack of dependence on the mass is exact, while the image separation has a weak dependence on Ω 0.¹⁵ In contrast to the other papers in this section and that in the next section, the emphasis is on detecting the scattering masses, not the influence of those masses on observable properties of the sources. Nevertheless, the ZKDR distance was used, in particular the extreme empty-beam case, with the lensing effect of individual clumps explicitly taken into account.
6. KANTOWSKI (1969)
Kantowski (1969, hereafter K69) took a somewhat different approach, using Swiss-cheese models (Einstein & Straus 1945, 1946). These are arguably less realistic than the approximation used in the papers discussed above, since in these models clumps of matter are surrounded by voids with ρ = 0. However, since these models are exact solutions of the Einstein field equations, the validity of approximations used to calculate the angular-size distance is not an issue (though, of course, one can question the validity of this approximation to the distribution of matter).

6.1. Summary
'The Swiss-cheese models are constructed by taking a Friedmann model (p = Λ = 0), randomly removing comoving spheres from the dust, and placing Schwarzschild masses at the "center" of the holes.' K69 makes five realistic assumptions in order to facilitate calculations: the Schwarzschild radii of the clumps are very small compared to their opaque radii, the size of the Swiss-cheese hole is much larger than the opaque radius, the change

7. DYER & ROEDER (1972)
Dyer & Roeder (1972, hereafter DR72) discussed the completely empty-beam case; despite starting out with an expression for arbitrary Ω 0 and λ 0 (using the standard notation at the time with σ 0 = Ω 0 /2 and q 0 = σ 0 − λ 0), results were presented only for σ 0 = q 0, i.e. Λ = 0 (and hence λ 0 = 0).

Summary
For an integral expression for the angular-size distance, analytic solutions are presented for the three cases Ω 0 < 1, Ω 0 = 1, and Ω 0 > 1; only the much simpler solutions for Ω 0 = 1 (Z64) and Ω 0 = 0 (Mattig 1958; see also Z64) were previously known. As was also pointed out by DZ65, there is no maximum in the angular-size distance for η = 0. The famous result of Etherington (1933) is invoked to note that an empty beam leads to a lower apparent luminosity which, as discussed by Kantowski (1969), leads one to underestimate q 0 if a completely homogeneous universe is assumed; their example has a real value of q 0 = 1.82 which, if calculated assuming a completely homogeneous universe, results in the value q 0 = 1.40. Kantowski (1969) had a real value of q 0 = 2.2 being interpreted as q 0 = 1.5. The exact numbers are not important; the point is that, to first order, the ZKDR distance is larger than in the standard case, which is also the case for a lower value of q 0. But this is only to first order; with higher-redshift data, the two effects are not degenerate. It is also shown that, while the difference between the ZKDR distance and the standard distance is non-negligible, there is little difference between the ZKDR distance and that obtained by numerical integration in a corresponding Swiss-cheese model (which, as mentioned above, is not an exactly equivalent model).
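The size of the dimming can be illustrated with a minimal sketch. It assumes the Einstein-de Sitter model (chosen only because both limiting cases have simple closed forms), distances in units of c/H0, and the Etherington relation in the form D_L = (1 + z)^2 D_A. The function names are hypothetical and the numbers are purely illustrative; they are not the values quoted from DR72 or Kantowski (1969).

```python
import math

def d_ang_filled(z):
    """Standard (eta = 1) angular-size distance, Einstein-de Sitter, units of c/H0."""
    return 2.0 / (1.0 + z) * (1.0 - 1.0 / math.sqrt(1.0 + z))

def d_ang_empty(z):
    """Empty-beam (eta = 0) angular-size distance, Einstein-de Sitter, units of c/H0."""
    return 0.4 * (1.0 - (1.0 + z) ** -2.5)

def d_lum(d_ang, z):
    """Etherington reciprocity: D_L = (1 + z)^2 D_A."""
    return (1.0 + z) ** 2 * d_ang

def dimming_mag(z):
    """Magnitude difference (empty beam minus filled beam) at redshift z."""
    return 5.0 * math.log10(d_lum(d_ang_empty(z), z) / d_lum(d_ang_filled(z), z))

if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0, 2.0):
        print(f"z = {z:3.1f}: empty beam fainter by {dimming_mag(z):.3f} mag")
```

Because the factor (1 + z)^2 is common to both cases, the magnitude difference depends only on the ratio of the angular-size distances; it vanishes at low redshift and grows with z, which is why the effect can mimic a change in q 0 only to first order.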

Remarks
Compared to the papers discussed above, especially the first three, there is much less emphasis on physical models and more on mathematical results. Also, comparisons are done between a relatively simple formula and a more involved numerical integration based on a more complicated mass distribution. Dyer & Roeder (1972) covered the same ground as DZ65, but more thoroughly, presenting an analytic solution.

Discussion
The distance for an empty or partially filled beam has become known as the Dyer-Roeder distance, although various aspects had been discussed before. This is probably due to the fact that the corresponding papers were published in a major English-language journal, used standard notation, and were more concerned with results than with theory. Dyer and Roeder were certainly responsible for putting the topic on the agenda of many astronomers. However, for the reasons outlined above, I refer to this distance as the ZKDR distance.

Summary
As in DR72, general discussion is narrowed down by setting λ 0 = 0 before explicit solutions are presented. Second-order differential equations for both the angular-size distance and the luminosity distance are derived, though of course once one has a solution one can use the Etherington reciprocity relation to simply derive one from the other. Using a substitution, these are converted to hypergeometric equations.
The special case η = 1 is the solution derived by Mattig (1958) while that for η = 0 is that derived by DR72. New is a solution for η = 2/3, which is given for the luminosity distance. For Ω 0 = 1, one has the solution derived by DS66, which is given for the angular-size distance. Differentiation of that equation leads to an expression for the maximum in the angular-size distance, showing that as η goes from 1 to 0, the redshift of this maximum goes from 1.25 to ∞. The point first made by Z64, that the maximum is due to matter in the beam, is emphasized. (Note, however, that an arbitrarily small η will lead to a maximum, though at arbitrarily large z.) They suggest comparing observations with calculations for each of the three values of η for which there is an analytic solution, given the lack of knowledge about intergalactic matter. Finally, as in DR72, they note that calculations for Swiss-cheese models (interestingly, including λ 0 ≠ 0) confirm that this is a good approximation, i.e. 'the mass deficiency in the beam is in general much more important than the gravitational-lens effect for reasonable deflectors', at least for 'redshifts in the range of interest'.
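The movement of the maximum from z = 1.25 to ∞ can be checked with a short sketch. It assumes the Einstein-de Sitter closed form D_A = (2/β)[(1+z)^((β−5)/4) − (1+z)^(−(β+5)/4)] in units of c/H0, with β = sqrt(25 − 24η); this form reproduces the η = 1 and η = 0 limits mentioned in the text, but β and the function names are labels introduced here, not notation from DR73.

```python
import math

def beta(eta):
    """Shorthand beta = sqrt(25 - 24 eta) (a label introduced for this sketch)."""
    return math.sqrt(25.0 - 24.0 * eta)

def d_ang_eds(z, eta):
    """ZKDR angular-size distance, Einstein-de Sitter model, units of c/H0."""
    b = beta(eta)
    return (2.0 / b) * ((1.0 + z) ** ((b - 5.0) / 4.0)
                        - (1.0 + z) ** (-(b + 5.0) / 4.0))

def z_max(eta):
    """Redshift of the maximum of d_ang_eds, from setting dD_A/dz = 0.

    The condition reduces to (1 + z)^(beta/2) = (5 + beta)/(5 - beta),
    which has no finite solution for eta = 0 (beta = 5).
    """
    if eta <= 0.0:
        return math.inf
    b = beta(eta)
    return ((5.0 + b) / (5.0 - b)) ** (2.0 / b) - 1.0

if __name__ == "__main__":
    for eta in (1.0, 2.0 / 3.0, 0.5, 0.1, 0.01):
        print(f"eta = {eta:5.3f}: z_max = {z_max(eta):.3f}")
```

The values η = 1, 2/3 and 0 correspond to β = 1, 3 and 5, i.e. the three cases for which the text notes that analytic solutions exist.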

Remarks
There is a huge literature on hypergeometric functions, and many well known functions, including many used in physics, are special cases, but in general it is not possible to reduce hypergeometric functions to (combinations of) standard functions which are easily and efficiently calculated, either analytically or numerically. As such, the fact that the distance equations are hypergeometric equations is interesting, but (except for the analytically soluble special cases) of little practical use.

8.3. Discussion
DR73 can be seen as a combination of DZ65 and DS66, i.e. Ω 0 and η are both arbitrary (though λ 0 = 0 was still assumed).
Starting with the Einstein-de Sitter model, Z64 had investigated η = 0, presenting an analytic solution (as well as one for Ω 0 = 0, in which the value of η is irrelevant since there is no matter). DZ65 had expanded this to arbitrary Ω 0, though no analytic solution was presented. DS66 had returned to the Einstein-de Sitter model, but allowed η to be arbitrary. DR72 had covered the same ground as DZ65, but presented an analytic solution. Finally, DR73 addressed the most general case so far, with both Ω 0 and η as free parameters, and presented analytic results (most already known) for special cases.

Summary
After a short review of previous work on the topic, the method of Kantowski (1969) is extended to λ 0 ≠ 0. Essentially, λ 0 ≠ 0 affects the expansion history of the universe but nothing else; in particular, R 0 /R = 1 + z still holds. A second-order differential equation for (a quantity simply related to) the angular-size distance is presented, but no solution is given. It is noted that a 'series solution about z = 0 can be obtained', but the emphasis is on calculating the correction factor relative to the homogeneous model of the same mean density; there is a series expansion for this, but it breaks down by the time the redshift has become high enough for the effect to be interesting, so results have to be calculated numerically. With regard to distortion of the beam, they show that a beam retains its elliptical cross section, though orientation and ellipticity can change.
In Swiss-cheese models, the structure of the clumps must be taken into account, but for realistic assumptions (assuming that the clumps model galaxies), 'the calculations indicate that ... the distance-redshift relations do not differ significantly from the "zero-shear" relations discussed in [DR72 and DR73]. Similarly, the distortion effect has been found to be negligible in the range of redshifts observable at present, being at most a few percent.'
Although the Swiss-cheese models are perhaps unrealistic in that real galaxies are not usually surrounded by a region of lower than average density, they do show the potentially real effect that there is a dispersion in the distance calculated from redshift which increases with redshift. (In Sect. 19 it is discussed how important this is for our Universe.) Another important result is that the dependence of the distance-redshift relation on Ω 0 is increased for η ≈ 0, thus reducing the precision obtainable in practice. The earlier conclusion, mentioned above, that observations interpreted assuming η = 1 will underestimate q 0 if in fact η < 1, is repeated.

Remarks
Calculations involving the Swiss-cheese models are inherently statistical in nature and more complicated than those based on approximations. The models are even arguably less realistic. However, they are important because, being exact solutions to the Einstein equations, one does not have to worry about approximations. The fact that results are very similar to those based on simpler assumptions is encouraging, and provides justification for using the simpler approach. It could of course be the case that this approach is too simple for the real Universe, but in that case a Swiss-cheese model would also probably be too unrealistic.
9.3. Discussion
DR74 is interesting because it presents for the first time distance-redshift relations in a universe with arbitrary Ω 0, λ 0, and η. However, and not only because the calculations are based on Swiss-cheese models, no closed formulae are given.
Roeder (1975a) applied the work of DR73 to the data of Sandage & Hardy (1973), concluding that the value obtained for q 0 depends both on assumptions about (in)homogeneity and on galaxy evolution and suggesting q 0 > 0.5 if the conclusion of Gott et al. (1974) is assumed, namely that η ≈ 0. Roeder (1975b) applied the conclusions of DR73 to a claim by Hewish et al. (1974) that there is a lack of small-diameter sources at the largest redshifts, whereby they assume the standard angular-size distance. If η < 1, then the angular-size distance is larger than otherwise, and if one wrongly assumes η = 1, then one will underestimate the true physical size of the source. Thus, an inhomogeneous Universe is not a possible explanation of that claim; rather, it would exacerbate the problem.

FURTHER SOLUTIONS (ANALYTIC AND NUMERICAL) OF THE ZKDR DISTANCE
11.1. Kayser, Helbig & Schramm (1997)
Increasingly general equations (EQ), analytic solutions (AS), and numerical calculations (NC) had been presented in the 1960s and 1970s (AS implies EQ) (all but the last two below discussed above):
The only expression available for λ 0 ≠ 0 was a complicated differential equation derived by Dyer & Roeder (1974), but for Swiss-cheese models. No closed solution was presented. Of course, it can be integrated numerically. However, it is rather cumbersome, and the terms do not have an obvious physical interpretation like those in the differential equations of Z64 and DS66. While it was appreciated that Swiss-cheese models are in some sense equivalent to the ZKDR distance derived via the Zel'dovich method, this was not shown strictly until much later (Fleury 2014). Thus, between the work of Dyer & Roeder (1976) and Kantowski et al. (1995), work on the ZKDR distance concentrated mostly on understanding the approximation, applications (both in more-traditional cosmology and in gravitational lensing), and, to some extent, more-realistic models (this field would come into its own only later, when computer power allowed more-complicated scenarios to be investigated). However, the development of the basic ZKDR distance picked up again later. Kayser (1985) derived a differential equation for the angular-size distance in the style of Z64, DS66, and DR73, but for 0 ≤ η ≤ 1 and arbitrary values of λ 0 and Ω 0, which he integrated numerically via standard but basic means. Kayser et al. (1997) saw a need for an efficient numerical implementation of that equation, which is the most general equation for the ZKDR distance under the standard assumptions that the universe is a (just slightly) perturbed FRW model (i.e. no pressure, no dark energy more complicated than the cosmological constant, no back reaction, only Ricci (de)focussing; even today, there is no evidence that the first three are not excellent approximations, and the fourth is as well in many cases). Also, no efficient general implementation existed for the standard (η = 1) distance.¹⁶ Thus, a description of the differential equation derived by Kayser (1985) and of the efficient numerical implementation (using the Bulirsch-Stoer method in Fortran; see Helbig 1996 for technical details) evolved to include a general description of various types of cosmological distances and a compendium of analytic solutions, probably the first time all this information had been presented in a uniform notation. Despite being a numerical (though very efficient) implementation, it is only a factor of ≈ 3 slower than elliptic-integral solutions for η = 0 or η = 2/3 (Rollin Thomas, personal communication); for η = 1, the factor is ≈ 20. Of course, a comparison can be done only for those cases where elliptic-integral solutions exist, but the numerical-integration time for the differential equation, valid for all values of λ 0, Ω 0, and η, is essentially the same whether or not an elliptic-integral or analytic solution exists. (Analytic solutions are of course faster than elliptic-integral solutions, which can be described as semi-numerical or semi-analytic; in general, the elliptic-integral solutions do not work if there is an analytic solution (an exception being the expression for light-travel time in a flat universe).)
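To make the idea of integrating the general equation concrete, here is a minimal sketch in Python. It is not the Fortran code described above: it uses a fixed-step fourth-order Runge-Kutta scheme instead of the Bulirsch-Stoer method, and it assumes the ZKDR equation in the form Q D'' + (2Q/(1+z) + Q'/2) D' + (3/2) η Ω0 (1+z) D = 0 with Q(z) = Ω0(1+z)^3 − (Ω0 + λ0 − 1)(1+z)^2 + λ0, D(0) = 0 and D'(0) = 1 in units of c/H0. That form, and all function names, are assumptions of this sketch and should be checked against Kayser et al. (1997) before any serious use. The sketch integrates from the observer only and assumes Q > 0 over the whole interval.

```python
import math

def q_of_z(z, omega0, lambda0):
    """Q(z) = (H/H0)^2 for matter plus cosmological constant."""
    x = 1.0 + z
    return omega0 * x**3 - (omega0 + lambda0 - 1.0) * x**2 + lambda0

def dq_dz(z, omega0, lambda0):
    """Derivative of Q with respect to z."""
    x = 1.0 + z
    return 3.0 * omega0 * x**2 - 2.0 * (omega0 + lambda0 - 1.0) * x

def rhs(z, d, dp, omega0, lambda0, eta):
    """Return (D', D'') from the assumed second-order ZKDR equation."""
    x = 1.0 + z
    q = q_of_z(z, omega0, lambda0)
    dq = dq_dz(z, omega0, lambda0)
    ddp = -((2.0 * q / x + 0.5 * dq) * dp + 1.5 * eta * omega0 * x * d) / q
    return dp, ddp

def zkdr_distance(z_end, omega0, lambda0, eta, steps=2000):
    """Angular-size distance from z = 0 to z_end (units of c/H0), fixed-step RK4."""
    h = z_end / steps
    z, d, dp = 0.0, 0.0, 1.0  # initial conditions D(0) = 0, D'(0) = 1
    args = (omega0, lambda0, eta)
    for _ in range(steps):
        k1d, k1p = rhs(z, d, dp, *args)
        k2d, k2p = rhs(z + 0.5 * h, d + 0.5 * h * k1d, dp + 0.5 * h * k1p, *args)
        k3d, k3p = rhs(z + 0.5 * h, d + 0.5 * h * k2d, dp + 0.5 * h * k2p, *args)
        k4d, k4p = rhs(z + h, d + h * k3d, dp + h * k3p, *args)
        d += h / 6.0 * (k1d + 2.0 * k2d + 2.0 * k3d + k4d)
        dp += h / 6.0 * (k1p + 2.0 * k2p + 2.0 * k3p + k4p)
        z += h
    return d

if __name__ == "__main__":
    for eta in (0.0, 0.5, 1.0):
        print(f"eta = {eta:3.1f}: D_A(z=2) = {zkdr_distance(2.0, 0.3, 0.7, eta):.5f} c/H0")
```

A fixed step is adequate for a demonstration because the coefficients are smooth for the models tried here; an adaptive or extrapolation method, as in the original implementation, would be the natural choice where efficiency matters or where Q approaches zero.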

TESTING THE APPROXIMATION
Unlike the Swiss-cheese model, the ZKDR distance is an approximation based on various assumptions. While it is reasonably clear that it must be correct in the appropriate limit (i.e. the light propagates very far from all clumps, the fraction of mass in clumps is negligible so that it is clear that an FRW model is a good approximation, etc.), it is not immediately clear how good the approximation is in a more realistic scenario. One way to test this is to compare the ZKDR distance to an explicit numerical calculation, namely following photon trajectories through a mass distribution produced by a cosmological simulation. Some of this work will be mentioned below in Sect. 28. Watanabe & Tomita (1990), building on work by Futamase & Sasaki (1989), solved directly the equations of null geodesics and explicitly calculated the shear. Only the Einstein-de Sitter model was considered, and the explicit calculations were compared to the ZKDR distance for η = 1 and η = 0. The former is the better fit for the average distance, but it was assumed that mass is transparent, so this result essentially follows from flux conservation (Weinberg 1976). Kasai et al. (1990) carried out a similar study, noting that, as expected, the distance-redshift relation depends on angular scale, with the standard (η = 1) distance appropriate for large angles and the ZKDR distance (in the limiting case, η = 0) for small angles, a conclusion also arrived at by Linder (1998). His numerical result was demonstrated analytically by Watanabe & Tomita (1991). Similar results were found by Giblin et al. (2016b), who used a much more realistic model of the mass distribution, based on state-of-the-art simulations ('the first numerical cosmological study that is fully relativistic, non-linear and without symmetry') (Giblin et al. 2016a;Mertens et al. 2016). They stressed the scatter in the distance for a given redshift, which generally increases with redshift and is also dependent on the line of sight. 
Nakamura (1997) numerically investigated the effect of shear on the angular-size distance in a linearly perturbed FRW model and found it to be negligible, thus justifying the ZKDR distance. (For the Einstein-de Sitter model, an analytic result was presented.) Okamura & Futamase (2009), while not setting out to test the ZKDR distance, found that a universe with the halo-mass function of Sheth & Tormen (1999) is, remarkably, well approximated by the ZKDR distance with the η parameter calculated from their model. Busti et al. (2013) compared the ZKDR distance to other approximations: the weak-lensing approximation with uncompensated density along the line of sight, the flux-averaging approximation, and a modified ZKDR distance which allows for a different expansion rate along the line of sight. This work is interesting for its analysis of the underlying issues (essentially assumptions about the mass distribution and how this affects light propagation, different approximations corresponding to different assumptions) and its combination of detailed theory and application to real data, namely the Union2.1 sample, also used by Helbig (2015a) and Yang et al. (2013).

WEINBERG (1976)
Weinberg (1976) pointed out that in a locally¹⁸ inhomogeneous universe in which gravitational deflection by individual clumps is taken into account, the conventional distance formulae remain valid on average as long as the clumps are sufficiently small, while for galactic-size clumps, this depends on the selection procedure and redshift of the source.

13.1. Summary
For a locally inhomogeneous universe, the average apparent luminosity (for the case λ 0 = 0, but this is true in general) is given by the conventional formula, e.g. that due to Mattig (1958), rather than the empty-beam formula, e.g. that investigated by Dyer & Roeder (1972). The reason is clear: the empty-beam formula 'leaves out the gravitational deflections caused by occasional close encounters with clumps near the line of sight'. Moreover, '[t]hese gravitational deflections produce a shear which on the average has the same effect in the optical scalar equation as would be produced in a homogeneous universe by the Ricci tensor term'.
The special case of q 0 ≪ 1 is considered, in which the average number of clumps close enough to the line of sight to produce an appreciable deflection is of order q 0 for z ≈ 1 (Press & Gunn 1973). Even in this case, where multiple deflections can be ignored, the standard formula is appropriate when considering the average distance. The decrease in the luminosity distance due to gravitational lensing cancels the increase due to the empty-beam formula.
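The cancellation in the single-deflection case can be illustrated with a toy calculation for one point-mass lens. The sketch below is an illustration brought in from standard lensing theory, not a reproduction of Weinberg's calculation: it assumes the usual total point-lens magnification μ(u) = (u² + 2)/(u sqrt(u² + 4)), measured relative to the same source seen with the lens removed (the role played here by the empty beam), with u the source offset in units of the Einstein radius. All function names are hypothetical.

```python
import math

def mu_point_lens(u):
    """Total magnification of a point lens (assumed standard formula), u > 0."""
    return (u * u + 2.0) / (u * math.sqrt(u * u + 4.0))

def excess_integrand(u):
    """(mu - 1) * u, arranged so that it stays finite as u -> 0."""
    return (u * u + 2.0) / math.sqrt(u * u + 4.0) - u

def excess_numeric(u_max, n=20000):
    """Simpson's rule for the integral of (mu - 1) u du from 0 to u_max (n even)."""
    h = u_max / n
    total = excess_integrand(0.0) + excess_integrand(u_max)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * excess_integrand(i * h)
    return total * h / 3.0

def excess_closed(u_max):
    """Closed form of the same integral; it tends to 1 as u_max grows."""
    return 0.5 * u_max * math.sqrt(u_max * u_max + 4.0) - 0.5 * u_max * u_max

def mean_magnification(u_max):
    """Mean magnification over a source-plane disc of radius u_max."""
    return 1.0 + 2.0 * excess_closed(u_max) / (u_max * u_max)

if __name__ == "__main__":
    for u_max in (5.0, 20.0, 50.0):
        print(u_max, excess_numeric(u_max), excess_closed(u_max),
              mean_magnification(u_max))
```

The excess integral converges to 1, so the mean magnification over a large disc approaches 1 + 2/U². To first order this is the extra focusing that the same mass would produce if it were spread uniformly over the disc, which is the sense in which rare strong amplifications compensate the many slightly de-amplified lines of sight.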
This result is generalized to models with arbitrary q 0 and transparent intergalactic matter via a simple argument: due to flux conservation, the conventional distance must hold, on average; not all lines of sight can be underdense, and occasional lines of sight with strong amplification due to gravitational lensing exactly balance the larger number of underdense lines of sight. However, this ignores the selection effect that there can be no opaque clump between the source and the observer. If the clumps are dark stars, the conventional distance formula is a very good approximation, but only marginally so for galaxy-size clumps. The important quantity is the radius of avoidance, which could lead to the empty-beam distance being more appropriate at low redshifts and the conventional formula at high redshifts.¹⁹ Details depend on selection effects: perhaps distant objects are observed (by accident or by design) on lines of sight which avoid clumps (and hence absorption); on the other hand, amplification bias might cause objects which have been gravitationally amplified to be observed preferentially.
The empty-beam distance is nevertheless useful since it gives a lower limit on the apparent luminosity (for a given absolute luminosity) at a given redshift. In general, there is a scatter in luminosity distance, comparable to the difference between the empty-beam and filled-beam formulae. Also, it is noted that the standard distance should be used to calculate the mean inverse-square luminosity distance, not the mean luminosity distance itself. Weinberg speculates that this might be part of the reason for the difference in apparent luminosity between quasars at the same redshift.
Weinberg (1976) is not concerned with developing the theory of the ZKDR distance; in fact, he doesn't go beyond DZ65. Rather, the emphasis is on understanding the validity of the approximation, its domain of applicability, and its use in a statistical context.

Discussion
This paper has been cited many times, perhaps because Weinberg is well known, but probably mainly because it is clear and to the point. Not until much later were more-detailed analyses presented.
14. OTHER PAPERS III
Wardle & Pottash (1977) discussed the effect of the ZKDR distance on the angular sizes of quasars, noting 'that the median angular size in fact decreased with redshift faster than expected in any Friedmann cosmology. This implied that there was a deficiency of sources of large linear size at high redshifts' [emphasis in the original].²⁰ A cosmological model with η < 1 could at least partially explain this. Wagoner (1977) discussed determining q 0 from the m-z relation for supernovae, noting in passing that the Dyer-Roeder distance can be used. While not dwelling on the question of distance calculation, the paper is one of the first to advocate determining cosmological parameters from the m-z relation for supernovae rather than for galaxies, deemed to be worth pursuing mainly because of the lack of knowledge about galaxy evolution. Ellis (1980) noted that the uncertainty in η needs to be considered when attempting to derive cosmological parameters from observations. Ellis would later return to this topic many times.

THE END OF AN ERA
The work by Weinberg (1976) marks a turning point, for two related reasons. First, the theory is now more or less complete; future work would be concerned with refinements. Second, the development of theory is now secondary to applications, at least in terms of numbers of papers. The three papers mentioned in Sect. 14 are in some sense obvious consequences of the theory as known when they were written; most future work would be more limited in scope but also more detailed. As such, it makes sense to switch from the mainly chronological discussion presented until now to a discussion based on topic. (Nevertheless, some chronology is retained: topics are presented in the order of their appearance, and the discussion of each topic is roughly chronological. The order of the topics is based not on the average age of the papers, but rather on the time of publication of the first one.) Though some build on somewhat earlier work (some of which has been mentioned above), most of these topics were investigated after the work of Weinberg (1976).
Before doing so, however, the influential work of Canizares (1982) deserves special mention. Building on the work of Press & Gunn (1973), who had concentrated on the production of multiple images by compact objects, he discussed other observational effects. As such, this work belongs more in the gravitational-lensing camp than in the light-propagation camp. It also appeared at a time which saw a rapid increase in the number of papers devoted to these two topics. Obviously, the discovery of the first gravitational-lens system by Walsh et al. (1979) played a role as far as gravitational lensing itself was concerned; but probably because gravitational lensing forces one to think about the degree of homogeneity between source and observer, many studies were done which looked at further applications of the ZKDR distance, and, somewhat later, refinements to and extensions of the basic theory were investigated.
The next 16 sections, discussing various applications of the ZKDR distance, are chronological with respect to the first paper discussed in each. These are followed by a section discussing analytic approximations; the final section is a summary.

FLUX CONSERVATION 1
Weinberg (1976) pointed out that the standard distance formula, i.e. assuming η = 1, must hold on average if lenses are transparent and there are no selection effects. This is due to flux conservation. Dyer & Roeder (1981a) considered the effect of a finite source size in gravitational lensing, concluding that, all else being equal, η increases with the size of the source. (The fact that almost all beams are underdense and hence the average magnification is less than 1 is offset by the occasional strong-lensing event.) The important quantity is not the size of the source per se, but rather the size of the source relative to the clumps; as already mentioned by Weinberg (1976), one could think of η increasing with redshift since, due to structure formation, matter was more uniform at high redshift. The fact that the angular size of the beam also increases with redshift (the base of the cone is at the source; the apex at the observer) is an additional effect in the same direction. This was made more explicit by Dyer & Roeder (1981b), who showed that, '[i]n the weak-field approximation, the net amplification resulting from small amplifications due to many small spherical deflectors bending light at their perimeters corresponds to the Ricci amplification where the source and observer are located well outside the lens'. Ehlers & Schneider (1986) question several assumptions regarding the derivation of the ZKDR distance. Subsequent work has shown these doubts to be misplaced; provided that the universe has a 'ZKDR-style' mass distribution, the ZKDR distance is appropriate. When calculating the probabilities of a source being lensed, however, they point out that a random line of sight is not an average line of sight. Rather, what is random is the position of a source on the celestial sphere. They conclude that lensing probabilities had thus been underestimated.
This conclusion was arrived at considering flux conservation for an ensemble of lenses; Hamana (1998) showed that it holds for individual beams also (see Sect. 22). The general idea when considering averages is that most lines of sight are underdense and this is offset by the occasional strong-lensing event.
In other words, the fact that the average amplification is 1 depends on the existence of an ensemble. On the other hand, a transparent lens neither creates nor absorbs photons. Avni & Shulami (1988) showed by an explicit calculation that this also holds for a single, isolated Schwarzschild gravitational lens; the usual amplification for small impact parameters is exactly compensated by de-amplification for large impact parameters.
Around the same time, Peacock (1986) noted that the solution given by Dyer & Roeder (1973) for arbitrary Ω 0 and η (but λ 0 = 0) is mathematically valid for η < 25/24, although η > 1 is unphysical, since this would imply that light propagates along a uniformly overdense tube. Nevertheless, this can be used as a rough model for gravitational lensing (see also Dyer & Roeder 1976). More importantly, Peacock (1986) generalized the result of Weinberg (1976) to arbitrary Ω 0 . (As far as I know, no-one has repeated this calculation for arbitrary λ 0 .) He also agrees with the conclusion of Ehlers & Schneider (1986) that a more exact treatment reveals that lensing probabilities had been underestimated, but points out that their final result is not very useful since any difference between it and previous estimates becomes significant only at large optical depth, where the single-lens approximation breaks down. (Nevertheless, it still holds that previous estimates had underestimated the lensing probability.) Fang & Wu (1989) pointed out that flux conservation can be used as a constraint when evaluating various approximations used in calculating the probability of lensing. Isaacson & Canizares (1989) compared the approach of Press & Gunn (1973) to that of Ehlers & Schneider (1986) in the Einstein-de Sitter model, finding that the former approach can be made to agree with the latter 'by adjusting the average magnification along a random line of sight so as to conserve flux'. Jaroszyński & Paczyński (1996) considered flux conservation within the context of microlensing (in which case η = 0 is appropriate, as long as any smooth mass distribution is ignored, since the lensing effect is taken into account explicitly in the microlensing calculation, as opposed to η ≈ 1 which would be appropriate if one considered the average effect for a source size larger than that of the lenses).
They pointed out that in addition to the redistribution of flux, there is another redistribution of energy because some observers see an additional redshift, some an additional blueshift.

KIBBLE & LIEU (2005)
Kibble & Lieu (2005) also contributed significantly to the understanding of flux conservation in the context of the ZKDR distance; so much so that they deserve their own section. They showed analytically that, under very general conditions (including arbitrary shapes of clumps and strong lensing), the average reciprocal magnification in a clumpy universe is the same as that in a homogeneous universe, as long as the clumps are uncorrelated. The reciprocal magnification has the advantage that it goes to zero rather than infinity on the caustics (regions of infinite magnification for a point source), and so is more useful in the strong-lensing case. They also discussed various measures of magnification and the circumstances in which they are appropriate.
An important distinction is whether one averages over a set of sources on the unperturbed celestial sphere, or whether one averages over all lines of sight: 'If one part of the sky is more magnified, ... the corresponding area of the constant-z surface will be smaller, so fewer sources are likely to be found there. In other words, choosing a source at random will give on average a smaller magnification or larger angular-size distance.' This is related to whether it is the mean magnification or the mean reciprocal magnification that is the same as in the homogeneous case. In the weak-lensing case, both are. In the strong-lensing case, it is the magnification which averages to 1 over the celestial sphere (the random-source average, the case implicitly considered by Weinberg 1976), however strong the lensing effects are, while it is the reciprocal magnification which averages to 1 over all lines of sight, again however strong the lensing effects are. As a corollary, the random-source average of the total magnification of unresolved images is the same as in the homogeneous case, while for resolved images it can be significantly different, essentially because there can be more than one image of a given source.
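The difference between the two averages can be made concrete with a toy calculation. The sketch below is not taken from Kibble & Lieu (2005); it assumes single images only (no strong lensing), a log-normal spread of magnifications chosen purely for illustration, and a sky that is covered exactly once, so that the total unlensed area equals the total observed area. The function names `averages` and `toy_sky` are hypothetical.

```python
import random


def averages(mu, omega):
    """Source- and direction-weighted averages of mu and 1/mu.

    mu[i]    : magnification of sky patch i (toy values, single images only)
    omega[i] : observed solid angle of patch i
    """
    # An observed patch of solid angle omega maps back to an unlensed
    # area omega/mu, so random sources are weighted by omega/mu.
    src = [w / m for w, m in zip(omega, mu)]
    tot_sky, tot_src = sum(omega), sum(src)
    return {
        "mu_over_sources": sum(s * m for s, m in zip(src, mu)) / tot_src,
        "inv_mu_over_sources": sum(s / m for s, m in zip(src, mu)) / tot_src,
        "mu_over_directions": sum(w * m for w, m in zip(omega, mu)) / tot_sky,
        "inv_mu_over_directions": sum(w / m for w, m in zip(omega, mu)) / tot_sky,
    }


def toy_sky(n=10000, seed=1):
    """Equal-area patches with illustrative log-normal magnifications."""
    rng = random.Random(seed)
    omega = [1.0] * n
    raw = [rng.lognormvariate(0.0, 0.3) for _ in range(n)]
    # Rescale so that sum(omega/mu) == sum(omega): the unlensed sphere
    # is covered exactly once (assumption of this sketch).
    scale = sum(w / r for w, r in zip(omega, raw)) / sum(omega)
    mu = [r * scale for r in raw]
    return mu, omega


mu, omega = toy_sky()
for name, value in averages(mu, omega).items():
    print(f"{name}: {value:.4f}")
```

In this toy sky the magnification averaged over random sources and the reciprocal magnification averaged over directions are both unity by construction, while the other two averages exceed 1, so a randomly chosen source sees on average a smaller magnification than a randomly chosen direction.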
Another distinction is between the angular-size distance and the so-called area distance (though both distances can be applied to both lengths and areas) as introduced by Ellis et al. (1998). If strong lensing is involved, i.e. multiple images (whether resolved or not) are present, then the magnification can be defined as negative for images of odd parity; sometimes, the angular-size distance itself is considered to be negative in such cases. (This is also the case for an object located at a coordinate distance χ between nπ and 2nπ, where n is an integer, because the rays defining the angle in the definition of the angular-size distance (see Sect. 2.2) cross between source and observer.) Such areas are counted negatively when calculating the average angular-size distance; if the absolute values are used, the corresponding distance is the area distance, which is thus always larger than the angular-size distance. The area distance is thus appropriate if one is interested in the total number of images within a given area of sky or their average magnification; the angular-size distance is appropriate if one is interested in the total number of distinct sources (say, when multiple images are not resolved) or their average magnification.
The work of Kibble & Lieu (2005) is also important because it is analytic (though some assumptions are made, which in practice are always fulfilled to a very good approximation: the surface of constant z is the same as the surface of constant affine parameter; shear vanishes when light is propagating far from all clumps; the clumps are widely separated, slowly moving, and randomly distributed). Their work confirms that of Weinberg (1976), which is based on energy conservation, when averaging over the celestial sphere (i.e. the source is random), and also considers the case of averaging over lines of sight.

FLUX CONSERVATION 2
Wang (2000) suggested that flux conservation justifies the use of the standard distance in the analysis of the m-z relation for Type Ia supernovae and performed such flux averaging by combining data in redshift bins, pointing out that this reduces systematic uncertainties from effects such as weak lensing, while Barber (2000) claimed that weak-lensing effects are about an order of magnitude larger than previously found (and hence probably need to be taken into account more explicitly). On the other hand, Wang (2005) found only marginal evidence for weak-lensing effects in the m-z relation for Type Ia supernovae.
Even if the mean magnification is 1, due to the skewness of the distribution, the median magnification is < 1. Clarkson et al. (2012) pointed out that most narrow-beam lines of sight are significantly underdense, even for beams as thick as 500 kpc. On the other hand, they also point out that this does not necessarily lead to an increase in apparent magnitude (i.e. dimming) if one drops the assumption that inhomogeneities can be modelled as perturbations on a uniformly expanding background, a point also emphasized by Bolejko & Ferreira (2012); see also Bagheri & Schwarz (2014).
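A small numerical illustration of the first statement: for a skewed distribution of magnifications with unit mean, the median lies below 1, and a non-linear function of the magnification, such as the distance ratio D/D_standard = µ^(−1/2), does not average to 1. The log-normal form and the width σ = 0.2 are assumptions made only for this sketch and are not taken from any of the works cited; the function name is hypothetical.

```python
import random
import statistics


def lognormal_magnifications(n, sigma, seed=0):
    """Draw n magnifications from a log-normal with unit mean (toy model)."""
    rng = random.Random(seed)
    # E[mu] = exp(mean_log + sigma**2 / 2), so this choice gives E[mu] = 1.
    mean_log = -0.5 * sigma**2
    return [rng.lognormvariate(mean_log, sigma) for _ in range(n)]


mu = lognormal_magnifications(n=200_000, sigma=0.2)

mean_mu = statistics.fmean(mu)      # flux is conserved on average
median_mu = statistics.median(mu)   # a typical line of sight is demagnified
# Distance inferred from flux scales as mu**-0.5, a convex function of mu,
# so its mean is biased upwards even though the mean of mu is 1.
mean_dist_ratio = statistics.fmean(m ** -0.5 for m in mu)

print(f"mean mu          = {mean_mu:.4f}")
print(f"median mu        = {median_mu:.4f}")
print(f"mean mu**(-1/2)  = {mean_dist_ratio:.4f}")
```

The size of the bias depends entirely on the assumed width of the distribution; the sketch is meant only to show the sign of the effect.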
Although the basic idea of flux conservation is clear (and there are obvious caveats such as non-transparent matter), exact treatments can be very complicated and have led to confusion, much of which has been cleared up by Kaiser & Peacock (2016): Weinberg (1976) is essentially right, though one needs to keep in mind the distinction between magnification and reciprocal magnification as discussed above in connection with Kibble & Lieu (2005). Since η ∼ κ, where κ is the convergence, and µ ∼ (1 − κ)^−1, the relation is linear only in the limit of vanishing deviations, though approximately linear for the small deviations considered here. 21 Non-linear functions of the conserved quantity µ must be handled with care. For example, the average angular-size distance D A , and hence the average luminosity distance D L , is biased even in the case of µ = 1. This can be considered one aspect of the averaging problem: we are interested in the average values of the cosmological parameters determined by observers throughout the universe, but can at best average observations over several lines of sight. See also Bonvin et al. (2015), who point out that the ensemble average and the directional average do not commute; 'observing the same thing in many directions over the sky is not the same thing as taking an ensemble average'; this is a restatement of the result of Kibble & Lieu (2005).
Rubin et al. (1973) noted a non-random distribution of radial velocities on the sky for a sample of galaxies, later known as the Rubin-Ford effect, and discussed various possible explanations, though none involving gravitational lensing in any form.
Karoji & Nottale (1976) confirmed the effect with two samples of galaxies chosen from the literature, discussed a number of possible causes, and tentatively concluded that 'light emitted by distant galaxies are [sic] redshifted when passing through clusters of galaxies or distant sources are more luminous when seen through intermidiate clusters of galaxies which could act as gravitational lenses'. Similar work was done by Nottale & Vigier (1977). Dyer & Roeder (1976) tried to explain the Karoji-Nottale effect via η > 1. On the one hand this is straightforward: η < 1 implies that there is less matter in the beam than for a random line of sight, so η > 1 would imply that there is more. On the other hand, this situation violates the assumptions under which the ZKDR distance is calculated, so the applicability is somewhat questionable. In any case, the conclusion was that accounting for the effect of gravitational lensing by clusters of galaxies in this manner cannot explain the Karoji-Nottale effect. Swiss-cheese models were also used to estimate the effect of inhomogeneities on the CMB (e.g. Dyer 1976;Nottale 1984), but this strays too far from the main topic of this article. Nottale (1982a), in the spirit of Kantowski (1969), developed a more complicated but exact-solution model; the question is then how realistic it is physically, rather than whether the approximations are valid. While the Swiss-cheese model of Kantowski (1969) had holes consisting of completely empty voids with the mass removed from the void concentrated at the centre, and the corresponding Schwarzschild volume considered opaque, Nottale (1982a) had a more realistic model where the mass removed from the hole forms a Friedmann model of higher density than that surrounding the hole; importantly, the matter at the centre of the hole is transparent. Between the two Friedmann solutions is a Schwarzschild solution. 
The main conclusion here is that there is a change in the observed redshift of objects seen through such a cluster. Nottale (1982b) examined the perturbation of the magnitude-redshift relation in that model, deriving an expression for the change in magnitude dependent on the cosmological model (H 0 , q 0 ), η, the cluster radius, the cluster redshift, and the source redshift; typical values of those parameters result in 'some tenths of magnitude'. Nottale (1983) studied this model with respect to 'the effects intrinsic to a cluster, i.e. the purely gravitational perturbations on redshift and magnitude (or equivalently diameter) for sources situated in a cluster, with respect to exterior sources' [emphasis in the original]. Nottale & Hammer (1984) investigated this in more detail, examining the amplification of light from distant sources by a transparent lens via an exact solution of the optical scalar equations (Sachs 1961). Nottale & Chauvineau (1986) used this formalism to calculate the global Ricci amplification by multiple gravitational lenses, noting that it usually differs significantly from the product of individual amplifications (an approximation valid only if all amplifications are small). Sato (1985) continued working with the Swiss-cheese paradigm, finding that the modification is third order in H r_b/c for redshift and first order for apparent luminosity, where r_b is the radius of a void (Swiss-cheese hole). Dyer & Oattes (1988) examined the dispersion of observational quantities such as magnitudes (related to the luminosity distance) in a Swiss-cheese model, emphasizing a fundamental limit 'due to the "fuzzy" structure of the perceived past null cone' and selection effects due to the skewness of the distribution of observational quantities (even though the means are the same as for FRW). Brouzakis et al. (2008) arrived at similar results, noting even 'inhomogeneities with sizes of order 10 Mpc or larger' cannot lead to 'dispersion and bias of cosmological parameters derived from the supernova data' large enough 'to explain the perceived acceleration without dark energy, even when the length scale of the inhomogeneities is comparable to the horizon distance'. Clifton & Zuntz (2009) investigated the effect of large-scale structure on the Hubble diagram via a Swiss-cheese model. Kostov (2010) examined flux conservation in the sense of averaging over all lines of sight in Swiss-cheese models, with exact, non-perturbative calculations including all non-linear effects.
Fleury et al. (2013b) suggested that the well known 'tension' between Planck and the m-z relation for Type Ia supernovae (see e.g. Conley et al. 2011, for Type Ia supernovae data) could be relieved if the calculations are done with a Swiss-cheese model. This is because the CMB data have a typical angular scale of 5 arcmin while the typical angular size of a supernova is 10^−7 arcsec. If the Swiss-cheese model is more appropriate, but a homogeneous model assumed, then one will underestimate Ω 0 . Note that at the distances used to determine H 0 , the effect of η < 1 is negligible (and would also go in the opposite direction: compared to η = 1, distances would be larger and hence the derived value of H 0 smaller 22 ). Rather, Fleury et al. (2013b) pointed out that a lower η has, to first order, the same effect as a lower value of Ω 0 (or a higher value of λ 0 ). 23 Thus, incorrectly assuming η = 1 leads to an underestimate of Ω 0 . If in fact η < 1, then the derived value of Ω 0 will be larger, while the value of H 0 changes only slightly. This reduces the tension between the values derived by Planck and the m-z relation for supernovae, though by changing the value of Ω 0 derived from the m-z relation.
While the m-z relation still prefers higher values of H 0 , there is no longer any serious discrepancy with the Planck results. This interesting result is a consequence of the very detailed Swiss-cheese calculations by Fleury et al. (2013a). Alas, as pointed out by Betoule et al. (2014), it appears that the low value of Ω 0 obtained by Conley et al. (2011) was due to a wrong calibration of the MegaCam zero points in the g and z bands and corrections to the MegaCam r and i filter bandpasses, thus the analysis by Fleury et al. (2013a) is in some sense no longer relevant (though one could turn it around and see the lack of tension in Ω 0 as evidence against such extreme Swiss-cheese models). Although interesting because they are exact solutions to the Einstein equation, Swiss-cheese models are today arguably mainly of historical interest. In particular, the redshift aspects should not be worrying, since they are merely one aspect of the integrated Sachs-Wolfe effect, which can be calculated for a CDM-like power spectrum, now known empirically to be a good approximation.

SWISS-CHEESE MODELS
Fleury (2014) demonstrated with completely analytic arguments the equivalence of the ZKDR distance and that calculated from a certain class of Swiss-cheese models at a well controlled level of approximation. This had been known for a long time based on comparisons of numerical results, but of course an analytic proof is very important. Since the Swiss-cheese models are exact solutions of the Einstein equations, this means that there can be no problem using the ZKDR distance, as long as one makes the reasonable assumption that the mass at the centre of a Swiss-cheese hole is effectively opaque and reasonable assumptions about the order of magnitude of the mass and compactness of the clumps. (Of course, as discussed in Sect. 27, even if there can be no debate that the ZKDR distance is appropriate if a universe has the corresponding mass distribution, it is another question whether our Universe does indeed have such a mass distribution, even approximately.) He also stressed that the Etherington reciprocity relation (Eq. (2)) holds for any spacetime in which the number of photons is conserved, a point which is sometimes misunderstood. The present work is concerned with the theory and applications of the ZKDR distance, assuming that it is correct. Fleury (2014) has written the definitive paper on the justification of the ZKDR distance; it and references therein should be consulted for those interested in details. Peel et al. (2014, 2015) examined the effects of inhomogeneities on distance measures in a Swiss-cheese model, concentrating on the distance modulus. Their model is more general because the holes are non-symmetric structures described by the Szekeres (1975) metric (in general inhomogeneous and anisotropic). This allows an exact description which includes non-trivial evolution of structure.
Interestingly, the standard deviation for dispersions ∆µ was found to be 0.004 ≤ σ_∆µ ≤ 0.008, smaller than the intrinsic dispersion of magnitudes of Type Ia supernovae. Lavinto & Räsänen (2015) examined the CMB as seen through random Swiss cheese. Usually, 'closed' holes had been examined, i.e. an overdense centre surrounded by an underdensity. Lavinto & Räsänen (2015) examined 'open' holes as well, i.e. an underdense void surrounded by a thin overdense shell. This is arguably a better model of our Universe, though of course still an approximation. The size of the holes corresponds to galaxy clusters. There is no statistically significant systematic shift in the angular-diameter distance, with a 95-per-cent upper limit of |∆D_A/D_A| < 10^−4, and larger values reported in the literature are shown to be due to selection effects.
Observed inhomogeneities in the CMB are caused by a combination of primordial inhomogeneities and the effects of inhomogeneities on light propagation. Since the relevant angular scales are much larger than those involved in the ZKDR distance, further discussion of CMB anisotropies is beyond the scope of the present work. Lavinto & Räsänen (2015), apart from presenting original results, also gave a good review of this topic and its connection to the ZKDR distance.

GRAVITATIONAL LENSING: TIME DELAYS
The basic observational quantities in a strong (e.g. multiple-image) gravitational lens system (angles, flux ratios) are dimensionless, except for the time delays between pairs of images (Refsdal 1964). This allows one to determine the Hubble constant from a measurement of the time delay, assuming a mass model for the lens. However, this is true only in the low-redshift limit; at higher redshift, the cosmological model plays a role (Refsdal 1966). The cosmological parameters Ω 0 and λ 0 are now known very well from cosmological tests other than gravitational-lensing time delays (e.g. Planck Collaboration 2014, 2016, 2019); one could thus assume them to be exactly known and use observations related to cosmological distances to determine η (e.g. Helbig 2015a). 24 Within the uncertainties as they were 35-40 years ago, for the angular-size distance, at low redshift the values of Ω 0 and λ 0 are more important, while η becomes more important at high redshift (e.g. figure 1 in KHS). Due to the different combination of angular-size distances, for lensing statistics the effect of η tends to cancel, while in the case of gravitational-lensing time delays the importance of η is enhanced even at lower redshift (e.g. Kayser & Refsdal 1983; Helbig 1997). Kayser & Refsdal (1983) illustrated this dramatically for several world models with λ 0 = 0, comparing the η = 1 and η = 0 cases. For the double quasar 0957+561 (Walsh et al. 1979), the cosmological correction factor (which gives the influence of the cosmological model compared to the limiting low-redshift case) was calculated for σ 0 values ranging from 0 to 2 (corresponding to 0 ≤ Ω 0 ≤ 4) with q 0 values of 1.0, 0.5, 0.0, and −1 (λ 0 = σ 0 − q 0 ). Helbig (1997) repeated the exercise for arbitrary combinations of λ 0 , Ω 0 , and η, again showing the importance of η, which has become even more important now that the values of λ 0 and Ω 0 are so well known.
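To see how the influence of η on the angular-size distance grows with redshift, one can integrate the distance equation numerically. The sketch below assumes the usual second-order form of the ZKDR (Dyer-Roeder) equation for a matter-plus-Λ model, Q D'' + (2Q/(1+z) + Q'/2) D' + (3/2) η Ω0 (1+z) D = 0 with Q(z) = Ω0 (1+z)^3 − (Ω0 + λ0 − 1)(1+z)^2 + λ0, D(0) = 0 and D'(0) = 1 in units of c/H0; this form is an assumption of the sketch and is not quoted from this section. The function name `zkdr_distance`, the fixed-step Runge-Kutta integrator, and the step count are illustrative choices, η is taken to be constant, and Q(z) is assumed to stay positive over the integration range.

```python
def zkdr_distance(z_max, omega0, lambda0, eta, steps=2000):
    """Angular-size distance from the observer to z_max, in units of c/H0."""

    def q(z):
        x = 1.0 + z
        return omega0 * x**3 - (omega0 + lambda0 - 1.0) * x**2 + lambda0

    def dq(z):
        x = 1.0 + z
        return 3.0 * omega0 * x**2 - 2.0 * (omega0 + lambda0 - 1.0) * x

    def rhs(z, d, v):
        # Solve Q D'' + (2Q/(1+z) + Q'/2) D' + 1.5 eta Omega0 (1+z) D = 0
        # for D'' and return (D', D'').
        x = 1.0 + z
        drag = (2.0 * q(z) / x + 0.5 * dq(z)) * v
        focus = 1.5 * eta * omega0 * x * d
        return v, -(drag + focus) / q(z)

    h = z_max / steps
    z, d, v = 0.0, 0.0, 1.0  # initial conditions D(0) = 0, D'(0) = 1
    for _ in range(steps):   # classical fourth-order Runge-Kutta steps
        k1d, k1v = rhs(z, d, v)
        k2d, k2v = rhs(z + h / 2, d + h / 2 * k1d, v + h / 2 * k1v)
        k3d, k3v = rhs(z + h / 2, d + h / 2 * k2d, v + h / 2 * k2v)
        k4d, k4v = rhs(z + h, d + h * k3d, v + h * k3v)
        d += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        z += h
    return d


# Ratio of empty-beam to filled-beam distance in the Einstein-de Sitter model.
for z in (0.1, 0.5, 1.0, 2.0):
    ratio = zkdr_distance(z, 1.0, 0.0, 0.0) / zkdr_distance(z, 1.0, 0.0, 1.0)
    print(f"z = {z}: D(eta=0) / D(eta=1) = {ratio:.4f}")
```

For the Einstein-de Sitter model the result can be checked against the closed forms D = 2[(1+z)^−1 − (1+z)^−3/2] for η = 1 and D = (2/5)[1 − (1+z)^−5/2] for η = 0.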
A somewhat more complicated model (not neglecting shear) was investigated by Alcock & Anderson (1985), for λ 0 = 0 (not stated but assumed) and Ω 0 values of 0 and 1, using two gravitational-lens systems as concrete examples. They stressed the fact that ignorance of the mass distribution along the line of sight makes it difficult to determine the Hubble constant by this method, but also that, once the Hubble constant is known via other means, this method could be used to learn something about the mass distribution. Similar results were obtained by Watanabe et al. (1993).
24 The data from these other tests cannot usefully constrain Ω 0 , λ 0 , and η simultaneously (Helbig 2015a), not even if one restricts the analysis to a flat universe; the same is true of similar tests involving the angular-size-redshift relation.
Usually one thinks of the possibility of determining H 0 or, if H 0 is known, other cosmological parameters from a measured time delay and mass model for the lens. Narayan (1991) pointed out that the measurement actually gives one the angular-size distance between observer and lens (which, if the redshift of the lens is known, is easily converted into the Hubble constant). Of course, this depends on η, but since lens redshifts are usually low, the effect of η is limited. Giovi & Amendola (2001) examined a more general quintessence model where, in addition to ordinary matter ('dust') there is a perfect fluid with equation of state p = (m/3 − 1)ρ with 0 ≤ m < 3. The case m = 0 corresponds to the cosmological constant while m = 3 corresponds to ordinary matter; m < 2 implies that the universe is accelerating (as long as the quintessence term dominates). However, only k = 0 models are considered. One might think that this is justified since the Universe does seem to be very close to being flat (e.g. Planck Collaboration 2014, 2016, 2019); however, such an interpretation usually assumes that m = 0. Nevertheless, all known analytic solutions within this framework are presented (except one which 'is so complicated that it is not worth reporting'). Other cases are calculated numerically. Including quintessence usually reduces the estimated value of H 0 compared to the standard m = 0 case. Marginalizing over Ω 0 and m for the time delays considered results in H 0 = 71 ± 6 and H 0 = 64 ± 4 km/s/Mpc for the cases η = 0 and η = 1, respectively.
Considering the facts that there is no evidence at all for values of m other than 0 (the cosmological constant) and 3 (dust), apart from radiation with m = 4 which, however, is important only in the early Universe, and that η = 1 is obviously not correct (at least in the strict sense), I find it somewhat disconcerting that there are a large number of papers investigating the possible effects of quintessence on the interpretation of cosmological observations compared to the number which discuss the influence of η.
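As a short worked check of the parametrization p = (m/3 − 1)ρ used in the discussion of Giovi & Amendola (2001), the special values of m follow from energy conservation and the acceleration equation. The sketch assumes a constant m and a fluid that does not exchange energy with the other components; the symbol w is introduced here only for convenience.

```latex
\begin{align}
p &= \left(\frac{m}{3} - 1\right)\rho \equiv w\rho ,
  \qquad w = \frac{m}{3} - 1 ,\\
\dot\rho &= -3\,\frac{\dot a}{a}\,(\rho + p) = -m\,\frac{\dot a}{a}\,\rho
  \quad\Longrightarrow\quad \rho \propto a^{-m} ,\\
\frac{\ddot a}{a} &\propto -(\rho + 3p) = -(m - 2)\,\rho > 0
  \quad\Longleftrightarrow\quad m < 2 .
\end{align}
```

Thus m = 0 gives a constant density (cosmological constant), m = 3 gives ρ ∝ a^−3 (dust), m = 4 gives ρ ∝ a^−4 (radiation), and a universe dominated by the fluid accelerates only for m < 2.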
While the idea is simple in principle (Refsdal 1964), in practice many details need to be taken into account when determining H 0 from gravitational-lens time delays (especially if the uncertainties should be small enough to be competitive with other methods), such as measuring the time delay itself and determining realistic uncertainties (e.g. Biggs & Browne 2018) and constructing a realistic mass model for the lens (e.g. Wong et al. 2016;Rusu et al. 2019). At this level of detail, characterizing the density along the line of sight by a single parameter η, or even η(z), is too coarse. Rather, one attempts to measure the mass distribution explicitly, by counting galaxies (e.g. Rusu et al. 2017) or using weak gravitational lensing (e.g. Tihhonova et al. 2018). Schneider (1984) showed that a general transparent mass distribution always leads to amplification of at least one image compared to the case of an η = 0 universe (i.e. compared to the case that the lens were absent, not compared to the case that its mass is smoothly distributed throughout the universe). Of course, this is not in contradiction with the result of Weinberg (1976) that there is no mean amplification compared to a homogeneous universe, a point also emphasized by Nottale & Hammer (1984, see Sect. 19) and Hammer (1985).

GRAVITATIONAL LENSING: AMPLIFICATION
Of course, all discussion of the ZKDR distance involves (negative) amplification, and in general all gravitational lensing involves amplification. Gravitational lensing has a huge literature which is beyond the scope of the present work. Therefore, I discuss here only those aspects of gravitational lensing which are directly related to the ZKDR distance, are interesting for other reasons, or in which I was personally involved. One example of the last is a study (Zackrisson et al. 2003) which demonstrated that various claims (Hawkins 1993, 1996, 1997; Hawkins & Taylor 1997) that most dark matter must be in compact objects of about a solar mass (because this is assumed to be responsible for most of the long-term optical variability of QSOs via microlensing) cannot be correct. In short, while arguments were presented that many of the observations are not only compatible with microlensing but also have no other obvious explanation, there are nevertheless other observations which contradict this hypothesis, in particular the distribution of amplifications.

GRAVITATIONAL LENSING: GENERAL
Alcock & Anderson (1986) qualitatively discussed the optical scalars (implying a model more complicated than the ZKDR distance) and the possibility of learning something about the distribution of mass in the universe from the distance measures derived from gravitational-lens systems. (Often the reverse is done: one has some model to calculate the distance as a function of redshift, and uses this as input for modelling the lens system.) Perhaps because in the case of gravitational lensing it is obvious that there are small-scale inhomogeneities which affect light rays (i.e. the gravitational lenses themselves), the ZKDR distance and similar topics were discussed earlier and more often than in other areas, even though their role there could be just as important. Lee & Paczyński (1990) investigated gravitational lensing by three-dimensional mass distributions, finding that 16 screens are a sufficiently good approximation. Their conclusion that 'the distribution of amplifications of single images is dominated by the convergence due to matter within the beam' and that '[t]he shear caused by matter outside the beam has no significant effect' (even in the case of strong lensing) increases one's confidence that the zero-shear ZKDR distance is a realistic approximation (at least in a universe with the corresponding mass distribution). Although their goal was not to test the ZKDR approximation, their work could be seen as an early comparison of the ZKDR distance with numerical simulations. Jaroszyński et al. (1990) numerically studied gravitational lensing in the Einstein-de Sitter model, also concluding that shear can be neglected but also that the filled-beam approximation (η = 1) appears to be justified, at least for strong lensing by galaxies or clusters of galaxies. However, 'the column density was averaged over a comoving area of approximately (1 h^−1 Mpc)^2', so this could be a self-fulfilling prophecy, together with the fact that they found no case of strong lensing.
Nevertheless, it does seem to be the fact that 'the large-scale structure of the universe as it is presently known does not produce multiple images with gravitational lensing on a scale larger than clusters of galaxies'. The same conclusion, namely that Weyl focussing can be neglected compared to Ricci focussing, was also found by Hamana (1999) to apply to a universe modelled as randomly distributed isothermal objects. It thus appears that the ZKDR distance, which is based on a very simple model, is also valid in more-realistic models, confirming a result of Nakamura (1997) based on solving the optical-scalar equation for light passing through linear inhomogeneities in CDM models.  and  derived the gravitational-lens equations in an 'on average' Friedmann universe, in particular one with the mass distribution (smooth component with clumps) used in the derivation of the ZKDR distance. This very detailed work is an analytic complement to the numerical investigations mentioned above regarding the effects of inhomogeneities on the propagation of light beams; in particular, necessary approximations are made clear, lending support to the idea that the ZKDR distance is an acceptable approximation.
Gravitational-lensing statistics (e.g. Turner et al. 1984; Fukugita et al. 1990, 1992; Falco et al. 1998; Kochanek 1993, 1996a; Kochanek et al. 1995; Helbig et al. 1999; Chae et al. 2002) is usually not concerned with η. Apart from the general neglect of η in observational cosmology, there are probably several reasons for this. First, such studies are usually concerned with all-sky surveys, so one might expect η to 'average out' to 1 (Weinberg 1976). Second, in the relevant combination of angular-size distances, the effect of η tends to cancel out (in contrast to the situation regarding time delays). Third, while selection effects are important in such analyses, selection effects due to the value of η are smaller than others. Fourth, any effect of η would, in practice, be degenerate with other effects. Covone et al. (2005) found that the expected number of gravitationally lensed quasars is a decreasing function of η; Castañeda & Valencia (2008) investigated strong lensing (by galaxy clusters) with η = η(z) as a means of taking structure formation into account. 25 Asada (1998), by contrast, assumed the validity of the ZKDR distance and used it to investigate how inhomogeneities affect observations of gravitational lenses, in particular bending angle, lensing statistics, and time delay. An interesting analytic result is that all three combinations of distances 26 involved in these phenomena are monotonic with respect to the clumpiness for all combinations of λ 0 , Ω 0 , and source and lens redshifts. The clumpiness decreases the bending angle and number of strong-lensing events and increases the time delay. (Of course, not all combinations are monotonic in η, but physically relevant ones are.) In the first two cases, decreasing η has the same effect as decreasing λ 0 . In other words, using a value of η which is too large (such as the common assumption η = 1) would lead one to underestimate the value of λ 0 . 27 (In the conclusions, this is confusingly stated as 'the use of the DR distance always leads to the overestimate of the cosmological constant' [emphasis in the original]; of course, it is not an overestimate but rather the correct estimate if the correct value of η for the ZKDR distance is used.) More detail was provided by Tomita et al. (1999).
25 Note that one expects η to increase with z for two reasons when the angular-size distance is concerned. First, structure formation implies that the universe is more homogeneous at higher redshift. Second, for a fixed angle at the observer, the physical size of the object observed increases with redshift (as long as the redshift is lower than that of the maximum in the angular-size distance), so one averages over a larger volume at higher redshift. Both effects exist for the luminosity distance as well.
26 The combinations are D_ds/D_s, D_d D_ds/D_s, and D_d D_s/D_ds, respectively. The subscripts refer to the deflector (lens) and source. In the case of only one subscript, it is the second, the first being understood to refer to the observer. This is probably the most common notation. Other schemes explicitly write the first subscript when it refers to the observer as well, use 'l' instead of 'd' to refer to the lens (deflector), use capital letters, or some combination of these. The same subscripts are used to refer to the corresponding redshifts, e.g. z_s, though sometimes z_d is used in the sense of a variable and z_l to refer to the redshift of an explicit gravitational lens.
At almost the same time (publication was one month later) and completely independently, Helbig (1998) investigated not the common gravitational-lensing topics mentioned above, but rather the correlation between image separation and source redshift, in a reply to the work of Park & Gott (1997) who had noted a negative correlation. Helbig (1998) showed that decreasing η has the same effect as decreasing K := λ 0 + Ω 0 − 1 (i.e. this effect is also monotonic in η); also, decreasing η reduces the differences between cosmological models characterized by λ 0 and Ω 0 . The strong negative correlation reported by Park & Gott (1997), though, seems to be based on an unclean data sample and also is not statistically significant.
It had been known for some time (e.g. Schneider et al. 1992; Ehlers & Schneider 1986) that gravitational-lensing magnification as calculated using the standard distance is smaller than that using the ZKDR distance by a factor of the square of the ratio of the corresponding distances, a result derived by averaging magnifications over a number of sources and making use of flux conservation. Hamana (1998) showed that it is actually true not just on average but for each individual ray bundle as well. Refsdal (1970) had studied numerically the propagation of light in an inhomogeneous universe (see Sect. 5). This technique was expanded by Schneider & Weiss (1988a,b). Pei (1993a,b) showed that, to a reasonable approximation, the effect of multiple lenses can be calculated by multiplying the individual amplifications. Tomita (1998) used N-body simulations with the CDM power spectrum in four cosmological models to investigate the behaviour of angular-diameter distances in inhomogeneous cosmological models, determining η for each pair of rays and investigating the mean and dispersion of η. Further studies along these lines (e.g. Premadi et al. 1998, 2001; Martel et al. 2002; Premadi et al. 2004, 2008) involving ray shooting

MONTE-CARLO SIMULATIONS
27 Note that this is opposite the effect in the m-z relation.
through N-body simulations with the explicit calculation of the paths of (bundles of) light rays, while interesting, are too far removed from the main topic of the present article for further discussion.
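The two magnification results mentioned above (the ratio between magnifications referred to the standard and ZKDR distances, and Pei's multiplicative approximation for several lenses) can be written compactly. This is only a sketch: the symbols are assumptions introduced here for illustration and are not taken from the cited papers. $D^{\rm std}$ and $D^{\rm ZKDR}$ denote the standard and ZKDR distances to the same source, $\mu^{\rm std}$ and $\mu^{\rm ZKDR}$ the magnifications calculated relative to each, and $\mu_i$ the amplification due to the $i$-th lens alone.

```latex
\mu^{\rm std} = \mu^{\rm ZKDR}\left(\frac{D^{\rm std}}{D^{\rm ZKDR}}\right)^{2},
\qquad
\mu_{\rm tot} \approx \prod_{i}\mu_{i}
```

Since $D^{\rm std} \le D^{\rm ZKDR}$ for $\eta \le 1$, the first relation reproduces the statement that the magnification referred to the standard distance is the smaller of the two.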

24. CLASSICAL COSMOLOGY: REDSHIFT-VOLUME RELATION
Omote & Yoshida (1990) examined the effect of statistical gravitational amplification on the cosmological redshift-volume test, in particular its influence on the derived value of Ω 0 , using the extreme η = 0 model to examine the data of Loh & Spillar (1986), concluding that their derived value of Ω 0 is smaller, i.e. η and Ω 0 are positively correlated. 28 Of course, there are much better data today, Loh & Spillar (1986) neglected galaxy evolution, and so on; nevertheless, this work demonstrates the effect of η on the redshift-volume test. Wu (1998) suggested that interest in the ZKDR distance had subsided after Weinberg (1976) had shown that flux conservation implies that, on average, there is no amplification. 29 He then points out that the fact that the luminosity distances in the homogeneous and inhomogeneous cases are the same on average does not mean that apparent magnitudes are the same in both cases. This is illustrated with a simple model. More important than the model are the conclusions: because most lines of sight are underdense, compensated by the occasional large amplification, the apparent magnitude is essentially a random variable; also, the value of q 0 obtained depends on the value of η assumed, or, vice versa, one could use the m-z relation to determine η if the cosmological parameters are known with some degree of certainty. Rose (2001) pointed out that the argument of Weinberg (1976) does not hold if the sphere centred on the observer is affected by the mass distribution, concluding that, in a perturbed FRW universe, 'more photons from a source at a given redshift' will be received than in an FRW universe, i.e. the sources are brighter. Somewhat confusingly, it is claimed that they 'therefore have a higher apparent magnitude', which is correct if 'higher' means 'brighter', but of course larger magnitudes correspond to fainter objects.
However, this is a second-order effect; to first order, small deviations from homogeneity do not change the average magnification (Claudel 2000).

RELATION
Although going somewhat beyond the simple approximation of the ZKDR distance, Watanabe (1992, 1993) investigated the effects of an inhomogeneous universe on another classic cosmological test, namely the magnitude-number relation (see e.g. Sandage 1995, for details), also checking the validity of the assumptions used by Omote & Yoshida (1990) (see Sect. 24). These sorts of cosmological tests have gone out of fashion, primarily because the uncertainty in the evolution of the sources is too large, leaving the m-z relation for Type Ia supernovae, baryon acoustic oscillations (BAO), and the CMB as the most useful cosmological tests. It is not yet possible to calculate galaxy evolution from first principles, and observations of it have to be interpreted within the context of an assumed cosmological model, so now such classic tests are useful mainly as consistency checks.
28 Note that in the simpler case of the m-z relation, η and Ω 0 are negatively correlated. This is easy to understand, since both a higher value of η and a higher value of Ω 0 mean that more matter is in the beam. In the redshift-volume test, both the apparent magnitude and the volume (which is independent of η) are involved, the luminosity function plays a role, etc., making the test much more complicated (e.g. Sandage 1995). Also, all mass was assumed to be in point masses with regard to the gravitational-lens effect. Yoshida & Omote (1992) performed a similar study using the model of a spherical opaque lens, arriving at similar conclusions.
29 This is not my impression. There was a slow trickle of papers up until about 1982, after which the number per year increased each year. This appears to be mainly data-driven, with a large increase after the measurement of the m-z relation for Type Ia supernovae.

RELATION
One of the most important advances in observational cosmology has been the application of the m-z relation to Type Ia supernovae. 30 In an influential paper, Colgate (1979) had suggested using the Hubble Space Telescope for that purpose. Goobar & Perlmutter (1995) discussed the feasibility of such a programme, and were later involved in the Supernova Cosmology Project, which reported measurements of λ 0 and Ω 0 based on 42 supernovae (Perlmutter et al. 1999; Knop et al. 2003), a result confirmed and published slightly earlier by the High-z Supernova Search team (Riess et al. 1998; Schmidt et al. 1998). While there had been hints, based on joint constraints from several cosmological tests, not only that the cosmological constant is positive but also that it has such a value that the Universe is currently accelerating (Ostriker & Steinhardt 1995; Krauss & Turner 1995), the m-z relation for Type Ia supernovae was the first cosmological test which, by itself, confirmed such a value for λ 0 . (Contrary to some claims, this test does not 'directly' measure acceleration in any meaningful sense, even if one does not adopt the extreme view that all that is ever 'really' measured in observational astronomy, whether in imaging or in spectroscopy, are photon counts as a function of position on a detector.) Perlmutter et al. (1999) also checked for the influence of η, using the Fortran code of KHS to compare the standard distance to that of two other models, one with η = 0 and the other with η = η(Ω 0 ), the latter based on the idea that all matter is in clumps for Ω 0 ≤ 0.25 and for Ω 0 ≥ 0.25 the fraction 0.25/Ω 0 is in clumps, thus η = 0 for Ω 0 ≤ 0.25, otherwise η = 1 − 0.25/Ω 0 . Their conclusion, based of course on their data at the time, is that significant differences occur only for models ruled out by other arguments, i.e. Ω 0 > 1.
Kantowski et al. (1995), still using the soon-to-be-obsolete q 0 -notation, had pointed out that η should be taken into account when discussing the m-z relation for Type Ia supernovae. They also presented an analytic solution for λ 0 = 0 but arbitrary Ω 0 and q 0 , and introduced the parameter ν: [...] due to the fact that there are analytic solutions for certain integer values of ν. Frieman (1996) disputed the importance of the effect, arguing that the Swiss-cheese model is not a valid model for the distribution of mass in the Universe, and that the uncertainty due to η would be smaller; Kantowski et al. (1995) disagree. Frieman (1996) emphasized the dispersion in the apparent magnitude of supernovae caused by a given mass distribution, rather than considering a range of η. A similar approach, with the aim of determining the density of compact objects, the properties of galaxy haloes, or estimating the uncertainty in the measurement of λ 0 and Ω 0 , was taken up by many authors (e.g. Holz 1998; Seljak & Holz 1999; Metcalf & Silk 1999; Valageas 2000; Mörtsell et al. 2001; Minty et al. 2002; Amanullah et al. 2003; Payne & Birkinshaw 2004; Metcalf & Silk 2007; Dodelson & Vallinotto 2006; Yoo et al. 2008; Jönsson et al. 2010; Ben-Dayan & Takahashi 2016; Zumalacárregui & Seljak 2018). Rather than calculating the dispersion, one could also attempt to measure it indirectly, since the same matter fluctuations would cause weak lensing. However, the shear maps smoothed on arcminute scales are not of much use since an appreciable fraction of the lensing dispersion derives from sub-arcminute scales (Dalal et al. 2003). Another approach is to estimate the amplification from the matter visible along the line of sight; Jönsson et al. (2006, 2007, 2008) and Smith et al. (2014), building on ideas by Gunnarsson et al. (2006), found a tentative detection, i.e. a correlation between the computed and observed amplification (difference between the observed flux and that expected from the redshift in the concordance model). One can also turn this around, and use the observed matter distribution to estimate the amplification due to lensing and thus correct the observed flux (Jönsson et al. 2009).
30 The m-z relation for Type Ia supernovae has spawned an extensive literature; in this review, I mention only those aspects of it directly concerned with the ZKDR distance. Many good reviews are available (Riess 2000; Leibundgut 2001; Schmidt 2002; Perlmutter & Schmidt 2003; Filippenko 2005; Leibundgut 2008).
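The Ω 0 -dependent clumping model that Perlmutter et al. (1999) compared with the standard distance is a simple piecewise function, and it may help to see it written out. The following is a minimal sketch; the function name and the `clumped_density` parameter are hypothetical names chosen here, with the value 0.25 taken from the description in the text.

```python
def eta_of_omega0(omega0, clumped_density=0.25):
    """Smooth-matter fraction eta in the Omega_0-dependent clumping model.

    Assumption (from the description in the text): a density parameter of
    `clumped_density` is locked up in clumps; whatever exceeds it is smooth.
    """
    if omega0 < 0.0:
        raise ValueError("omega0 must be non-negative")
    if omega0 <= clumped_density:
        # All matter is in clumps: empty-beam case.
        return 0.0
    # The fraction clumped_density / omega0 is in clumps, the rest is smooth.
    return 1.0 - clumped_density / omega0


if __name__ == "__main__":
    for omega0 in (0.1, 0.25, 0.3, 0.5, 1.0, 2.0):
        print(f"Omega_0 = {omega0:4.2f}  ->  eta = {eta_of_omega0(omega0):.3f}")
```

In this model η rises monotonically with Ω 0 and approaches 1 only for large Ω 0 , which is consistent with the remark that the differences become significant only for high-density models.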
Iwata & Yoo (2015) took a somewhat different approach, assuming a flat universe and taking Ω 0 from CMB measurements, then calculating η(z) such that the cosmological parameters from the m-z relation for Type Ia supernovae agree; this was done for four different scenarios. This is complementary to the work of Helbig (2015a) (next paragraph) who, at almost exactly the same time, considered only constant η but for arbitrary FRW models, determining the value of η such that the m-z relation for Type Ia supernovae results in the same values for λ 0 and Ω 0 as those derived from the CMB.
Helbig (2015a) investigated the influence of η, noting that more and higher-redshift data had become available. While the data were not good enough to determine λ 0 , Ω 0 , and η simultaneously 31 , the constraints in the λ 0 -Ω 0 plane depend strongly on η. Only by assuming η ≈ 1 does one recover the concordance-cosmology values of λ 0 ≈ 0.7 and Ω 0 ≈ 0.3. Since these values are now known to high precision independently of the m-z relation for Type Ia supernovae (e.g. Planck Collaboration 2014, 2016, 2019), one can use the m-z relation for Type Ia supernovae to measure η. The result η ≈ 1 agrees well with other tests to determine η from observations. (While no useful constraints are possible, the global maximum likelihood in the λ 0 -Ω 0 -η cube also indicates a high value of η.) Unknown to me at the time, very similar results, based on the same data, were obtained by Yang et al. (2013), Bréton & Montiel (2013), and, somewhat later, Li et al. (2015) (the latter two restricted to a flat universe). While perhaps not surprising, it is of course important in science for results to be confirmed by others working independently. Although they investigated a wider range of models, when restricted to standard FRW models, the results of Dhawan et al. (2018) are also consistent.
Since the observations indicate that η ≈ 1, one can ask whether this is true 'on average' as discussed by Weinberg (1976), or whether each line of sight indicates η ≈ 1. In the former case, one would expect a dispersion in the distance at high redshift. Indeed, the scatter does increase with redshift, but so do the observational uncertainties. Since their quotient is independent of redshift, this indicates that each line of sight indicates η ≈ 1, in other words that all lines of sight fairly sample the mass distribution of the Universe 32 (Helbig 2015b). Note that Holz & Linder (2005) find a scatter (calculated theoretically) approximated by a Gaussian with standard deviation σ eff = 0.088z (in flux) or σ eff,m = 0.093z (in magnitudes). However, as discussed by Helbig (2015b), the observed increase in scatter with redshift seems primarily due to observational uncertainties in addition to the theoretically calculated scatter sometimes incorporated into those uncertainties.
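The linear fits quoted above from Holz & Linder (2005) are easy to evaluate. The sketch below simply encodes the two slopes given in the text; the function names are hypothetical, and the last function additionally assumes that the lensing scatter of different sources is statistically independent, so that it averages down as the square root of the number of sources. That assumption is made here for illustration and is not a statement taken from the cited work.

```python
import math

# Slopes of the linear fits quoted in the text (Holz & Linder 2005).
FLUX_SLOPE = 0.088  # fractional flux scatter per unit redshift
MAG_SLOPE = 0.093   # magnitude scatter per unit redshift


def lensing_scatter_flux(z):
    """Approximate lensing-induced scatter in flux at redshift z."""
    return FLUX_SLOPE * z


def lensing_scatter_mag(z):
    """Approximate lensing-induced scatter in magnitudes at redshift z."""
    return MAG_SLOPE * z


def averaged_scatter_mag(z, n_sources):
    """Scatter of the mean magnitude of n_sources sources at redshift z.

    Assumes independent lines of sight (an assumption of this sketch).
    """
    if n_sources < 1:
        raise ValueError("n_sources must be at least 1")
    return lensing_scatter_mag(z) / math.sqrt(n_sources)


if __name__ == "__main__":
    for z in (0.5, 1.0, 1.5):
        print(z, lensing_scatter_flux(z), lensing_scatter_mag(z),
              averaged_scatter_mag(z, 100))
```

Comparing such a theoretical scatter with the quoted observational uncertainty at the same redshift is one way to judge which of the two dominates.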

MORE-DETAILED MODELS
Holz & Wald (1998) developed a generalization of the Swiss-cheese approximation by including all mass explicitly (thus there is no smoothed-out 'cheese' component), requiring the mass within a given spherical region (corresponding to a hole in the Swiss-cheese approach) to be equal to that of the background FRW model only on average, and dropping the requirement of spherical symmetry. In addition, rather than having a fixed mass distribution and calculating the trajectories of photons within it, the mass distribution along a given trajectory is calculated on the fly. Also, no opaque-radius cutoff is imposed. Such a model is clearly more realistic than that of Zel'dovich or a Swiss-cheese model, and leads to a distribution of apparent luminosities at a given redshift. In principle, the shape of such a distribution can be used to determine both the background FRW model and the fraction of matter in compact objects. While there are a few highly amplified sources (which, due to flux conservation, there must be, in order to compensate for the fact that most sources are de-amplified), most of the distribution can be thought of as η varying with position on the sky. As expected, if thought of in terms of η, η increases with redshift, as the higher the redshift, the more likely it is that a typical trajectory crosses a fair sample of the universe.
Bergström et al. (2000) generalized the method of Holz & Wald (1998) by allowing for different types of fluids, possibly with non-vanishing pressure, instead of just dust, and by considering the NFW profile (Navarro et al. 1997) in addition to point masses and singular isothermal spheres as lenses (see also [...]). Also, multiple imaging is taken into account. This is thus an even more complicated, and hence more realistic, model of the universe. As a consistency check, their results for empty cells and cells with a homogeneous dust component were compared with results obtained from the code of KHS for η = 0 and η = 1, respectively. For a variety of cosmological models, the discrepancy was less than 1 per cent up to z = 10. This is a further justification that the ZKDR distance is an excellent approximation provided that the mass in the universe is distributed according to the assumptions underlying the ZKDR distance. They also found analytic approximations which are very good representations of various observable quantities, such as magnification distributions.
Mörtsell (2002) used essentially the same scheme to investigate the relation between η and the fraction of compact objects. By definition, 1 − η is the fraction of compact objects f c in the pure ZKDR case, i.e. only de-amplification due to underdensity and no amplification due to gravitational lensing. As expected, taking lensing into account results in 1 − η < f c . Interestingly, for a variety of cosmological models ((Ω 0 , λ 0 ) = (0.3, 0.6), (0.2, 0.0), (1.0, 0.0)), for redshifts between 0 and 3, and for various models of the mass distribution (homogeneous and point masses, NFW profiles and point masses), the relation is approximated very well by 1 − η ≈ 0.6f c .
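The empirical relation 1 − η ≈ 0.6f c found by Mörtsell (2002) can be used in either direction. The sketch below is only an illustration of that linear relation; the function names are hypothetical, and the slope 0.6 is the approximate value quoted in the text, valid (as stated there) for redshifts between 0 and 3 and for the models listed.

```python
def effective_eta(compact_fraction, slope=0.6):
    """Effective eta for a given fraction f_c of matter in compact objects,
    using the approximate relation 1 - eta = slope * f_c."""
    if not 0.0 <= compact_fraction <= 1.0:
        raise ValueError("compact_fraction must lie between 0 and 1")
    return 1.0 - slope * compact_fraction


def compact_fraction_from_eta(eta, slope=0.6):
    """Invert the relation: the f_c that corresponds to an effective eta."""
    return (1.0 - eta) / slope


if __name__ == "__main__":
    for f_c in (0.0, 0.25, 0.5, 1.0):
        print(f"f_c = {f_c:4.2f}  ->  effective eta = {effective_eta(f_c):.2f}")
```

Note that, in this approximation, even a universe in which all matter is in compact objects corresponds to an effective η of about 0.4 rather than 0, which is the quantitative content of the statement 1 − η < f c .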
Some authors have claimed that a universe with large-scale inhomogeneities could appear as if it has a positive cosmological constant when in fact it doesn't, either because the m-z relation mimics that of an accelerating model (e.g. Alnes et al. 2006; Garfinkel 2006) and/or because the inhomogeneities produce accelerations without a cosmological constant (e.g. Kai et al. 2007). However, Vanderveld et al. (2006) present evidence against these claims. Also, while in principle one can reproduce an arbitrary m-z relation with an ad hoc mass distribution, there are two arguments against this, other than the fact that it is ad hoc (or, equivalently, that of all possible m-z relations which could be produced, it just so happens that one is produced which is not only explicable with 1920s cosmology, but also where the derived parameters agree with those determined by other means): there is no believable route to explaining the CMB observations, and we are required to be at or near the centre of a large and approximately spherical region. Those topics go beyond the scope of this article, so I don't discuss them further here. However, it has even been claimed that this is possible in a Swiss-cheese universe (e.g. Marra et al. 2007, 2008). Vanderveld et al. (2008) showed, however, that this is not the case if the voids have a random distribution.
Flanagan et al. (2012) used a variant of the method of Holz & Wald (1998) to calculate the distribution of magnitude shifts, but using a simplified Swiss-cheese model for the mass distribution. Flanagan et al. (2013) extended this with a more refined Swiss-cheese model: the mass removed to make the voids is distributed on shells surrounding the holes in the form of randomly located NFW haloes and in the interior of the holes (either smoothly distributed or as randomly located haloes). Hada & Futamase (2014) carried out a similar exercise, concentrating on the difference between the magnitude-redshift relation in a homogeneous universe and that in an inhomogeneous universe (with a mass distribution given by the non-linear matter power spectrum), as well as its dispersion, taking into account the blocking effect by collapsed objects and examining the resulting uncertainty in Ω 0 (≈ 0.4) and the equation of state w (≈ 0.04), all in a flat universe.
The work by Giblin et al. (2016a,b) and Mertens et al. (2016) has been mentioned above in Sect. 12; a similar approach was adopted by Bentivegna & Bruni (2016). Detailed discussion of such work is of course beyond the scope of this review, which concentrates on the use of the ZKDR distance as opposed to the standard distance when calculating distance from redshift for a given cosmological model. Nevertheless, for present purposes such works are interesting because they allow for comparison between the ZKDR distance and much more realistic simulated matter distributions, making it possible to see how well the ZKDR ansatz approximates reality. However, such simulations are still not entirely free of approximations: those above are fully relativistic but use the fluid approximation, while a different approach was adopted by Adamek et al. (2016), which does not rely on the fluid approximation, but on the other hand is based on a weak-field expansion of GR. Which approach is better of course depends on what one wants to study. It is perhaps surprising that a simple equation such as Eq. (1) agrees so well with results from numerical ray tracing through ΛCDM simulations, at least if one allows the additional freedom of η(z) and a certain stochastic element depending on the individual line of sight (η(α, δ)). Somewhat similarly, the FRW metric was originally a simplifying assumption, made in order that at least some results could be obtained with the limited methods of calculation available at the time. Now, however, it is an observational fact, as demonstrated by observations of the CMB and the large-scale structure of the Universe, that our Universe is in fact very close to an FRW model (Green & Wald 2014).

WEAK GRAVITATIONAL LENSING
Weak gravitational lensing is normally defined as gravitational lensing without multiple images. If the source can be resolved, then information can be gleaned from the distortion of the image. In such a case, however, if the source is at a cosmological distance, η ≈ 1 (because the distance implies a large physical extent near the source, averaging over the matter distribution, and because it appears that, at large redshift, distances behave as if η ≈ 1, as noted in Sect. 27). Relevant for the ZKDR distance with respect to weak lensing is thus weak lensing of point sources. 33 Some aspects of this are discussed above in Sect. 21. This section is concerned particularly with weak lensing of standard candles. Wang (1999) pointed out that weak lensing leads to a non-Gaussian magnification distribution of standard candles at a given redshift, due to the fact that η can vary with direction. One can thus think of our Universe as a mosaic of cones centred on the observer, each with a different value of η, where there is a unique mapping between η and the magnification of a source. Of course, since the ZKDR distance depends on Ω 0 and λ 0 as well as η, different cosmological models can lead to very different magnification distributions for the same matter distribution. 34 Wang (1999) derived an approximation for the ZKDR distance (see Sect. 32), and also treated η as a function of position on the sky, i.e. different lines of sight can have different values of η. This effective value of η depends not only on the amount of matter in the beam, but also on how it is distributed (though only the total amount in the beam is considered-the possibility that a significant fraction could be in point masses is not taken into account). An approximation to matter distribution at a given redshift is found via comparison with the results of Wambsganss et al. (1997), who used Ω 0 = 0.4 and λ 0 = 0.6. 
She then calculated the distribution of η as well as the magnification distribution for standard candles, both for the same three different redshifts 0.5, 2, and 5. Also, for the same matter distribution, the probability of magnification was calculated for the same three redshifts and three different cosmological models: (Ω 0 ,λ 0 ) = (1,0), (0.2,0), and (0.2,0.8). Wang et al. (2002) extended this idea to a universal probability-distribution function for the reduced convergence which can be directly computed from Ω 0 and λ 0 , well approximated by a three-parameter stretched Gaussian distribution, where the three parameters depend only on the variance of the reduced convergence; in other words, all possible weak-lensing probability distributions can be well approximated by a one-parameter family, which was normalized via the simulations of Wambsganss et al. (1997). The reduced convergence is the same as the direction-dependent η used by Wang (1999). Fitting formulae were presented for three fiducial cosmological models: (Ω 0 , λ 0 , h, σ 8 ) = (1.0, 0.0, 0.5, 0.6), (0.3, 0.7, 0.7, 0.9), (0.3, 0.0, 0.7, 0.85). Williams & Song (2004) took the opposite approach: assuming that the standard distance (η = 1) is correct, they found that bright SNe are preferentially found behind regions (5-15 arcmin in radius) that are overdense in the foreground due to z ≈ 0.1 galaxies, the difference between brightest and faintest being about 0.3-0.4 mag. (In other words, the fact that bright supernovae are preferentially found behind overdense regions indicates that the standard distance is incorrect.) The effect, significant at > 99 per cent, depends on the amount and distribution of matter along the line of sight to the sources but not on the details of the galaxy-biasing scheme.
In a very detailed work, Kainulainen & Marra (2009) studied the effects of weak gravitational lensing caused by a stochastic distribution of dark-matter haloes, restricted to flat FRW models and examining those with Ω 0 = 0.28 (close to the current concordance model) and Ω 0 = 1 (the Einstein-de Sitter model) as representative examples. In particular, they calculated the difference between the distance in their model and the ZKDR distance for η = 0.5 and η = 0 for these two models, finding a maximum relative error of only 0.06 for the extreme case of the empty-beam Einstein-de Sitter model at z = 1.6 (the upper limit of their redshift range). This is yet another example of the proof of the validity of the assumptions underlying the ZKDR distance. Their main goal was to compute the probability-distribution function and the most likely value of the lens convergence along arbitrary photon geodesics as a function of their model parameters.
34 Note that her claim that Perlmutter et al. (1999) 'assumed a smooth universe' is somewhat misleading. While they did not consider a direction-dependent η, they did compare the extreme cases of η = 1 and η = 0 as well as the case of an Ω 0 -dependent η (i.e. galaxies assigned to clumps and the rest of the matter distributed smoothly, which implies an increase in η with increasing Ω 0 ), in all cases using the code of KHS.

CLASSICAL COSMOLOGY: GENERAL
In an interesting but somewhat confusingly written paper, Yu et al. (2011) use the m-z relation and the angular-size-redshift relation (based on data from the literature) to determine Ω 0 and η in flat cosmological models (and the equation-of-state parameter w, confusingly referred to as ω, and η for flat models with Ω 0 = 0.28). Of course, H is in general a function of z, but this is not something which is measured directly. 35 Rather, H(z) is calculated from the magnitude or angular size. Although not stated, presumably the reason is to be able to fit to both data sets simultaneously. Their results (1-σ uncertainties) η = 0.80 +0.19/−0.2 (with no prior on Ω 0 ) and η = 0.93 +0.07/−0.19 (Ω 0 = 0.26 ± 0.1) can be compared to η = 0.75 +0.15/−0.15 (λ 0 = 0.72 and Ω 0 = 0.28, i.e. the concordance-model values) obtained by Helbig (2015a) using only the m-z relation for Type Ia supernovae (see Sect. 27). Although not directly comparable, and keeping in mind that one would expect the supernova data to indicate a lower value of η due to the smaller beam size, the general trend is clear: observational data indicate a relatively high value of η. There are two possible explanations. First, this could be the averaging mentioned by Weinberg (1976), skewed to slightly lower values because of selection effects. Second, the physical model on which the ZKDR distance is based is wrong, but in our Universe the m-z relation is similar to that in a high-η ZKDR universe (Peel et al. 2014; Helbig 2015b) (see Sect. 27).
Busti & Santos (2011) pointed out that the procedure used by Yu et al. (2011) to calculate H(z) is not consistent, because the equation relating H(z) and the angular-size distance is valid only for η = 1. [...] had done a similar analysis to that of Yu et al. (2011) using supernova data, concluding that η > 0.42 (2σ). Adding the H(z) data used by Yu et al. (2011) of course improves the constraints, resulting in 0.66 ≤ η ≤ 1.0 (2σ) with the best fit at η = 1, a broadly similar result. Note that Helbig (2015a) also finds the best-fit value η = 1 if λ 0 and/or Ω 0 are constrained. Thus, while Yu et al. (2011) did indeed make a mistake, the fact that η ≈ 1 means that it didn't appreciably affect their main result.
35 There is a range of directness in measurement. At one level, all that is ever measured in astronomy is number of photons as a function of position on a detector, which can be related to apparent magnitude for a conventional exposure or as the intensity of a spectrum in the case of spectroscopy. Everything else is interpretation. Nevertheless, it makes sense to say that one can directly measure redshift, magnitude, and angular size, and, one step less concrete, that one can measure λ 0 and Ω 0 via the derived parameters (assuming some framework, such as FRW). Despite some claims to the contrary, no cosmological test directly measures acceleration; this is calculated from the cosmological parameters obtained. Similarly, H(z) is a calculated quantity.
While there is no evidence that our Universe is not well described by an FRW model, it is important to test for deviations from this assumption. One possibility is to test the Copernican Principle by looking for a redshift dependence of the curvature parameter [...]; another is to express Ω 0 in terms of observable quantities, resulting in an expression which must hold at all redshifts (Sahni et al. 2008; Zunkel & Clarkson 2008). [...] pointed out that these tests implicitly assume that the universe is homogeneous and isotropic on all scales (in other words, the 'RW' is assumed; the idea is to test the 'F' part of FRW), and showed that using the ZKDR distance leads to false positives for these tests (i.e. the Copernican Principle appears to be violated when in fact it is not). [...] also rewrite the ZKDR equation so that η is given as a function of observable quantities, allowing one to reconstruct η(z) from observations for a general ΛCDM model. Such an η(z) can also mimic the behaviour of a model with η = 1 but with w ≠ −1, i.e. some form of dark energy other than a cosmological constant.
Inhomogeneous cosmological models definitely affect light propagation. Whether they affect the expansion rate of the universe is still debated. Green & Wald (2014) claimed that there is no evidence that an FRW model is not a good description of the Universe on essentially all scales (except perhaps the extremely small scales encountered in, for example, the m-z relation for Type Ia supernovae).

CLASSICAL COSMOLOGY: ANGULAR DIAMETERS
One of the basic cosmological tests is the 'standard rod' test, i.e. the comparison of the angular size as a function of redshift of an object of given size to the theoretical expectation derived from the angular-size-redshift relation, which in turn depends on the cosmological parameters. (By the same token, the calculation of the physical size from the observed angular size depends on the cosmological model, and on η.) Although a classic test, no useful constraints have been derived from it, primarily because of the difficulty in finding a standard rod; the exceptions are the CMB and BAO, though here the corresponding physical lengths are so large that the ZKDR distance plays no role (e.g. Lewis & Challinor 2006). Nevertheless, some progress can be made. For example, Alcaniz et al. (2004), assuming a Gaussian prior Ω 0 = 0.35 ± 0.07 in a flat universe, found the best fit at Ω 0 = 0.35 and η = 0.8 (consistent with the results mentioned in Sect. 27).
Araújo & Stoeger (2009) point out the interesting, long-known, but generally unappreciated fact that, for a flat universe, the redshift at which the maximum of the angular-size distance occurs is a direct measure of λ 0 , independently of H 0 . For a non-flat universe, knowledge of the redshift of the maximum and H 0 allows one to determine both Ω 0 and λ 0 . Note, however, that this depends on the assumption that η = 1.
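The statement that, in a flat universe with η = 1, the redshift of the maximum of the angular-size distance fixes λ 0 independently of H 0 can be checked numerically. The following is a minimal sketch under those assumptions (flat FRW model containing only dust and a cosmological constant, η = 1); the function names, the grid step, and the upper redshift limit are choices made here for illustration. Distances are in units of c/H 0 , which is why H 0 drops out of the location of the maximum.

```python
import math


def hubble_ratio(z, omega0, lambda0):
    """H(z)/H_0 for an FRW model with dust and a cosmological constant."""
    curvature = omega0 + lambda0 - 1.0
    return math.sqrt(omega0 * (1 + z) ** 3
                     - curvature * (1 + z) ** 2 + lambda0)


def redshift_of_maximum(lambda0, z_limit=10.0, step=1e-3):
    """Redshift at which the angular-size distance peaks (flat model, eta = 1).

    Uses a cumulative trapezoidal integral for the comoving distance and
    returns the grid redshift with the largest distance. If no maximum lies
    below z_limit (e.g. lambda0 = 1), z_limit itself is returned.
    """
    omega0 = 1.0 - lambda0  # flatness
    comoving = 0.0
    previous = 1.0 / hubble_ratio(0.0, omega0, lambda0)
    best_z, best_distance = 0.0, 0.0
    for i in range(1, int(round(z_limit / step)) + 1):
        z = i * step
        current = 1.0 / hubble_ratio(z, omega0, lambda0)
        comoving += 0.5 * (previous + current) * step
        previous = current
        distance = comoving / (1 + z)  # angular-size distance, flat case
        if distance > best_distance:
            best_z, best_distance = z, distance
    return best_z


if __name__ == "__main__":
    for lambda0 in (0.0, 0.3, 0.5, 0.7):
        print(f"lambda_0 = {lambda0:3.1f}  ->  z_max = "
              f"{redshift_of_maximum(lambda0):.3f}")
```

For the Einstein-de Sitter model (λ 0 = 0) the maximum lies at z = 1.25, which follows analytically from the closed-form distance and provides a check on the numerical result.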
Also, Chen & Ratra (2012) examined constraints from the angular sizes of galaxy clusters, both for general FRW models and for two classes of flat models with different types of dark energy. Their conclusion is still valid today: such constraints are approximately as restrictive as those based on gamma-ray-burst apparent-luminosity data, strong-gravitational-lensing measurements, or the age of the Universe, but less so than those from BAO or the m-z relation for Type Ia supernovae (or the CMB). Nevertheless, as an independent constraint, the fact that they are compatible with other data strengthens our confidence in the concordance model.

ANALYTIC APPROXIMATIONS
In general, analytic solutions of the ZKDR distance are very complicated. Moreover, there are analytic solutions only for special values of λ 0 , Ω 0 , or η. Wang (1999) presented an approximation for the ZKDR distance as a polynomial in η with coefficients which depend on redshift and the cosmological parameters, the latter via the fact that the coefficients depend on the distance calculated for given values of Ω 0 and λ 0 for η values of 0, 0.5, 1, and 1.5. Note that η = 1.5 is in conflict with the assumptions under which the ZKDR distance is derived; nevertheless, this can be valid from a heuristic point of view (e.g. Lima et al. 2014). Of course, η > 1 everywhere is impossible, but it could be valid if η depends on the line of sight. In that case, however, one should think of it as an average along the line of sight, i.e. a particular line of sight might, by chance, have an above-average amount of matter along it. If this were constant, it would imply an extremely long structure aligned with the line of sight, which would not be compatible with an approximate FRW model.
Demianski et al. (2003) found 'an approximate analytic solution. . . which is simple enough and sufficiently accurate to be useful in practical applications' (see also Demianski et al. 2000, which is the precursor, but longer and substantially different in places). It is not clear how useful this is, though. It was apparently discovered more or less by accident and has no theoretical basis. As such, it is not clear a priori in which cases it is a good approximation, so one needs to test it against an (at least numerically) exact solution, in which case one might just as well use the better solution. (Lewis Carroll, in one of his less famous books, describes a map with a scale of 1:1, but it was easier to just use the real Earth than the map; Carroll 1893.) Also, the numerical implementation of KHS is, in most cases, only a factor of 3 or so slower than the elliptic-integral solution. Of course, one can compare only in those cases where such solutions exist; the numerical implementation knows no special cases, is valid for all values of the input parameters, uses the same algorithm for all of them, and has a speed which depends only weakly on the input parameters. There thus doesn't seem to be a real need for approximate solutions: even if such an approximation is faster than the elliptic-integral solution (and valid for all input parameters), the elliptic-integral solution is 'almost analytic' and reasonably fast, so a factor of 3 for a general and accurate numerical implementation is not a big disadvantage in practice. Though restricted to k = 0 and η = 1, similar remarks apply to the work of Pen (1999).
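To make concrete why direct numerical integration is a practical alternative to approximations, the following minimal sketch integrates the ZKDR equation for the angular-size distance from the observer with a fixed-step fourth-order Runge-Kutta scheme. It is not the KHS implementation; the function name, the step count, and the choice of integrator are assumptions made here for illustration. The equation is written in a commonly used form, D'' + (2/(1+z) + E'/E) D' + (3/2) η Ω 0 (1+z) D / E^2 = 0 with D(0) = 0 and D'(0) = 1, where primes denote derivatives with respect to z, E = H/H 0 , and D is in units of c/H 0 . Models in which E^2 reaches zero within the integration range (e.g. bounce models) are not handled.

```python
import math

def zkdr_distance(z_end, omega0, lambda0, eta, steps=2000):
    """ZKDR angular-size distance from the observer to z_end, in units of c/H0.

    Hypothetical helper for illustration: fixed-step RK4 on the second-order
    equation for D(z), with D(0) = 0 and D'(0) = 1.
    """
    curv = 1.0 - omega0 - lambda0  # curvature term; zero for a flat model

    def e2(z):
        # E(z)^2 = (H/H0)^2
        return omega0 * (1 + z) ** 3 + curv * (1 + z) ** 2 + lambda0

    def de2(z):
        # derivative of E^2 with respect to z
        return 3 * omega0 * (1 + z) ** 2 + 2 * curv * (1 + z)

    def deriv(z, d, v):
        # First-order system: d' = v, v' = -friction * v - focusing * d
        friction = 2.0 / (1 + z) + 0.5 * de2(z) / e2(z)  # E'/E = (E^2)'/(2 E^2)
        focusing = 1.5 * eta * omega0 * (1 + z) / e2(z)
        return v, -friction * v - focusing * d

    h = z_end / steps
    z, d, v = 0.0, 0.0, 1.0
    for _ in range(steps):
        k1d, k1v = deriv(z, d, v)
        k2d, k2v = deriv(z + h / 2, d + h / 2 * k1d, v + h / 2 * k1v)
        k3d, k3v = deriv(z + h / 2, d + h / 2 * k2d, v + h / 2 * k2v)
        k4d, k4v = deriv(z + h, d + h * k3d, v + h * k3v)
        d += h / 6 * (k1d + 2 * k2d + 2 * k3d + k4d)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        z += h
    return d

if __name__ == "__main__":
    # Flat model with Omega0 = 0.3, lambda0 = 0.7, for several values of eta
    for eta in (0.0, 0.5, 1.0):
        print(f"eta = {eta:.1f}: D(z=2) = {zkdr_distance(2.0, 0.3, 0.7, eta):.5f}")
```

The same routine covers all values of Ω 0 , λ 0 , and η for which E^2 stays positive, without special cases. In the Einstein-de Sitter model it can be checked against the closed forms D = (2/5)[1 − (1+z)^−5/2] for η = 0 and D = 2[(1+z)^−1 − (1+z)^−3/2] for η = 1.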

SUMMARY
The basis of observational cosmology is calculating the dependence of some observational quantity (usually related to some distance) on redshift for a variety of cosmological models, then determining the corresponding cosmological parameters by finding the model which gives the best fit to the data. Small-scale inhomogeneities can affect the relation between redshift and distance, so it at least needs to be investigated whether results depend on the amount of inhomogeneity. Zel'dovich (1964b) introduced a simple model for such small-scale inhomogeneities and an analytic solution (for the Einstein-de Sitter model) for the extreme case, namely that light propagates through completely empty space, all of the matter being located in clumps outside the beam. Subsequent work generalized that model to other cosmological models and/or intermediate degrees of inhomogeneity (later known as the ZKDR distance, after the initials of the most influential pioneers), investigated a similar approach involving so-called Swiss-cheese models (not necessarily more realistic, but exact solutions of the Einstein equations) which were later shown to correspond to the Zel'dovich (1964b) model in a well-defined way, investigated assumptions in the models and their effects (e.g. whether the clumps are transparent, or, if averaging, whether the average is taken over the celestial sphere or over all lines of sight), compared the results of the models with exact solutions or numerical simulations, and developed approximations to various distance formulae. Approximations are no longer needed, now that computing power has increased and an efficient numerical implementation is available for the general case (Kayser et al. 1997).
Most of the theory was complete by the middle of the 1970s. The discovery of the first gravitational-lens system in 1979 revived interest in this topic: since gravitational lenses obviously require an inhomogeneous universe, in such cases the assumption of a completely homogeneous universe with regard to light propagation becomes more obvious. Until the middle of the 1990s or so, effects of inhomogeneities were not that important in observational cosmology, for two reasons. First, the uncertainty in the cosmological parameters was large, comparable to (with respect to the effect on the distance as a function of redshift) variation in the inhomogeneity parameter η. Second, most observations were at low redshift, whereas η is a higher-order effect compared to the first- and second-order parameters H 0 and q 0 (see also equation (8) in Kantowski 1998). The use of Type Ia supernovae for the m-z relation extended observations to higher redshift. Also, both this test and others had constrained the cosmological parameters to a degree that the effect of η could no longer be ignored, which led to another revival of interest.
In general, the effect of η depends on angular scale: large angular scales correspond to a fair sample of the universe within the beam, while this is not necessarily the case for small angular scales. Since supernovae have an angular scale of about 10^−7 arcsec, which is very small, one would perhaps expect to see effects of η in the m-z relation for Type Ia supernovae. However, many independent investigations come to the conclusion that η ≈ 1, not just on average, as is to be expected, at least under certain assumptions (Weinberg 1976), but also for each individual line of sight. The reason for that is probably that the Zel'dovich (1964b) model is incorrect in the sense that it is not a good approximation for our Universe: no-one doubts that the ZKDR distance is correct in a universe with a mass distribution well modelled by that on which the idea of the ZKDR distance is based, but apparently that is not our Universe. In other words, most of the matter is not outside the beam, even for very narrow beams, but rather even such very narrow beams fairly sample the Universe. Note that η ≈ 1 does not necessarily imply that matter is distributed homogeneously within the beam; it just implies that the distance as calculated from redshift is approximately the same as if that were the case. In reality, such a beam will traverse voids with less than average density, but also regions (corresponding to the filaments and sheets of large-scale structure) with much higher than average density. Although this violates the assumptions on which the ZKDR distance is based, nevertheless in practice such a mass distribution results in distance as a function of redshift very close to the standard distance, i.e. that obtained by assuming that the universe, at least with regard to light propagation, is completely homogeneous.