Ultra Fast Astronomy: Optimized Detection of Multimessenger Transients

Ultra Fast Astronomy is a new frontier enabled by improved detector technology, allowing discovery of optical transients on millisecond to nanosecond time scales. These may reveal counterparts of energetic processes such as fast radio bursts, gamma ray bursts, and gravitational wave events, or play a role in the optical search for extraterrestrial intelligence (oSETI). We explore some example science cases and their optimization under constrained resources: essentially, how to distribute observations along the spectrum from short duration searches of many targets to long searches of fewer targets. As a demonstration of the method we present analytic and numerical optimizations, both of raw detections and of science characterization, such as an information matrix analysis of constraining a burst delay - flash duration relation.


INTRODUCTION
The transient sky is a treasure trove (Wyatt et al. 2020) of events and information about the energetic universe, ranging over gamma ray to radio wavelengths in light, plus neutrinos and gravitational waves, and over time scales from milliseconds to months. Well known cosmological examples include fast radio bursts (FRB) and gamma ray bursts (GRB) on millisecond to second timescales, binary black hole inspiral gravitational waves over milliseconds to days, and supernovae over days to months. Of course the entire universe can be viewed as transient over long enough timescales (Stebbins 2012; Erskine et al. 2019; Kim et al. 2015; Marcori et al. 2015; Bel & Marinoni 2018; Quercellini et al. 2012).
Time domain surveys specifically seek to explore the transient sky, with the next generation now getting underway (Bellm et al. 2019; Mandelbaum et al. 2018). Continuously scanning surveys can also detect transient events (Naess et al. 2021; Abazajian et al. 2019; Slosar et al. 2019; Whitehorn et al. 2016). Multimessenger detection of an event in photons of diverse wavelengths, in other high energy particles such as neutrinos or cosmic rays, or in gravitational waves is a burgeoning new field, e.g. Kalogera Kasliwal et al. (2019). However, all these surveys tend to scan over seconds to days, insensitive to transients on shorter timescales (but see Arimatsu et al. (2021); Yang et al. (2019); Shearer et al. (2010)). Certainly we know that astrophysical energetic events can occur down to nanosecond timescales, e.g. Philippov et al. (2019); Stebbins & Yoo (2015). If optical (and ultraviolet, infrared, etc.) surveys sensitive to subsecond timescales can be realized, might they reveal a whole new world of events at cosmological distances?
Detector technology and imaging software are reaching the capability to make such surveys a reality. In particular, the development of silicon photomultiplier arrays (SiPM) with ultrafast readout and coincidence triggers (Lau et al. 2020; Li et al. 2019b) may open up the era of Ultra Fast Astronomy, on millisecond and even sub-microsecond timescales in the optical and near infrared. Other detectors such as avalanche photodiodes (Li et al. 2019a) are also being explored. While they still have a long way to go in terms of large arrays, spatial resolution, and noise including crosstalk, they are capable of single photon detection and continuous readout.
The discovery of optical transients on millisecond and shorter timescales would be a major breakthrough, rife with astrophysical information and numerous applications. Here we focus on one particular aspect of this new field of Ultra Fast Astronomy: transients visible in multiple windows, e.g. gamma rays and optical light, or gravitational waves and optical light, that can be targeted, i.e. there is a precursor or repeating event. That is, we specifically seek connections with known, or yet to be found, ultra fast transients. Examples include the search for optical counterparts of known energetic events, such as repeating fast radio bursts (FRB), delayed optical phenomena from a one time event, or even as part of an optical search for extraterrestrial intelligence (oSETI) program (Wright et al. 2018; Kipping & Teachey 2017; Davenport 2019).
The size of detector arrays, the possible field of view, and the instrumental characteristics are as yet unknown. What we aim for here is merely to give a flavor of the science investigations and methods that might prove useful. Of necessity these will need to be adjusted for specific situations, e.g. particular types of bursts to follow, instrument properties, etc. However, the general optimization framework we present should be a good guiding tool, regardless of the specific burst properties used as toy examples.
Framed generally: given a set of adaptable observations, a merit function to optimize, a limited resource, and a cost function per observation, what is the best strategy for achieving some science characterization? Our main topic of investigation will be ways of optimizing a detection or characterization of an astrophysical process given a large set of targets (e.g. prior burst locations) but limited observing resources, e.g. how much time should be spent on each target to best achieve the proposed goal. We emphasize that this is in essence a follow up program; we assume a target list of interesting objects, e.g. repeating FRB, exists and we seek detection of optical transients from them.
Astrophysically, such an optimization analysis has been carried out for strong gravitational lens surveys (Linder 2015) and supernova surveys (e.g. Huterer & Turner (2001); Frieman et al. (2003)). For example, there the optimization was over the data set distribution (the number of targets followed up at different redshifts), while the merit function was the dark energy joint parameter estimation uncertainty ("figure of merit"), the limited resource was the total spectroscopy time, and the cost function expressed the resource use of followup spectroscopic time by a target at redshift z. There may also be a systematics function that effectively caps the number of useful targets in each observation (e.g. redshift) bin.
To develop and demonstrate the method we explore three separate science approaches. We emphasize that these are only toy examples; applications of Ultra Fast Astronomy are likely to be much richer and more varied, and of course instrumental specifics will play a much larger role. In Section 2 we describe the basic foundation, in terms of a target source (known for repeating or predictable bursts) and its possible optical counterparts, here called flashes. Section 3 demonstrates the optimization procedure on the question of how to allocate limited observing time if one wants to use flash abundances to connect flash properties to burst characteristics. A more physically incisive test of a property such as a burst delay - flash duration relation, possibly useful as a test of oSETI, is optimized in Section 4 when using measurements of flash durations rather than mere abundances. Such examples should give the flavor, and the exciting potential, of the general approach. We discuss further potential opportunities from technology development and conclude in Section 5. In Appendix A we present some analytic results on maximizing detections by considering basic two or three time bin optical followup programs of repeating bursts, using this simple example to illustrate some general principles of resource constrained optimization.

BURST OPTICAL COUNTERPARTS
The basic idea is that there is a list of targets for which we seek to detect optical transients. In particular, we consider the targets to be sources that burst in some other wavelength, and we look for optical counterparts. One example might be fast radio bursts, which we follow up with an optical program to look for ultra fast optical transients. This phrasing is purely a matter of convenience and an indication of an interesting science goal, without drawing on specific FRB properties; the principles are kept generally applicable. Fast radio bursts are millisecond transients detected at radio wavelengths (Petroff et al. 2019). A fraction of them repeat, though not necessarily periodically. They may, however, have "active windows" that are periodic, with enhanced probability of a repeat burst sometime within the window; this has also been detected for soft gamma repeaters (Denissenya et al. 2021).
The optical program is thus essentially a followup program, scanning targets that have previously burst, not a blind survey. We distinguish the transients by saying that we look for flashes (e.g. in optical) that are targeted on previous bursts (e.g. in radio), and in particular targeted at a time when we have some reason to expect a repeat burst so that we can look for coincident, or nearly coincident, flashes and bursts. (Also, active windows for bursts do not always deliver actual activity, but it is possible there may be activity of interest at that time in other wavelengths.) Determining whether FRB, or any other bursting sources, have optical emission, and measuring it, would be important for understanding the physics behind the source; so far the millisecond timescale has been too short for such optical (or UV or NIR) observations. As initial goals of Ultra Fast Astronomy we might seek to maximize the chances of detecting an optical counterpart, i.e. the number of such burst counterparts, and to learn some basic physics behind the flash. We discuss these in Section 3 and Section 4 respectively.
The first step to investigate the relation between flashes and bursts is to take a probability for a burst within an observing window and propagate it to the number of flashes over a distribution of observing times, i.e. a survey. Suppose the probability distribution function (PDF) of a burst repeating a time t after a previous burst (so we know where to look) and having an optical counterpart (so there is something to detect) is some normalized p_burst(t). Then in a window of time [t_{i-1}, t_i] the probability of a viable repeat burst is

P_i = ∫_{t_{i-1}}^{t_i} dt p_burst(t) .   (1)

This may be a function of burst properties, and we may later be interested in subtypes, but for now we simply consider any burst. Our basic question is whether we want a quick look at many targets or a long look at fewer targets: what is the optimum allocation? The observing procedure is taken to be assignment from a target list to observing times of various durations. For example, some targets we will observe for a duration t_1, and if no flash is detected we move on to another target. Some targets we will observe for t_2: if it does not flash by time t_2 we move on; if it flashes in the first t_1 time then we do not continue to observe it afterward, and instead assign it to the t_1 "bin"; and if it flashes between times t_1 and t_2 then it is in the bin corresponding to t_2, etc. We consider an ordered set of discrete times t_i, where t_i > t_{i-1} and t_0 = 0. This is both for mathematical simplicity and because of the possibility that instrumental constraints may impose certain intervals. One could redo the calculations with finer divisions; this does not affect the overall principle. The target list should be viewed as locations of interest; we do not know exactly when the next event will occur, certainly not on the subsecond time scale of the transients.
Unlike a rolling search for supernova explosions, say, where we target a set of galaxies and the supernovae stay visible for a month or more, bursts go undetected if we are not already observing their location.
The number of flashes detected is

N_flash = Σ_i n_i P_i ,   (2)

where n_i ≡ N_target(t_i) is the number of targets with observation duration t_i (i.e. producing, or not, a flash between t_{i-1} and t_i). Again the question we focus on is what is the optimum allocation n_i(t_i): do we want a quick look at many targets or a long look at fewer targets? (The following calculations assume a single burst population; at the end of Section 4 we explore multiple source populations, corresponding to an extra summation index in Eq. 2 for example.) A further real world complication is limited resources, e.g. telescope time, meaning that we cannot observe as many targets as we want for as long as we want. Each observation comes with a cost, taking up part of finite resources R. The cost function may be simply proportional to observing time, i.e. how much time we dedicate to searching for a flash from a target, so

R = Σ_i n_i t_i .   (3)

Here we take all times, and R, to be in units of t_1, the shortest observation time. That may be one second, one day, or whatever; that will depend on the specific target population and instrument, but we leave that for future detailed survey design. Given detailed instrument and survey design, the observing procedure can be further adjusted by employing Monte Carlo methods taking into account, e.g., telescope slew and readout times (see the discussion on such issues at the end of Section 4). We also emphasize that at this stage we are not aiming for a complete survey, detecting all counterparts, but rather a reasonable attempt at an initial set of counterpart flashes.
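The survey bookkeeping above can be sketched in a few lines. This is a minimal illustration, not from the paper: the Gaussian burst-delay PDF with mean t_p = 40 and width σ = 10, and the uniform one-target-per-bin allocation, are our own illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's code) of the expected
# flash count and resource cost for an assumed Gaussian burst-delay PDF.
from math import erf, sqrt

def window_prob(t_lo, t_hi, t_p=40.0, sigma=10.0):
    """Probability of a viable repeat burst in the window [t_lo, t_hi]."""
    z = lambda t: (t - t_p) / (sqrt(2.0) * sigma)
    return 0.5 * (erf(z(t_hi)) - erf(z(t_lo)))

t_bins = list(range(1, 101))      # observation durations t_i, in units of t_1
n = {t: 1 for t in t_bins}        # a uniform allocation: one target per bin

N_flash = sum(n[t] * window_prob(t - 1.0, t) for t in t_bins)  # expected flashes
R = sum(n[t] * t for t in t_bins)                              # resource cost
print(N_flash, R)   # R = 5050 for this uniform allocation
```

With this allocation the expected flash count is just the total probability of a burst within [0, 100], close to unity for these toy parameters.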

DELAY-DURATION RELATION -VIA ABUNDANCE
While detecting ultra fast optical transient counterparts would be exciting and significant, it would be even better to extract some physics out of the detections. We consider the following as one illustration. Suppose the optical flash durations τ are connected with the burst repeat delay time t_p through a power law scaling. We write

τ = A t_⋆ (t_p/t_⋆)^s ,   (4)

where t_⋆ is a fixed pivot scale, A is the dimensionless amplitude of the relation (with a t_⋆ out front for correct dimensionality), and s is the slope relating the mean burst delays to the flash durations. This could be relevant to the astrophysical bursting mechanism, e.g. of FRB or GRB, and also to oSETI, where a civilization might choose to broadcast short messages frequently, or long messages infrequently, in a sort of constant energy output (s = 1) schema. We would like to test such an assumption, e.g. is s ≠ 0 so that there is a relation between the burst delay t_p and the flash duration τ, and characterize it: what estimates of amplitude A and slope s do we obtain from data? We will use an information matrix approach to determining the physical parameters A and s from the flash detections, and then optimize the observation time distribution to get the best constraints. In this section we take the observable to be the number of flashes in a certain observation time bin t_i, summed over the flash durations τ. We want to relate these measured abundances to the underlying process parameters of the delay-duration amplitude A and slope s. (In Section 4 we take a more direct approach, using the flash durations themselves.) We begin with the number of flashes with duration τ detected from all bursts observed with observation time duration in bin i: t_{i-1} < t_obs < t_i. A subscript i means such a time bin and we take all bins to be of equal, unit width. Then

N_flash(t_i, τ) = n_i p_flash(t_i; τ) = n_i p_flash(t_i; t_p) |dt_p/dτ| ,   (5)

where we used p_flash(t_i; τ) dτ = p_flash(t_i; t_p) dt_p.
The quantity p_flash(t_i; t_p) is taken to be the probability of the repeat burst within the observing time, i.e. Eq. (1), times some probability p_⋆ for having a flash associated with a burst. We set p_⋆ = 1 for simplicity; a constant p_⋆ does not affect the form of our results. We consider a Gaussian delay model where

p_burst(t) = (1/√(2πσ²)) e^{-(t - t_p)²/(2σ²)} .   (6)

This represents a burst that occurs a mean time t_p after time zero (e.g. the previous burst or the beginning of an activity window), but with some uncertainty σ. While this is just a model, it is a reasonable first choice; one would need to determine the burst delay distribution in the process of building up the target list. The probability for a flash within a window [t_{i-1}, t_i], and hence being assigned to observation duration t_i, is then

p_flash(t_i; t_p) = ∫_{t_{i-1}}^{t_i} dt p_burst(t)   (7)
                  = (1/2) { erf[(t_i - t_p)/(√2 σ)] - erf[(t_{i-1} - t_p)/(√2 σ)] } .   (8)

This gives all the necessary ingredients for the calculation. Using the inverse of Eq. (4),

t_p = t_⋆ (τ/(A t_⋆))^{1/s} ,   (9)

we can calculate the Jacobian |dt_p/dτ| = t_p/(s τ). Explicitly,

|dt_p/dτ| = (t_⋆/(s τ)) (τ/(A t_⋆))^{1/s} .   (10)

To carry out the information analysis, we need the sensitivities, the derivatives with respect to the parameters θ = {A, s},

∂N_flash/∂θ = N_flash [ (1/p_i) (dp_i/dt_p) (∂t_p/∂θ) + ∂ ln|dt_p/dτ|/∂θ ] ,

where p_i(t_p) ≡ p_flash(t_i; t_p). The derivatives of the inverse relation are

∂t_p/∂A = -t_p/(s A) ,   ∂t_p/∂s = -(t_p/s²) ln[τ/(A t_⋆)] ,

and for the Jacobian factor

∂ ln|dt_p/dτ|/∂A = -1/(s A) ,   ∂ ln|dt_p/dτ|/∂s = -(1/s²) ln[τ/(A t_⋆)] - 1/s .

Putting it all together,

∂N_flash/∂A = -(N_flash/(s A)) [ (1/p_i)(dp_i/dt_p) t_p + 1 ]

and

∂N_flash/∂s = -N_flash { (1/p_i)(dp_i/dt_p) (t_p/s²) ln[τ/(A t_⋆)] + (1/s²) ln[τ/(A t_⋆)] + 1/s } ,

where the derivatives are evaluated at t_p,fid = t_⋆ (τ/(A t_⋆))^{1/s}. For p_i(t_p) given by Eq. (8), its derivative dp_i/dt_p = p_burst(t_{i-1}) - p_burst(t_i) is a difference between Gaussians; note the 1/p_i(t_p) term cancels the p_i(t_p) in N_flash.
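The power law relation, its inverse, and the Jacobian are easy to sanity check numerically. The sketch below is our own addition (the finite-difference check in particular), using the text's fiducial values A = 1/4, s = 1, t_⋆ = 40.

```python
# Sanity check (ours, not the paper's) of the delay-duration relation,
# its inverse, and the Jacobian |dt_p/dtau| = t_p/(s tau).
A, s, t_star = 0.25, 1.0, 40.0      # fiducial values from the text

def tau_of_tp(t_p):                 # the power law relation
    return A * t_star * (t_p / t_star) ** s

def tp_of_tau(tau):                 # its inverse
    return t_star * (tau / (A * t_star)) ** (1.0 / s)

def jacobian(tau):                  # |dt_p/dtau| = t_p / (s tau)
    return tp_of_tau(tau) / (s * tau)

tau = tau_of_tp(40.0)               # tau = 10 at the pivot for A = 1/4
eps = 1e-6                          # finite-difference check of the Jacobian
fd = (tp_of_tau(tau + eps) - tp_of_tau(tau - eps)) / (2.0 * eps)
print(tau, tp_of_tau(tau), jacobian(tau), fd)
```

The round trip τ → t_p → τ and the agreement of the analytic Jacobian with the finite difference confirm the expressions are mutually consistent.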
We also need to specify the abundance measurement noise matrix, e.g. Poisson measurement error on the abundances so that C^{-1} = N_flash(t_i, τ), and the fiducial values, e.g. s = 1 and A = 1/4, with a pivot scale t_⋆ = 40. The information matrix is summed over the bins in τ, with the constraint that τ < t_i (we cannot measure a duration longer than we have observed for), and then summed over bins in observing time t_i. Thus,

F_jk = Σ_{t_i} Σ_{τ < t_i} [∂N_flash(t_i, τ)/∂θ_j] C^{-1} [∂N_flash(t_i, τ)/∂θ_k] .

Figure 1 shows N_flash(t_i, τ) and the sensitivity derivatives ∂ ln N_flash/∂θ for our parameters, θ = A and s. We see the sensitivity curves vs the flash duration τ have very different shapes, implying little covariance is expected in their determination. Both the delay-duration amplitude A and slope s should be well determined if we observe a range of durations. Figure 2 illustrates the information analysis results for the parameter uncertainties and the figure of merit (det F)^{1/2}, related to the inverse area of the amplitude-slope confidence contour. We see that the slope is determined at about the same level for t_obs ≳ 10 while the amplitude is best determined near the pivot scale t_⋆ = 40. Due to the changing covariance between the parameters, however, the figure of merit (FOM) ends up being monotonic, gaining more information with a longer observation time. Figure 3 shows the joint 1σ confidence contours for three different values of t_obs. As expected from Fig. 2, the estimation of the slope s changes little, but the probability ellipse rotates to give a smaller range for amplitude A near t_i = 40, thinning the ellipse as t_i increases, giving a smaller area and larger FOM.
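The per-bin information sum can be implemented directly with numerical parameter derivatives. The following is our own toy sketch, not the paper's optimization code: it treats a single observing bin with n_i = 1 and assumes a Gaussian window width σ = 10; the fiducials A = 1/4, s = 1, t_⋆ = 40 follow the text.

```python
# Toy implementation (ours) of the abundance-based information matrix
# for one observing bin t_i, with the text's convention C^{-1} = N_flash.
from math import erf, sqrt

t_star, sigma, n_i = 40.0, 10.0, 1.0   # sigma = 10 is an assumed window width

def window_prob(t_lo, t_hi, t_p):
    """Probability of a burst in [t_lo, t_hi] for a Gaussian delay PDF."""
    z = lambda t: (t - t_p) / (sqrt(2.0) * sigma)
    return 0.5 * (erf(z(t_hi)) - erf(z(t_lo)))

def N_flash(t_i, tau, A, s):
    """Counts per (t_i, tau) bin: window probability times Jacobian."""
    t_p = t_star * (tau / (A * t_star)) ** (1.0 / s)
    jac = t_p / (s * tau)               # |dt_p/dtau|
    return n_i * window_prob(t_i - 1.0, t_i, t_p) * jac

def fisher(t_i, A=0.25, s=1.0, eps=1e-5):
    """2x2 F_jk summed over tau < t_i, derivatives by finite difference."""
    F = [[0.0, 0.0], [0.0, 0.0]]
    for tau in range(1, int(t_i)):      # only tau < t_i is measurable
        N0 = N_flash(t_i, tau, A, s)
        if N0 <= 0.0:
            continue
        g = [(N_flash(t_i, tau, A + eps, s) - N_flash(t_i, tau, A - eps, s)) / (2 * eps),
             (N_flash(t_i, tau, A, s + eps) - N_flash(t_i, tau, A, s - eps)) / (2 * eps)]
        for j in range(2):
            for k in range(2):
                F[j][k] += g[j] * N0 * g[k]
    return F

F = fisher(40.0)
det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
print(F, det ** 0.5)                    # FOM = (det F)^{1/2} for this bin
```

Because the A and s sensitivity curves have different shapes in τ, the determinant is strictly positive and both parameters are constrained from a single observing bin.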
For the full information analysis, one sums over the whole range of observing time bins t i , weighted by the number of targets in that bin, n i . This will of course reduce the uncertainties on the parameter estimation.  However our main aim is to determine the optimum observing strategy, i.e. the optimum distribution n i , under our observing time resource constraint and cost model.
We use two independent codes to carry out the optimization. One is the information analysis optimization code described in Linder (2015), adapted for the present observables and variables, that evaluates the change in merit with resource amount (observing time), bin by bin, selects the bin with weakest leverage, reduces that n i , and reallocates its time to other bins (hence changing n j ) in a fashion weighted by the merit per resource used. The other uses the interior point optimization algorithm which combines the resource constraint and the merit function using the barrier function approach (Weisstein 2021). They provide valuable crosschecks on results and convergence, and we find excellent agreement between them.
The results show this optimization favors maximum numbers in the lowest observation time bin, due to the resource constraint. While the FOM improves with t_obs, lower t_obs allows many more targets within the given resource constraint. Since the information matrix basically goes as n_i³ (two factors from the observable N_flash being proportional to n_i, entering twice, and one from the Poisson noise model), putting all observations in the lowest t_obs bin wins. Of course there may be other constraints limiting the number of targets, such as from available repeaters, the telescope, detectors, etc. In that case, the optimum found fills up the lowest bin to the maximum allowed, proceeds to the next lowest bin, and so forth until the resource allotment runs out.
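The capped fill-from-the-bottom allocation just described can be sketched in a few lines; the cap and budget values below are illustrative choices, not numbers from the text.

```python
# Sketch of the capped allocation: fill the lowest observing-time bin to
# its cap, then the next, until the budget R runs out (illustrative values).
def allocate(t_bins, R, cap):
    """Greedy fill from the shortest observation time upward."""
    n = {}
    for t in sorted(t_bins):
        n_t = min(cap, R // t)     # respect both the target cap and the budget
        n[t] = n_t
        R -= n_t * t
    return n, R

n, leftover = allocate([1, 2, 3, 4], R=100, cap=30)
print(n, leftover)   # {1: 30, 2: 30, 3: 3, 4: 0} with 1 unit left over
```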
A short analytic proof of the behavior comes from considering the FOM if all observing is in bin p vs bin q > p. In the first case, FOM_1 = c n_p³ p, where from Fig. 2 we approximate the dependence of the FOM on t_obs as roughly linear. In the second case FOM_2 = c n_q³ q, but the resource constraint imposes n_q = n_p p/q. Therefore we have FOM_2 = FOM_1 (p/q)² < FOM_1 since p < q. Thus the lower bin always wins for this model with the abundance as the main observable.
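A quick numerical check of this scaling argument, with an arbitrary proportionality constant c and illustrative bins p and q:

```python
# Check: with FOM = c n^3 t and fixed resources n t = R, concentrating
# all observing in the lower bin always wins by a factor (q/p)^2.
def fom(n, t, c=1.0):
    return c * n**3 * t

R = 1200.0
p, q = 10.0, 30.0          # two candidate observing-time bins, q > p
n_p = R / p                # all resources in bin p
n_q = R / q                # all resources in bin q
ratio = fom(n_q, q) / fom(n_p, p)
print(ratio, (p / q) ** 2)
```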

DELAY-DURATION RELATION -- VIA DURATION MEASUREMENT
The optimization in the previous section led to an all or nothing result: the solution involved flitting from target to target as fast as possible to observe as many as one could within the resource constraint. This was driven by the strong dependence of the information matrix on the number of events, due to taking the abundance as the central observable. A more subtle, and interesting, result comes if the observable is the flash duration itself, for deriving the relation to the burst delay. We analyze this case here.
The information matrix now becomes

F_jk = Σ_{t_i} Σ_{τ < t_i} N_flash(t_i, τ) (∂τ/∂θ_j) C^{-1} (∂τ/∂θ_k) ,   (20)

where C is the duration measurement noise matrix. The sensitivities are

∂τ/∂A = τ/A ,   ∂τ/∂s = τ ln(t_p/t_⋆) ,

the Jacobian |dt_p/dτ| is given by Eq. (10), and for C we take a diagonal matrix with elements C = σ_τ² δ_qq′ where q is the index for the τ bin and

σ_τ² = σ_stat² + N_flash(t_i, τ) σ_sys² .

The statistical contribution may depend on τ, though here we take a fiducial case σ_stat = 1; the systematic term imposes a floor on the measurement uncertainty, or a ceiling on the number of events, such that increasing their quantity does not significantly add to the information beyond some limit (i.e. it cancels the N_flash(t_i, τ) in the numerator of F_jk: effectively, instead of a measurement uncertainty σ_stat/√N_flash one has σ_sys). We investigate two fiducials, σ_sys = 0 or 1, as the simplest possible examples. Real systematics would be determined by the experiment design, involving instrumentation properties related to readout, crosstalk, etc. Ultra Fast Astronomy instrumentation is not yet sufficiently advanced to yield an estimate of such systematics, so we use our naive model to illustrate how it enters the optimization method we present.
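The saturation effect of the systematic floor is easy to see in one (t_i, τ) term of the information matrix. The sketch below is ours, using the equivalent per-bin variance σ_stat²/N_flash + σ_sys²; the N_flash counts passed in are illustrative stand-ins, and the fiducials A = 1/4, s = 1, t_⋆ = 40 follow the text.

```python
# Sketch (our toy numbers) of how the systematic floor sigma_sys caps the
# information gain from one (t_i, tau) bin of the duration measurement.
from math import log

A, s, t_star = 0.25, 1.0, 40.0

def sensitivities(tau):
    """(dtau/dA, dtau/ds) for tau = A t_* (t_p/t_*)^s."""
    t_p = t_star * (tau / (A * t_star)) ** (1.0 / s)
    return tau / A, tau * log(t_p / t_star)

def fisher_entry(tau, N_flash, sigma_stat=1.0, sigma_sys=0.0):
    """2x2 contribution of one (t_i, tau) bin to F_jk."""
    var = sigma_stat**2 / N_flash + sigma_sys**2   # effective sigma_tau^2
    gA, gs = sensitivities(tau)
    return [[gA * gA / var, gA * gs / var],
            [gs * gA / var, gs * gs / var]]

F_stat = fisher_entry(20.0, N_flash=100.0)                   # statistics only
F_sys = fisher_entry(20.0, N_flash=100.0, sigma_sys=1.0)     # with floor
F_sys10 = fisher_entry(20.0, N_flash=1000.0, sigma_sys=1.0)  # 10x the counts
print(F_stat[0][0], F_sys[0][0], F_sys10[0][0])
```

With σ_sys = 0 the information grows linearly with N_flash; with σ_sys = 1 it saturates, and even ten times the counts gains almost nothing, which is what drives the broader optimized distributions below.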
We start the optimization from an initial state of uniform n_i across all bins t_i = [1, 100], giving a fixed resource constraint of R = 5050. Of course the survey becomes less sensitive to sources with delay times near or beyond the limit, but we want to maintain a broad survey. First we take a purely statistical uncertainty, with σ_stat = 1. Parameter estimation results will scale linearly with this, and as the reciprocal square root of the total resources available; however it will not affect the form of the distribution optimization for n_i (for σ_stat independent of τ). The final converged optimization gives two peaks in the n_i target distribution, at t_i ≈ 6 and 100, as seen in Fig. 4. Note Huterer & Turner (2001) also found that for a system with P parameters the distribution is optimized with P delta functions. The numerical solution approaches isolated peaks, with earlier iterations having slightly broader peaks, but these possess very close to the final parameter uncertainties and FOM, so we see that small variations in the distribution (e.g. due to observational requirements) would have only a minor impact on the science results.

[Figure 4 caption: The solid histograms are for a purely statistical measurement uncertainty of σ_stat = 1. Distribution results n_i are shown for three different stages in the optimization process, approaching the final convergence with sharp peaks at t_i = 6 and t_i = 100. Parameter constraints generally have smaller deviations from their final values than those listed for the FOM. The dotted brown curve shows the impact of turning on a systematic, with σ_sys = 1, where the optimized histogram is plotted at ten times its true height (i.e. really n_i < 6).]
A systematic measurement uncertainty on τ, given by σ_sys, will effectively place a ceiling on the number of productive targets in a given observing bin t_i. This flattens the distribution peaks, broadening them and distributing the targets over many more bins. Including the systematics in quadrature, with σ_sys = 1, worsens the parameter estimation by 22% on A and 44% on s, and the FOM by 33%.
We study the sensitivity of the FOM to the fiducial parameters in Fig. 5. Our fiducial value for the slope, s = 1, is seen to be the most conservative choice: other values of s give a higher FOM and tighter parameter estimation. This makes sense in that, broadly, when s = 1 some parameter dependence does not enter, e.g. in Eq. (10) and hence in N_flash. More specifically, for smaller s we also amplify the sensitivity ∂τ/∂s and increase the Jacobian dt_p/dτ. Thus we expect small s to show the greatest improvement in parameter constraints and FOM. What we find for s > 1 is that, while the parameter constraints weaken, the FOM increases since there is less covariance between parameters; this is due to the enhanced range of τ contributing to a given favored t_p near t_i from the 1/s exponent in the relation Eq. (9).
For the amplitude A, the main effect is that larger τ is preferred for a given t_p near t_i. Since the sensitivities strengthen with increased τ, and the information goes as the product of sensitivity factors, this overcomes the reduced dt_p/dτ. In fact, since ∂τ/∂s is the only sensitivity benefiting (with the increase in A and τ canceling in ∂τ/∂A), the improvement in the estimation of s and in the FOM is roughly linear in A, while the estimation of A weakens slightly. (Only the FOM is shown in Fig. 5.) Increasing the systematic σ_sys of course decreases the FOM and worsens the parameter estimation. Even within the naive model one can get a sense, at least qualitatively, for how systematics can impact the results. The uncertainty in the burst wait time, σ, has a minor effect, since it only enters into the probability factor and not the sensitivities. Decreasing σ makes it a little harder for t_i to match t_p, but we can still find some t_i matching t_p, where the probability is maximal. We have verified that varying σ continues to produce a sharp peak at t_i = 7 or slightly earlier, while preserving another peak at t_i = 100. Figure 6 shows how the parameter constraints evolve as we change the fiducial slope s, as well as the impact on the FOM (inverse area of the confidence contour). In particular, we can see how for s = 1.5, although the individual parameter constraints are weaker, the FOM is higher due to the narrowing of the joint contour. As the fiducial s changes, the covariance between A and s does as well, rotating the contour and squeezing or broadening it. The slope and amplitude of the burst delay - flash duration relation can be determined at the percent level, within this model, for the resources assumed. This gives the flavor of how such a result could be promising for testing, e.g., an oSETI hypothesis of constant energy output, i.e. τ/t_p = constant, or s = 1.
Since this plot is with statistical uncertainties only, the constraints scale with resources: linearly for FOM and as the inverse square root for the parameter estimation uncertainties.
Finally, suppose we have multiple source populations with different properties. While the sum over flash duration τ in Eq. (20) does sum over a range of burst durations by Eq. (9), different populations may have different relations. We therefore now consider two populations, with relations having characteristics t_⋆ = 40 or t_⋆ = 20, and determine how the optimization changes. Figure 7 varies the t_⋆ = 20 population from 0% of the total (as used in the previous calculations) to 100%. We see this has very little impact on the observation duration optimization: the preferred survey still has a two peak distribution of times, one at the maximum duration and one at much shorter (but not minimum) duration. The lower peak does shift a little, from t_obs = 6 to 7, depending on the population fraction, but we saw in Fig. 4 that the figure of merit is fairly forgiving of slight deviations from the formal optimum. Figure 8 shows that the constraints on the parameters of the flash duration - burst delay relation do change somewhat with source population fraction. The amplitude uncertainty σ(A) shows a 62% variation from fraction 0 to 1, while the slope uncertainty σ(s) only varies by 6%; the covariance between the two is important and compensates such that the FOM only varies by 16%.
A technical question we have not addressed concerns scheduling. How should one decide which targets to observe at what time, especially as observing one target may mean missing a favored time for another target? To the extent that all targets are equal, for example during the first exploratory survey when it is not known if any have optical counterparts to follow up, then a random choice as used here is fine; or one might focus on targets at the best airmass for better signal to noise, or close together for reduced telescope slew time. When UFA detectors advance to the state where large arrays can cover a wide field, then several targets can be followed simultaneously. As mentioned in Section 2, we have not aimed for a complete survey but one with a promising and interesting set of initial detections. Other considerations may be a desire for a redshift distribution or a concentration on, say, galaxy clusters. These are all important questions, and as we learn more about the source populations and instrument characteristics the survey planning would extend to these issues.

CONCLUSIONS
Ultra Fast Astronomy has the potential to uncover a new view of the universe, unknown to surveys scanning celestial sources on time scales of seconds or longer. Technology such as arrays of photomultipliers and photodiodes is approaching the capability of millisecond, microsecond, or even nanosecond resolution in the optical and possibly NIR time domain.
This may reveal not only new classes of transients, but connections between transients across wavelengths and multimessenger signals. Such detections could provide critical clues to the physical origins and mechanisms behind highly energetic events -or even hints in the search for extraterrestrial intelligence.
We present a method and explore three cases of optimizing observations so as to maximize science under constrained resources, such as telescope time. We investigate science characterization from counterpart (flash) detection, such as testing a burst delay -flash duration relation. First we analyze optimizations in terms of measured abundances, and then using measured flash durations. The study includes interpretation of the parameter estimation constraints and figure of merit (inverse area of the joint confidence contour), scaling with intrinsic and survey parameter values, and role of statistical and systematic uncertainties. In the Appendix, we look at optimizing the number of counterpart detections, and solve analytically a toy model of how to distribute search durations before moving to another target, just using this simple example to illustrate some general principles of resource constrained optimization.
Under constrained resources, the derived optimum for the delay-duration model studied in the most physically incisive case, Section 4, is a two pronged observational strategy: many targets observed for a short (but not minimal) time and a few targets observed for a (maximally) long time. This continues to hold when including multiple source populations, at least for the simple two population case we calculated. The resulting parameter estimation can reach the percent level. For example, a "constant energy output" schema of extraterrestrial signaling could be measured at signal to noise S/N ≈ 100 for our fiducial resource level.
These are examples of survey strategy and science analysis that could be useful for Ultra Fast Astronomy. They are meant to be illustrative, and hopefully inspirational, of methods and potential science goals, not (yet) rigorous or comprehensive. Throughout, we assume an established target list of burst sources (e.g. in the radio or X-ray) that we want to scan for optical flashes. The specific optimization depends on having guidance on the burst repeat time probability distribution function, e.g. whether it favors rapid repeats, long term ones, or ones tending to occur in some characteristic window. Thus the Ultra Fast Astronomy survey would be matched to the burst populations being followed up. One could alternately of course carry out a blind scan of the sky or target interesting nontransient or nonrepeating sources; all these types of surveys will no doubt play a role in Ultra Fast Astronomy. Another important aspect of future work that will go hand in hand with the technology development is connecting the measurement uncertainty model more closely with promising detector technology characteristics. This is a new frontier, and there are many directions to explore.
We thank Albert Wai Kit Lau for helpful discussions. This work is supported in part by the Energetic Cosmos Laboratory. EL is supported in part by the U.S. Department of Energy, Office of Science, Office of High Energy Physics, under contract no. DE-AC02-05CH11231.

ANALYTIC CASES
Here we consider a much more restricted question than in the main text, but one useful as an analytic example. If our main focus is purely detecting a maximum number of flashes, without dealing with their properties, then we seek to maximize N_flash. Combining Eqs. (2) and (3), using the resource constraint to eliminate n_1 (recall all times are in units of t_1), we have

N_flash = p_1 R - Σ_{i>1} n_i (t_i p_1 - p_i) .   (A1)

To maximize N_flash with respect to the distribution n_i, we evaluate the partial derivative, ∂N_flash/∂n_i = -(t_i p_1 - p_i), giving a critical condition

p_i/p_1 = t_i/t_1 .

This is readily understandable as the boundary determining the sign of the term with the minus sign prefactor in Eq. (A1), thereby increasing or decreasing N_flash. There are three cases. One is that this has no solution: while by construction t_i > t_1, we may have p_i < p_1, depending on the form of p_burst(t). In this case the quantity in parentheses in Eq. (A1) is positive and so the maximum N_flash is when the sum goes to zero. That is, the optimum (in terms of many detections, not the science property characterization of the main text) is to only observe on the shortest timescale, t_1, with n_{i>1} = 0. A second case is when p_i > p_1 and t_i/t_1 < p_i/p_1. Then the subtracted sum gives a positive contribution and time bins beyond the shortest are useful. The third case is when t_i/t_1 = p_i/p_1, and this is interesting because it both fixes the time bins t_i and leaves a degeneracy where a family of n_i gives the same optimum. Thus we see that the observing strategy for this narrow focus on simply detections depends on the astrophysics of p_burst(t). To explore the range of behaviors we consider three different forms for p_burst(t): late activity, early activity, and peaked at some intermediate time. These are purely illustrative, but cover the major classes.
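The case distinction above can be sketched numerically by asking which bins beyond t_1 satisfy p_i/p_1 > t_i/t_1. The parameters below (a Gaussian p_burst with illustrative means and width) are our own choices for demonstration.

```python
# Sketch (illustrative parameters) of the appendix cases: which bins
# beyond t_1 are worth observing, i.e. satisfy p_i > t_i p_1 (t_1 = 1 units).
from math import erf, sqrt

def window_prob(t_lo, t_hi, t_p, sigma):
    """Probability of a burst in [t_lo, t_hi] for a Gaussian p_burst."""
    z = lambda t: (t - t_p) / (sqrt(2.0) * sigma)
    return 0.5 * (erf(z(t_hi)) - erf(z(t_lo)))

def useful_bins(t_bins, t_p, sigma):
    """Bins i > 1 where allocating targets increases N_flash."""
    p1 = window_prob(0.0, 1.0, t_p, sigma)
    return [t for t in t_bins[1:]
            if window_prob(t - 1.0, t, t_p, sigma) > t * p1]

bins = list(range(1, 51))
early = useful_bins(bins, t_p=0.0, sigma=5.0)   # early activity: nothing beats t_1
late = useful_bins(bins, t_p=20.0, sigma=5.0)   # peaked activity: mid bins win
print(early, late)
```

For early activity the list is empty, recovering the first case (observe only at t_1); for activity peaked at an intermediate time, bins near the peak satisfy the condition while very late bins do not.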