Experimental sciences have traditionally been a source of challenging mathematical problems with a double edge: as mathematical theories are created, technology moves fast and industry develops. Imaging sciences provide a remarkable example. Typical imaging systems, such as radar, magnetic resonance tomography, ultrasound, echography, and seismic imaging, pose inverse scattering problems with a similar mathematical structure. In all of them, waves generated by a set of emitters interact with a medium under study, and the wave field resulting from the interaction is recorded at a set of receivers. Different imaging systems resort to different types of waves and arrange emitters and receivers according to varied geometries. The nature of the employed waves depends on factors such as the size of the specimens under study, the contrast between components, and the damage caused to the sample during the imaging procedure. Knowing the emitted and recorded waves, we aim to infer the structure of the medium.
Approximating the solutions of inverse scattering problems is a challenging task because such problems are severely ill-posed. Given arbitrary data, the problem under study may not admit a solution, the solution may not be unique, or it may not depend continuously on the given data. This means that small errors in the data may lead to a solution far from the one sought. In view of the relevant technological applications in a host of fields, such as medicine, security, geophysics, or materials testing, to mention a few, there is a need for ever better mathematical techniques for classical imaging problems, as well as for new ideas to tackle new imaging set-ups.
We focus here on recent developments in digital holography, summarizing work done during the past 10 years in collaboration with experimentalists designing holographic microscopes. This collaboration started in 2012 thanks to the interdisciplinary communication environment created at Harvard University’s Kavli Institute seminars. Since then, we have developed analytical and computational tools to handle inverse problems arising in digital holography, in collaboration with researchers from Harvard University and Tesla, Universidad Complutense de Madrid, Universidad Politécnica de Madrid, Universidad de Oviedo, Université de Technologie de Compiègne, and New York University.
Digital in-line holography is a noninvasive tool for accelerated three-dimensional imaging of soft matter and live cells [23, 16, 26, 37] that achieves high spatial (nanometers) and temporal (microseconds) resolution without the need of toxic fluorescent markers or stains. In this context, a hologram is a two-dimensional light interference pattern encoding information about the optical and geometrical properties of a set of objects. By shining a properly chosen light beam back through the hologram, we can recreate the original three-dimensional image. Digital holography, instead, is designed to produce numerical reconstructions of the objects in an automatic way, which amounts to solving an inverse scattering problem computationally. We will show next that optimization schemes with partial differential equation constraints, analysis of the topological derivative of objective functions, regularized Gauss–Newton iterations, and Bayesian inference are effective tools to invert holographic data in the presence of noise while quantifying uncertainty.
2 The forward problem
The forward problem is a mathematical model of how a hologram is generated. Figure 1 illustrates how an in-line hologram is formed, though more complicated set-ups are possible. First, a laser light beam interacts with a sample. Then, interference of the scattered light field with the undiffracted beam generates the hologram on a detector screen past the object. The light wave field obeys the Maxwell equations. Typically, the emitted laser beams are time harmonic, that is, $E_{\rm inc}(\mathbf{x},t)=\mathrm{Re}\big(e^{-\imath\omega t}\,\mathbf{E}_{\rm inc}(\mathbf{x})\big)$. The resulting wave field is also time harmonic, namely, $E(\mathbf{x},t)=\mathrm{Re}\big(e^{-\imath\omega t}\,\mathbf{E}(\mathbf{x})\big)$, with complex amplitude $\mathbf{E}$ governed by the stationary Maxwell equations. The resulting forward problem is

$$\begin{cases} \nabla\times\nabla\times\mathbf{E} - k_i^2\,\mathbf{E} = \mathbf{0} & \text{in } \Omega,\\[2pt] \nabla\times\nabla\times\mathbf{E} - k_e^2\,\mathbf{E} = \mathbf{0} & \text{in } \mathbb{R}^3\setminus\overline{\Omega},\\[2pt] \mathbf{n}\times\mathbf{E}^- = \mathbf{n}\times\mathbf{E}^+, \qquad \dfrac{1}{\mu_i}\,\mathbf{n}\times(\nabla\times\mathbf{E})^- = \dfrac{1}{\mu_e}\,\mathbf{n}\times(\nabla\times\mathbf{E})^+ & \text{on } \partial\Omega,\\[2pt] \displaystyle\lim_{|\mathbf{x}|\to\infty} |\mathbf{x}|\,\Big(\nabla\times(\mathbf{E}-\mathbf{E}_{\rm inc})\times\tfrac{\mathbf{x}}{|\mathbf{x}|} - \imath k_e\,(\mathbf{E}-\mathbf{E}_{\rm inc})\Big) = \mathbf{0}, \end{cases} \tag{1}$$
where $\mu_i$, $\varepsilon_i$ and $k_i=\omega\sqrt{\varepsilon_i\mu_i}$ are the permeabilities, permittivities and wavenumbers of the imaged objects $\Omega$, while $\mu_e$, $\varepsilon_e$ and $k_e=\omega\sqrt{\varepsilon_e\mu_e}$ correspond to the ambient medium and are known. In biomedical applications, $\mu_i=\mu_e=\mu_0$, $\mu_0$ being the vacuum permeability. The upper signs $-$ and $+$ represent limit values from inside and outside $\Omega$, respectively, and $\mathbf{n}$ denotes the outer unit normal vector. Incident waves are polarized in a direction $\mathbf{p}$ orthogonal to the direction of propagation $\mathbf{d}$, that is, $\mathbf{E}_{\rm inc}(\mathbf{x})=E_0\,e^{\imath k_e\,\mathbf{x}\cdot\mathbf{d}}\,\mathbf{p}$ with $\mathbf{p}\cdot\mathbf{d}=0$, where $E_0$ stands for the magnitude of the incident field.
For any smooth region $\Omega$ and any real $k_i>0$, system (1) has a unique solution $\mathbf{E}$ in the appropriate Sobolev space (see ). For collections of spheres and piecewise-constant $k_i$, one can calculate Mie series solutions. Starshaped object parametrizations with piecewise-constant $k_i$ allow for fast spectral solvers [20, 24]. Coupled BEM/FEM formulations [29, 31] are convenient for more general parametrizations, while discrete dipole approximations [36, 38] solve the problem avoiding the use of parametrizations.
In principle, the hologram is obtained by evaluating the solution of the forward problem (1) at detectors placed on the screen: $I_l = |\mathbf{E}(\mathbf{x}_l)|^2$, $l=1,\dots,L$. In practice, the measured holograms are corrupted by noise.
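As an illustration, hologram formation can be mimicked with a drastically simplified scalar model: a plane reference wave interferes at the screen with a spherical wave radiated by a point scatterer, and additive Gaussian noise models the measurement errors. The sketch below rests on these assumptions; the function name, the wavelength and the amplitude `amp` are illustrative choices, not the Maxwell model (1).

```python
import numpy as np

def synthetic_hologram(screen_pts, scatterer, wavelength=0.5, amp=0.1,
                       noise=0.0, rng=None):
    """Scalar sketch of in-line hologram formation.

    A plane reference wave exp(i k z) interferes with a spherical wave
    amp * exp(i k r) / r radiated by a point scatterer; the hologram is
    the intensity of the total field at the screen points, optionally
    corrupted by additive Gaussian noise.
    """
    rng = rng or np.random.default_rng(0)
    k = 2 * np.pi / wavelength
    e_ref = np.exp(1j * k * screen_pts[:, 2])        # undiffracted beam
    r = np.linalg.norm(screen_pts - scatterer, axis=1)
    e_sca = amp * np.exp(1j * k * r) / r             # scattered spherical wave
    hologram = np.abs(e_ref + e_sca) ** 2            # interference pattern
    return hologram + noise * rng.standard_normal(len(hologram))
```

The fringes of the resulting intensity pattern encode the distance and position of the scatterer, which is precisely the information the inverse problem tries to recover.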
3 Deterministic inverse problem
Given a hologram $I^{\rm meas}$ measured at screen points $\mathbf{x}_l$, $l=1,\dots,L$, the inverse holography problem seeks objects $\Omega$ and functions $k_i$ such that

$$ I^{\rm meas}(\mathbf{x}_l) = |\mathbf{E}(\mathbf{x}_l)|^2, \qquad l=1,\dots,L, $$
where $\mathbf{E}$ is the solution of the forward problem (1) with an object $\Omega$ and the wavenumber $k_i$ (see ). Since the measured data are not exact, in practice one seeks shapes $\Omega$ and functions $k_i$ for which the error between the recorded hologram and the synthetic hologram that would be generated by solving (1) for the proposed objects and wavenumbers is as small as possible.
3.1 Constrained optimization
We recast the inverse problem as an optimization problem with a partial differential equation constraint: find $\Omega$ and $k_i$ minimizing the cost functional

$$ J(\Omega,k_i) = \frac{1}{2}\sum_{l=1}^{L} \big|\, I^{\rm meas}(\mathbf{x}_l) - |\mathbf{E}(\mathbf{x}_l)|^2 \,\big|^2, \tag{2} $$

where $\mathbf{E}$ is the solution of (1). Here $\Omega$ and $k_i$ are the design variables and the stationary Maxwell system (1) is the constraint. For exact data, the true objects would be a global minimum at which the functional (2) vanishes. In general, spurious local minima may arise.
3.2 Topological derivative based approximations
A topological study of the shape functional (2) for fixed $k_i$ provides first guesses of the imaged objects without a priori information on them. The topological derivative of a shape functional quantifies its sensitivity to removing and including points in an object. Given a point $\mathbf{x}$ in a region $\mathcal{R}$, we have the expansion

$$ J\big(\mathcal{R}\cup B_\varepsilon(\mathbf{x})\big) = J(\mathcal{R}) + \varepsilon^3\, D_T(\mathbf{x},\mathcal{R}) + o(\varepsilon^3) \tag{3} $$

for any ball $B_\varepsilon(\mathbf{x})$ centered at $\mathbf{x}$ with radius $\varepsilon$. The factor $D_T(\mathbf{x},\mathcal{R})$ is the topological derivative of the functional at $\mathbf{x}$ (see ). If $D_T(\mathbf{x},\mathcal{R})$ is negative, $J(\mathcal{R}\cup B_\varepsilon(\mathbf{x})) < J(\mathcal{R})$ for small $\varepsilon$. We expect the cost functional to decrease by forming objects with points below a large enough negative threshold $-C$, $C>0$ [9, 14, 27]:

$$ \Omega_{\rm ap} = \big\{\, \mathbf{x} : D_T(\mathbf{x},\emptyset) < -C \,\big\}. \tag{4} $$
When $\mathcal{R}=\emptyset$ and $\mu_i=\mu_e$, asymptotic expansions yield the formula

$$ D_T(\mathbf{x},\emptyset) = (k_i^2-k_e^2)\,\mathrm{Re}\Big( \overline{\mathbf{E}_{\rm inc}(\mathbf{x})}\cdot \mathbf{P}(\mathbf{x}) \Big), \qquad \mathbf{P}(\mathbf{x}) = \sum_{l=1}^{L} \overline{G(\mathbf{x},\mathbf{x}_l)}\, \big( I^{\rm meas}(\mathbf{x}_l) - |\mathbf{E}_{\rm inc}(\mathbf{x}_l)|^2 \big)\, \mathbf{E}_{\rm inc}(\mathbf{x}_l), \tag{5} $$

with $G(\mathbf{x},\mathbf{y}) = \dfrac{e^{\imath k_e |\mathbf{x}-\mathbf{y}|}}{4\pi\,|\mathbf{x}-\mathbf{y}|}$ denoting the outgoing Green function of the Helmholtz equation. Once the first guess $\Omega_{\rm ap}$ of the objects is constructed, we fit a parametrized contour to its boundary. Starshaped parametrizations are typical choices. Figure 2 exemplifies the procedure. The method is robust to noise, in the sense that perturbations of the data with random 10 % or 20 % noise, for instance, produce similar results. Notice that the value of $k_i$ enters through a factor that we may scale out in (5), so it is not really needed to localize the objects. Similar results are obtained using the topological energy

$$ E_T(\mathbf{x}) = |\mathbf{E}_{\rm inc}(\mathbf{x})|^2\, |\mathbf{P}(\mathbf{x})|^2, $$

which does not involve $k_i$ at all. No knowledge of $k_i$ is needed to construct a first guess of the objects.
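In a scalar setting, the back-propagation underlying these indicator fields can be sketched in a few lines: the hologram residual at each detector is propagated back into the sample volume with the conjugate Helmholtz Green function, and the grid points where the resulting energy is large form the first guess. The function names and the thresholding fraction `tau` below are illustrative choices, not the exact formulas of the vector Maxwell case.

```python
import numpy as np

def topological_energy(grid_pts, detectors, residual, k):
    """Back-propagate detector residuals with the conjugate outgoing
    Green function G(x, y) = exp(i k |x - y|) / (4 pi |x - y|)."""
    field = np.zeros(len(grid_pts), dtype=complex)
    for x_l, r_l in zip(detectors, residual):
        d = np.linalg.norm(grid_pts - x_l, axis=1)
        G = np.exp(1j * k * d) / (4 * np.pi * d)
        field += np.conj(G) * r_l          # matched-filter back-propagation
    return np.abs(field) ** 2

def first_guess(grid_pts, energy, tau=0.7):
    """Keep the grid points whose energy exceeds a fraction of its maximum."""
    return grid_pts[energy >= tau * energy.max()]
```

Because the back-propagated contributions add up in phase only near the true scatterer positions, the energy peaks there and thresholding isolates a first approximation of the objects.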
3.3 Regularized Gauss–Newton iterations
Fast methods to improve our knowledge of the objects starting from an initial guess are based on the following result. Let us consider two Hilbert spaces $X$, $Y$ and a Fréchet differentiable operator $F: X \to Y$. Assuming that the exact data $y$ are attainable (that is, there is $q^\dagger$ such that $F(q^\dagger)=y$), but only noisy data $y^\delta$ verifying $\|y^\delta - y\| \le \delta$ are accessible, the iteratively regularized Gauss–Newton (IRGN) method constructs a sequence $q_n$ as follows. We linearize the equation $F(q)=y^\delta$ at $q_n$ at each step, approximate the solution $h_n$ of the linearized equation through the minimization problem

$$ \min_{h}\; \big\| F'(q_n)h - (y^\delta - F(q_n)) \big\|^2 + \alpha_n\, \| q_n + h - q_0 \|^2, $$

and set $q_{n+1} = q_n + h_n$. The Tikhonov term has regularizing properties and promotes convergence for specific choices of $\alpha_n$ and $q_0$ (see ). The theory of linear Tikhonov regularization guarantees that

$$ h_n = \big( F'(q_n)^* F'(q_n) + \alpha_n I \big)^{-1} \Big( F'(q_n)^* \big( y^\delta - F(q_n) \big) + \alpha_n (q_0 - q_n) \Big), $$

where $F'(q_n)^*$ denotes the adjoint of the Fréchet derivative $F'(q_n)$. The noise level $\delta$ affects the stopping criterion, the so-called discrepancy principle.
In a holography set-up, the map $F$ is the operator that to each parametrization of objects assigns the synthetic hologram generated by solving the forward problem for those objects. Starshaped parametrizations are a standard choice for simple objects. They describe each object by a few parameters: its center and a radius function represented by a finite combination of spherical harmonics [5, 20]. Given a starshaped parametrization $q_0$ and a recorded hologram with a level of noise $\delta$, the IRGN method first solves the linearized equation at the current parametrization $q_n$ by addressing the regularized least squares problem for the update $h_n$, with the Tikhonov term measured in an adequate Sobolev space of parametrizations, and then sets $q_{n+1} = q_n + h_n$. The initial parametrization $q_0$ represents the first guess of the objects constructed by topological methods. The updated objects correspond to the parametrizations $q_n$. The stopping criterion for the noise level $\delta$ is as follows. If the synthetic hologram $I_n$ calculated numerically for the current approximation of the objects satisfies

$$ \Big( \sum_{l=1}^{L} \big| I_n(\mathbf{x}_l) - I^{\rm meas}(\mathbf{x}_l) \big|^2 \Big)^{1/2} \le \tau\,\delta, $$

we stop the algorithm, $\tau > 1$ being a parameter adjusted to guarantee a reasonable approximation while preventing early stops.
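For a finite-dimensional toy operator, the IRGN iteration with the discrepancy-principle stopping rule can be sketched as follows. The operator `F` in the test, the geometric decay of the regularization parameters, and all parameter values are illustrative choices, not the holography operator.

```python
import numpy as np

def irgn(F, q0, y, delta, tau=1.5, alpha0=1.0, rho=0.6, max_iter=50, eps=1e-6):
    """Iteratively regularized Gauss-Newton sketch in finite dimensions.

    Each step solves the Tikhonov-regularized linearization
    (J^T J + alpha I) h = J^T (y - F(q)) - alpha (q - q0),
    with alpha decreasing geometrically; the iteration stops by the
    discrepancy principle ||F(q) - y|| <= tau * delta.
    """
    q = np.asarray(q0, dtype=float).copy()
    q_ref = q.copy()                       # regularization center q0
    alpha = alpha0
    for _ in range(max_iter):
        r = y - F(q)
        if np.linalg.norm(r) <= tau * delta:
            break                          # discrepancy principle satisfied
        # forward-difference Jacobian of F at q
        J = np.stack([(F(q + eps * e) - F(q)) / eps
                      for e in np.eye(len(q))], axis=1)
        A = J.T @ J + alpha * np.eye(len(q))
        b = J.T @ r - alpha * (q - q_ref)
        q = q + np.linalg.solve(A, b)
        alpha *= rho
    return q
```

The Tikhonov term keeps early steps close to the initial guess, while the decreasing `alpha` lets later steps approach plain Gauss–Newton behavior.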
Figures 3 and 4 illustrate the process. Figure 4 depicts the hologram generated by the configuration with three objects shown in Figure 3 (a). We use the topological derivative (5) to spot a first dominant object at the top and locate an object there, see panel (b). Then we apply the IRGN method, see panels (c) and (d). At step 4 the cost functional, depicted in panel (j), stagnates without fulfilling the stopping criterion. This suggests that more objects should be created. This can be done by hybrid methods, as we explain next.
3.4 Topologically informed IRGN methods
Approaches that use an initial object parametrization as reference have a drawback: the initial guess of the number of objects may be wrong. To overcome it, we have developed hybrid algorithms combining topological derivatives and regularized Gauss–Newton iterations. We fit an initial parametrization to the first guess of the objects constructed by topological methods. Then, we apply the IRGN method and check that the cost (2) decreases. When the cost stagnates without fulfilling the stopping criterion, we reset the approximation $\Omega_{\rm ap}$ equal to the current guess of the objects for the last parametrization obtained and calculate the topological derivative of the cost for $\mathcal{R}=\Omega_{\rm ap}$. This is given by (3) if $\mathbf{x}\notin\overline{\Omega_{\rm ap}}$ and its equivalent

$$ D_T(\mathbf{x},\Omega_{\rm ap}) = \mathrm{Re}\big( \overline{\mathbf{E}(\mathbf{x})} \cdot \mathbf{P}(\mathbf{x}) \big) \tag{6} $$

when $\mathbf{x}\in\Omega_{\rm ap}$, with forward field $\mathbf{E}$ and conjugate adjoint field $\mathbf{P}$ satisfying transmission Maxwell problems with object $\Omega_{\rm ap}$; the sources of the adjoint problem are Dirac masses concentrated at the detectors $\mathbf{x}_l$, $l=1,\dots,L$.
We create a new approximation $\Omega_{\rm new}$ from $\Omega_{\rm ap}$ by removing the points in $\Omega_{\rm ap}$ at which the topological derivative surpasses a positive threshold $C^{+}$ and adding the points outside $\Omega_{\rm ap}$ at which the topological derivative falls below a negative threshold $-C^{-}$, see [9, 6]:

$$ \Omega_{\rm new} = \Big( \Omega_{\rm ap} \setminus \big\{ \mathbf{x}\in\Omega_{\rm ap} : D_T(\mathbf{x},\Omega_{\rm ap}) > C^{+} \big\} \Big) \cup \big\{ \mathbf{x}\notin\overline{\Omega_{\rm ap}} : D_T(\mathbf{x},\Omega_{\rm ap}) < -C^{-} \big\}. \tag{7} $$

The constants $C^{+}, C^{-} > 0$ are selected to ensure a decrease in the cost functional (2) keeping $k_i$ fixed. Once $\Omega_{\rm new}$ is constructed, we fit a parametrization to its contour and restart the IRGN procedure for it. The procedure stops when the changes in the cost and the parametrizations fall below selected thresholds.
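On a discrete grid, the set update above reduces to flipping a boolean membership mask according to the two thresholds; a minimal sketch with illustrative names:

```python
import numpy as np

def update_object_mask(mask, DT, c_plus, c_minus):
    """Hybrid topological update on a boolean grid mask.

    Removes object points where the topological derivative exceeds
    +c_plus and creates material at exterior points where it falls
    below -c_minus. Returns a new mask; the input is not modified.
    """
    new = mask.copy()
    new[mask & (DT > c_plus)] = False      # carve points out of the object
    new[~mask & (DT < -c_minus)] = True    # add new material outside
    return new
```

A contour is then fitted to the connected components of the updated mask before restarting the Gauss–Newton iterations.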
Let us revisit the example studied in Figures 3 and 4. At step 4 of the IRGN method the cost stagnates without fulfilling the stopping criterion. We calculate the topological derivative (6) of the cost for the current approximation of the objects, illustrated in Figure 3 (e). A new region where the topological derivative attains large negative values appears. We create a new object there and update the parametrization, see panel (f). Then we apply the IRGN method again. Since the cost functional still stagnates without fulfilling the stopping criterion, we recalculate the topological derivative (6) for the available object approximation. Panel (g) suggests the creation of a third object. We update the IRGN method using this new configuration, and evolve the resulting object configuration, represented in panel (h), until the stopping criterion is met at panel (i) after 24 steps. Panel (j) illustrates, in a logarithmic scale, stagnation and decrease of the cost as new objects are added to the parametrization using topological information and the updated IRGN method evolves. These simulations assume $k_i$ known and fixed. Once first guesses for $k_i$ are available, we can implement this procedure considering constant values of $k_i$ at each component of the parametrization. Obtaining first guesses for $k_i$ that are reliable enough is a hard task, and the optimization procedure can encounter difficulties. Bayesian approaches provide alternative procedures that can handle these difficulties while quantifying the uncertainty associated with noise and missing information.
4 Bayesian inverse problem
Bayesian formulations consider all unknowns in the inverse problem as random variables. Given a recorded hologram $\mathbf{I}$, we seek a finite-dimensional vector of parameters $\boldsymbol{\beta}$ characterizing the imaged objects. When we assume the presence of several objects, $\boldsymbol{\beta}$ is formed by blocks, one per object. Using Bayes’ formula [22, 33],

$$ p(\boldsymbol{\beta} \,|\, \mathbf{I}) = \frac{ p(\mathbf{I} \,|\, \boldsymbol{\beta})\, p(\boldsymbol{\beta}) }{ p(\mathbf{I}) } \propto p(\mathbf{I} \,|\, \boldsymbol{\beta})\, p(\boldsymbol{\beta}), $$

where $p(\boldsymbol{\beta})$ represents the prior probability of the variables, which incorporates our previous knowledge on them, while $p(\mathbf{I} \,|\, \boldsymbol{\beta})$ is the conditional probability or likelihood of observing $\mathbf{I}$ given $\boldsymbol{\beta}$. The solution of the Bayesian inverse problem is the posterior probability $p(\boldsymbol{\beta} \,|\, \mathbf{I})$ of the parameters given the data. Sampling the posterior distribution, we obtain statistical information on the most likely values of the object parameters with quantified uncertainty.
4.1 Likelihood choice
Assuming additive Gaussian measurement noise, the measured hologram $\mathbf{I}$ and the synthetic hologram $\mathbf{I}(\boldsymbol{\beta})$ obtained for the true object parameters $\boldsymbol{\beta}$ are related by $\mathbf{I} = \mathbf{I}(\boldsymbol{\beta}) + \boldsymbol{\eta}$, where the measurement noise $\boldsymbol{\eta}$ is distributed as a multivariate Gaussian with zero mean and covariance matrix $\boldsymbol{\Sigma}$. A possible choice for the likelihood is

$$ p(\mathbf{I} \,|\, \boldsymbol{\beta}) \propto \exp\Big( -\tfrac{1}{2}\, \big( \mathbf{I} - \mathbf{I}(\boldsymbol{\beta}) \big)^{\top} \boldsymbol{\Sigma}^{-1} \big( \mathbf{I} - \mathbf{I}(\boldsymbol{\beta}) \big) \Big). \tag{8} $$
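Up to an additive constant, the Gaussian log-likelihood is just the quadratic misfit weighted by the inverse noise covariance; a minimal sketch (names are illustrative):

```python
import numpy as np

def log_likelihood(I_meas, I_synth, cov):
    """Unnormalized Gaussian log-likelihood for I_meas = I_synth + noise,
    noise ~ N(0, cov). MCMC only needs the density up to a constant,
    so the normalization term is omitted."""
    r = I_meas - I_synth
    return -0.5 * r @ np.linalg.solve(cov, r)
```

Working with logarithms avoids underflow when the misfit is large, which is the common situation far from the true parameters.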
4.2 Topological priors
A typical choice for the prior distribution is a multivariate Gaussian

$$ p(\boldsymbol{\beta}) \propto \exp\Big( -\tfrac{1}{2}\, (\boldsymbol{\beta} - \boldsymbol{\beta}_0)^{\top} \boldsymbol{\Gamma}^{-1} (\boldsymbol{\beta} - \boldsymbol{\beta}_0) \Big) \tag{9} $$

if $\boldsymbol{\beta}$ is “admissible”, and $p(\boldsymbol{\beta}) = 0$ when $\boldsymbol{\beta}$ is “not admissible”, that is, when it does not satisfy known constraints on the parameter set, see  for details. Here, $\boldsymbol{\Gamma}$ is the covariance matrix and $P$ is the total number of parameters characterizing the objects. The mean $\boldsymbol{\beta}_0$ is typically a set of parameter values characterizing an initial guess of the objects. Sharp priors are obtained fitting parametrizations to first guesses of the objects obtained from the study of topological fields associated to deterministic shape costs, as explained in Section 3.2.
4.3 Markov chain Monte Carlo sampling
The posterior is then proportional to the product of the likelihood (8) and the prior (9) when the parameters are admissible, and vanishes otherwise. Markov chain Monte Carlo (MCMC) methods provide tools to sample unnormalized posteriors. Classical MCMC methods, such as Hamiltonian Monte Carlo or Metropolis–Hastings, construct a chain of states in the parameter space which evolve to be distributed in accordance with the target distribution. After sampling an initial state from the prior distribution (9), the chain advances from one state to the next by means of a transition operator that varies with the method employed. More recent ensemble MCMC samplers [13, 18] draw initial states from the prior distribution (the “walkers” or “particles”) and transition to new states while mixing the previous ones to generate several chains. This approach allows for parallelization and can handle multimodal posteriors.
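A random-walk Metropolis–Hastings sampler for an unnormalized log-posterior fits in a few lines and illustrates the transition operator; the Gaussian proposal and the step size are illustrative choices, and ensemble samplers such as those cited replace the single chain by interacting walkers.

```python
import numpy as np

def metropolis_hastings(log_post, beta0, n_samples, step=0.5, rng=None):
    """Random-walk Metropolis-Hastings for an unnormalized log-posterior.

    Gaussian proposals of size `step`; a proposal is accepted with
    probability min(1, posterior ratio). Returns the whole chain.
    """
    rng = rng or np.random.default_rng(1)
    beta = np.atleast_1d(np.asarray(beta0, dtype=float))
    lp = log_post(beta)
    chain = np.empty((n_samples, beta.size))
    for n in range(n_samples):
        prop = beta + step * rng.standard_normal(beta.size)
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept / reject step
            beta, lp = prop, lp_prop
        chain[n] = beta
    return chain
```

Rejected proposals repeat the current state, so the chain spends more time in high-posterior regions and its histogram approximates the target distribution.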
Figure 5 illustrates the results in a two-dimensional geometry, chosen to reduce the computational cost of the tests. A few million samples were generated, which requires solving an identical number of forward problems. In two-dimensional set-ups we replace the stationary transmission problem for the Maxwell equations by a transmission problem posed for the Helmholtz equation. Assuming the wavenumber is piecewise constant, we resort to fast boundary elements to solve the Helmholtz transmission problems in two dimensions. Once a large enough collection of samples is generated, we extract statistical information describing the imaged object: the most likely shapes, sizes, locations, as well as uncertainty in the predictions. While starshaped two-dimensional objects can be reasonably characterized with 10–20 parameters, three-dimensional objects require 80–90. Full characterization of the posterior probability by MCMC sampling becomes more expensive as the number of parameters and the time required to solve forward problems increase.
4.4 Laplace approximation
The full characterization of the posterior probability is a challenging and costly problem for moderate- and high-dimensional parameter sets. Low-cost approximations of the posterior distribution often rely on finding the maximum a posteriori (MAP) point, that is, the set of parameters that maximizes the posterior probability. Upon taking logarithms, maximizing the posterior probability of the parameter set given the data is equivalent to minimizing the regularized cost functional

$$ \mathcal{J}(\boldsymbol{\beta}) = \tfrac{1}{2}\, \big( \mathbf{I} - \mathbf{I}(\boldsymbol{\beta}) \big)^{\top} \boldsymbol{\Sigma}^{-1} \big( \mathbf{I} - \mathbf{I}(\boldsymbol{\beta}) \big) + \tfrac{1}{2}\, (\boldsymbol{\beta} - \boldsymbol{\beta}_0)^{\top} \boldsymbol{\Gamma}^{-1} (\boldsymbol{\beta} - \boldsymbol{\beta}_0), \tag{10} $$

where $\boldsymbol{\Sigma}$ and $\boldsymbol{\Gamma}$ are the noise and prior covariance matrices and $\boldsymbol{\beta}_0$ is the prior mean.
This is a nonlinear least-squares problem of the form previously considered in deterministic inversion, including regularization terms provided by the prior knowledge. We can solve it efficiently by using an adapted Levenberg–Marquardt–Fletcher iterative scheme. Starting from $\boldsymbol{\beta}^{(0)} = \boldsymbol{\beta}_0$, we set $\boldsymbol{\beta}^{(n+1)} = \boldsymbol{\beta}^{(n)} + \boldsymbol{\xi}^{(n)}$, where $\boldsymbol{\xi}^{(n)}$ is the solution of

$$ \big( \mathbf{H}^{(n)} + \nu^{(n)}\, \mathrm{Id} \big)\, \boldsymbol{\xi}^{(n)} = - \mathbf{g}^{(n)}. \tag{11} $$

Here, $\mathbf{H}^{(n)}$ is the Gauss–Newton approximation to the Hessian of the functional (10) and $\mathbf{g}^{(n)}$ is its gradient, while $\gamma^{(n)}$ is a scaling factor for the regularizing term in (10) that balances the different orders of magnitude of the two terms defining the cost in the first iterations, and becomes equal to 1 at a certain point. At each step, the adjustable parameter $\nu^{(n)}$ increases until the cost decreases, and decreases otherwise, making the iteration closer to Gauss–Newton or gradient schemes as required.
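The damping logic can be sketched for a generic least-squares residual; `nu` plays the role of the adjustable parameter, the scaling factor for the regularizing term is omitted for brevity, and the residual used in the tests is a classical toy problem, not the holography misfit.

```python
import numpy as np

def levenberg_marquardt(res, jac, q0, nu0=1.0, max_iter=100, tol=1e-10):
    """Levenberg-Marquardt sketch for min 0.5 * ||res(q)||^2.

    Solves (J^T J + nu I) h = -J^T res; nu grows until a step decreases
    the cost and shrinks after each accepted step, interpolating between
    Gauss-Newton (small nu) and small gradient steps (large nu).
    """
    q = np.asarray(q0, dtype=float).copy()
    nu = nu0
    cost = 0.5 * np.sum(res(q) ** 2)
    for _ in range(max_iter):
        if cost < tol:
            break
        r, J = res(q), jac(q)
        g = J.T @ r
        for _ in range(30):                     # inner damping loop
            h = np.linalg.solve(J.T @ J + nu * np.eye(len(q)), -g)
            new_cost = 0.5 * np.sum(res(q + h) ** 2)
            if new_cost < cost:                 # accept the step
                q, cost, nu = q + h, new_cost, nu / 2
                break
            nu *= 4                             # reject: increase damping
    return q
```

Because steps are only accepted when the cost decreases, the scheme is monotone, which is what makes it robust in the early iterations far from the minimum.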
Linearization about the resulting MAP point $\boldsymbol{\beta}_{\rm MAP}$ (the so-called Laplace approximation) provides an approximation of the posterior distribution by a Gaussian with mean $\boldsymbol{\beta}_{\rm MAP}$ and posterior covariance given by the inverse of the Gauss–Newton Hessian of (10) at $\boldsymbol{\beta}_{\rm MAP}$. Sampling this Gaussian, we extract statistical information representing the dominant mode at a much lower computational cost, see Figure 6. Reaching $\boldsymbol{\beta}_{\rm MAP}$ takes about 20 steps of scheme (11). The whole process, sampling included, is finished in a few minutes, instead of a few days.
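At the MAP point, the Laplace approximation needs only the Jacobian of the synthetic hologram and the two covariance matrices; a minimal sketch with illustrative names:

```python
import numpy as np

def laplace_approximation(J_map, noise_cov, prior_cov, beta_map,
                          n_samples=1000, rng=None):
    """Gaussian posterior approximation at the MAP point.

    cov_post = (J^T Sigma^{-1} J + Gamma^{-1})^{-1}, the inverse
    Gauss-Newton Hessian of the regularized cost; samples are drawn
    from N(beta_map, cov_post).
    """
    H = J_map.T @ np.linalg.solve(noise_cov, J_map) + np.linalg.inv(prior_cov)
    cov_post = np.linalg.inv(H)
    rng = rng or np.random.default_rng(2)
    samples = rng.multivariate_normal(beta_map, cov_post, size=n_samples)
    return cov_post, samples
```

Each Gaussian sample is cheap to draw, which is why this approximation replaces days of MCMC sampling by minutes when a single mode dominates the posterior.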
We have considered $k_i$ fixed and known in these tests. In case it is constant and unknown, it becomes an additional parameter included in $\boldsymbol{\beta}$. In the end, we obtain additional histograms reflecting the uncertainty about the value of $k_i$ with highest probability.
Digital holography poses challenging inverse problems which provide an opportunity to develop and test a variety of analytical and computational tools. First guesses of the imaged objects are obtained by calculating the topological derivative of misfit functionals comparing the true hologram with the synthetic holograms that would be generated for different object configurations according to the selected forward model. Such guesses are robust to noise in the data. To reduce dimensionality, one can characterize the imaged objects by means of starshaped parametrizations. In a deterministic framework, we have shown that hybrid schemes combining iteratively regularized Gauss–Newton methods with topological derivative initializations and updates lead to good reconstructions of simple object configurations in a few steps, using stopping criteria that take into account the expected level of noise in the data. We are able to quantify uncertainty in such predictions by resorting to Bayesian formulations with topological priors. In two dimensions, Markov chain Monte Carlo methods provide a complete characterization of the posterior probability of the observed hologram being generated by a few starshaped objects. Three-dimensional tests are affordable for very simple shapes, such as a sphere or a cylinder [11].