Optimality of the maximum likelihood estimator in astrometry

Sebastian Espinosa; Jorge F. Silva; Rene A. Mendez; Rodrigo Lobos; Marcos Orchard

doi:10.1051/0004-6361/201732537


Issue: A&A Volume 616, August 2018
Article Number: A95
Number of page(s): 21
Section: Celestial mechanics and astrometry
DOI: https://doi.org/10.1051/0004-6361/201732537
Published online: 31 August 2018

A&A 616, A95 (2018)


Sebastian Espinosa¹, Jorge F. Silva¹, Rene A. Mendez², Rodrigo Lobos³ and Marcos Orchard¹

¹ Information and Decision Systems Group, Department of Electrical Engineering, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Beauchef 850, Santiago, Chile
e-mail: sebastian.espinosa@ing.uchile.cl, josilva@ing.uchile.cl
² Departamento de Astronomía, Facultad de Ciencias Físicas y Matemáticas, Universidad de Chile, Casilla 36-D, Santiago, Chile
e-mail: rmendez@uchile.cl
³ Department of Electrical Engineering, University of Southern California, 90007 CA, USA

Received: 22 December 2017
Accepted: 4 May 2018

Abstract

Context. Astrometry relies on the precise measurement of the positions and motions of celestial objects. Driven by the ever-increasing accuracy of astrometric measurements, it is important to critically assess the maximum precision that could be achieved with these observations.

Aims. The problem of astrometry is revisited from the perspective of analyzing the attainability of well-known performance limits (the Cramér–Rao bound) for the estimation of the relative position of light-emitting (usually point-like) sources on a charge-coupled device (CCD)-like detector using commonly adopted estimators such as the weighted least squares and the maximum likelihood.

Methods. Novel technical results are presented to determine the performance of an estimator that corresponds to the solution of an optimization problem in the context of astrometry. Using these results we are able to place stringent bounds on the bias and the variance of the estimators in closed form as a function of the data. We confirm these results through comparisons to numerical simulations under a broad range of realistic observing conditions.

Results. The maximum likelihood and the weighted least squares estimators are analyzed. We confirm the sub-optimality of the weighted least squares scheme from medium to high signal-to-noise found in an earlier study for the (unweighted) least squares method. We find that the maximum likelihood estimator achieves optimal performance limits across a wide range of relevant observational conditions. Furthermore, from our results, we provide concrete insights for adopting an adaptive weighted least squares estimator that can be regarded as a computationally efficient alternative to the optimal maximum likelihood solution.

Conclusions. We provide, for the first time, closed-form analytical expressions that bound the bias and the variance of the weighted least squares and maximum likelihood implicit estimators for astrometry using a Poisson-driven detector. These expressions can be used to formally assess the precision attainable by these estimators in comparison with the minimum variance bound.

Key words: astrometry / methods: statistical / methods: analytical

© ESO 2018

1. Introduction

Astrometry, which deals with the accurate and precise measurement of the positions and motions of celestial objects, is the oldest branch of observational astronomy, dating back at least to Hipparchus of Nicaea in 190 BC. Since, from its very beginnings, this branch of astronomy has required measurements over time to fulfil its goals, it could be considered the precursor of the currently fashionable “time-domain astronomy”, preceding it by at least 20 centuries. In recent years, astrometry has experienced a “coming of age” motivated by the rapid increase in positional precision allowed by the use of all-digital techniques and space observatories¹.

A number of techniques have been proposed to estimate the location and flux of celestial sources as recorded on digital detectors, such as charge-coupled devices (CCD). In this context, estimators based on the use of a least squares (LS) error principle have been widely adopted (King 1983; Stetson 1987; Alard & Lupton 1998). The use of this type of decision rule has been traditionally justified through heuristic reasons. First, LS methods are conceptually straightforward to formulate based on the observation model of these problems. Second, they offer computationally efficient implementations and have shown reasonable performance (Lee & Van Altena 1983; Stone 1989; Vakili & Hogg 2016). Finally, the LS approach was the classical method used when the observations were obtained with analog devices (Van Altena & Auer 1975; Auer & Van Altena 1978), which are well characterized by a Gaussian noise model for the observations. In the Gaussian case the LS is equivalent to the maximum likelihood (ML) solution (Chun-Lin 1993; Gray & Davisson 2004; Cover & Thomas 2012), and, consequently, the LS method was taken from the analog to the digital observational (Poisson noise model) setting.

Considering astrometry as an inference problem (of, usually, point sources), the astrometric community has been interested for a long time in understanding the fundamental performance limits (or information bounds) of this task (Lindegren 2010 and references therein). It is well understood by the community that the characterization of this precision limit offers the possibility of understanding the complexity of the task and how it depends on key attributes of the problem, like the quality of the observational site, the performance of the instrument (CCD), and the details of the experimental conditions (Mendez et al. 2013, 2014). On the other hand, it provides meaningful benchmarks to define the optimality of practical estimators in the process of comparing their performance with the bounds (Lobos et al. 2015).

Concerning the characterization and analysis of fundamental performance bounds, we can mention some works on the use of the parametric Cramér–Rao (CR) bound by Lindegren (1978), Jakobsen et al. (1992), Zaccheo et al. (1995), Adorf (1996), and Bastian (2004). The CR bound is a minimum variance bound (MVB) for the family of unbiased estimators (Rao 1945; Cramér 1946). In astrometry and joint photometry and astrometry, Mendez et al. (2013, 2014) have recently studied the structure of this bound, and have analyzed its dependency with respect to important observational parameters under realistic (ground-based) astronomical observing conditions. In this context, closed-form expressions for the Cramér–Rao bound were derived in a number of important settings (high pixel resolution and low and high signal-to-noise (S/N) regimes), and their trends were explored across different CCD pixel resolutions and the position of the object in the CCD array. As an interesting outcome of those studies, the analysis of the CR bound has allowed us to predict the optimal pixel resolution of the array, as well as providing a formal justification for some heuristic techniques commonly used to improve performance in astrometry, like dithering for undersampled images (Mendez et al. 2013, Sect. 3.3). Recently, an application of the CR bound to moving sources has been done by Bouquillon et al. (2017), indicating excellent agreement between our theoretical predictions, simulations, and actual ground-based observations of the Gaia satellite, in the context of the Gaia Ground Based Optical Tracking (GBOT) program (Altmann et al. 2014). The use of the CR bound on other applications is also of interest, for example, in assessing the performance of star trackers to guide satellites with demanding pointing constraints (Zhang et al. 2016), or to meaningfully compare positional differences from different catalogs (for example involving the Sloan Digital Sky Survey (SDSS) and Gaia, see Lemon et al. 2017). Finally, a formulation of the (non-parametric) Bayesian CR bound in astrometry, using the so-called “Van-Trees inequality” (Van Trees 2004), has been presented by our group in Echeverria et al. (2016). This approach is particularly well suited for objects at the edge of detectability, and where some prior information is available, and has been proposed for the analysis of Gaia data for faint sources, or for those with a poor observational history (Michalik et al. 2015; Michalik & Lindegren 2016).

From the perspective of astrometric estimators, Lobos et al. (2015) have studied in detail the performance of the widely adopted LS estimator. In particular, extending the result in So et al. (2013), Lobos and collaborators derived lower and upper bounds for the mean square error (MSE) of the LS estimator. Using these bounds, the optimality of the LS estimator was analyzed, demonstrating that for high S/N there is a considerable gap between the CR bound and the performance of the LS estimator (indicating a lack of optimality of this estimator). This work showed that for the very low S/N observational regime (weak astronomical sources), the LS estimator is near optimal, as its performance closely follows the CR bound. The limitations of the LS method in the medium to high S/N regime established in that work raise the question of whether alternative estimators could achieve the CR bound in these regimes, which is the main focus of this paper, as outlined below.

2. Contribution and organization

In this work we study the ML estimator in astrometry, motivated by its well-known optimality properties in a classical parametric estimation setting with independent and identically distributed measurements (i.i.d.; Kendall et al. 1999). We know that in the i.i.d. case this estimator is efficient with respect to the CR limit (Kay 1993), but it is important to emphasize that the observational setting of astrometry deviates from the classical i.i.d. case and, consequently, the analysis of its optimality is still an open problem. In particular, we face the technical challenge of evaluating its performance, a problem that, to the best of our knowledge, has not been addressed by the astrometric community. Concerning the independent but not identically distributed case, Bradley & Gart (1962) and Hoadley (1971) give conditions under which ML estimators are consistent² and asymptotically normal³. Those conditions, however, are technically difficult to prove in this context.

The main challenge here is the fact that, as in the case of the LS estimator (Lobos et al. 2015), the ML estimator is the solution of an optimization problem with a nonconvex cost function of the data. This implies that it is not possible to directly compute the performance of the method. To address this technical issue, we extend the approach proposed by Fessler (1996) to approximate the variance and the mean of an implicit estimator solution of a generic optimization problem of the data through the use of a Taylor approximation around the mean measurement (see Theorem 1 below). Our extension considers high order approximations of the function that allow us not only to estimate the performance of the ML estimator through an explicit nominal value, but also to provide a confidence interval around it. With this result we revisit the more general weighted least squares (WLS) and ML methods, providing specific upper and lower bounds for both methods (see Theorems 2 and 3). The main findings from our analysis of the bounds are two-fold: first, we show that the WLS exhibits a sub-optimality similar to that of the LS method for medium to high S/N regimes discovered by Lobos et al. (2015) and, second, that the ML estimator achieves the CR limit for medium to high S/N and, consequently, it is optimal in those regimes. This last result is remarkable because, in conjunction with the result presented in Lobos et al. (2015), we are able to identify estimators that achieve the fundamental performance limits of astrometry in all the S/N regimes for the problem.

The paper is organized as follows: Sect. 3 introduces the background, preliminaries, and notation of the problem. Section 4 presents the main methodological contribution of this work. Section 5 presents the application to astrometry considering the WLS and the ML schemes. Finally, a numerical analysis of these performance bounds is presented in Sect. 6 and the final remarks and conclusion are given in Sect. 7.

3. Preliminaries and background

We begin by introducing the problem of astrometry. For simplicity, we focus on the one-dimensional (1D) scenario of a linear array detector, as it captures the key conceptual elements of the problem⁴.

3.1. Relative astrometry as a parameter estimation problem

The main problem at hand is the inference of the relative position (in the array) of a point source. This source is modeled by two scalar quantities: the position of the object x_c ∈ ℝ in the array⁵, and its intensity (or brightness, or flux), which we denote by F̃ ∈ ℝ⁺. These two parameters induce a probability distribution $\mu_{x_\mathrm{c},\tilde{F}}$ over an observation space that we denote by 𝕏. Formally, a point source represented by the pair (x_c, F̃) creates a nominal intensity profile in a photon integrating device (PID), typically a CCD, which can be expressed by $\begin{aligned} \tilde{F}_{x_\mathrm{c}, \tilde{F}}(x)=\tilde{F} \cdot \phi (x-x_\mathrm{c},\sigma ), \end{aligned}$ (1)

where ϕ(x − x_c, σ) denotes the one-dimensional normalized point spread function (PSF) and where σ is a generic parameter that determines the width (or spread) of the light distribution on the detector (typically a function of wavelength and the quality of the observing site, see Sect. 6; Mendez et al. 2013, 2014).

The profile in Eq. (1) is not measured directly, but it is observed through three sources of perturbations. First, an additive background noise that captures the photon emissions of the open (diffuse) sky, and the noise of the instrument itself (the read-out noise and dark current; Janesick 2001, 2007; Howell 2006; McLean 2008), modeled by B̃_i in Eq. (2). Second, an intrinsic uncertainty between the aggregated intensity (the nominal object brightness plus the background) and actual measurements, which is modeled by independent random variables that follow a Poisson probability law. Finally, we need to account for the spatial quantization process associated with the pixel resolution of the PID as specified in Eqs. (2) and (3). Modeling these effects, we have a countable collection of independent and non-identically distributed random variables (observations or counts) {I_i : i ∈ ℤ}, where I_i ∼ Poisson(λ_i(x_c, F̃)), driven by the expected intensity at each pixel element i, given by $\begin{aligned} \lambda_i(x_\mathrm{c}, \tilde{F}) \equiv \mathbb{E}\{I_i\} = \underbrace{\tilde{F} \cdot g_i(x_\mathrm{c})}_{\equiv \tilde{F}_i(x_\mathrm{c},\tilde{F})} + \tilde{B}_i, \quad \forall i \in \mathbb{Z} \end{aligned}$ (2)

and $\begin{aligned} g_i(x_\mathrm{c}) \equiv \int_{x_i-\Delta x/2}^{x_i+\Delta x/2} \phi(x-x_\mathrm{c},\sigma)\,\mathrm{d}x, \quad \forall i \in \mathbb{Z}, \end{aligned}$ (3)

where 𝔼{·} is the expectation value of the argument and {x_i : i ∈ ℤ} denotes the standard uniform quantization of the real line array with resolution Δx > 0, that is, x_{i+1} − x_i = Δx for all i ∈ ℤ. In practice, the PID has a finite collection of measurement elements (or pixels) I₁,…,I_n, so a basic assumption here is that we have a good coverage of the object of interest, in the sense that for a given position x_c $\begin{aligned} \sum_{i=1}^n g_i(x_\mathrm{c}) \approx \sum_{i\in\mathbb{Z}} g_i(x_\mathrm{c}) = \int_{-\infty}^{\infty} \phi(x-x_\mathrm{c},\sigma)\,\mathrm{d}x = 1. \end{aligned}$ (4)
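As a concrete illustration of Eqs. (2)–(4), the following minimal sketch assumes a Gaussian PSF, which is an assumption made here for illustration only (the text above leaves ϕ generic). Under that assumption the pixel integral g_i has a closed form in terms of the error function. The function names and all numerical values are hypothetical.

```python
import math

def pixel_response(x_i, x_c, sigma, dx):
    """g_i(x_c) of Eq. (3) for an assumed Gaussian PSF of width sigma."""
    s = sigma * math.sqrt(2.0)
    upper = math.erf((x_i + dx / 2.0 - x_c) / s)
    lower = math.erf((x_i - dx / 2.0 - x_c) / s)
    return 0.5 * (upper - lower)

def expected_counts(pixels, x_c, flux, background, sigma, dx):
    """lambda_i of Eq. (2): source term plus a constant background per pixel."""
    return [flux * pixel_response(x, x_c, sigma, dx) + background for x in pixels]

# Hypothetical setup: 41 unit-width pixels, source slightly off-centre.
dx = 1.0
pixels = [i * dx for i in range(41)]
x_c, sigma, flux, background = 20.3, 2.0, 5000.0, 10.0

g = [pixel_response(x, x_c, sigma, dx) for x in pixels]
lam = expected_counts(pixels, x_c, flux, background, sigma, dx)
coverage = sum(g)  # Eq. (4): close to 1 when the array covers the source
print(f"sum of g_i = {coverage:.6f}")
```

With the array extending about ten PSF widths on either side of the source, the coverage condition of Eq. (4) holds to high accuracy; moving x_c toward the edge of the array would break it.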

At the end, the likelihood (probability) of the joint observations Iⁿ = (I₁,…,I_n) (with values in ℕⁿ) given the source parameters (x_c, F̃) is given by $\begin{aligned} L(I^n; x_\mathrm{c},\tilde{F}) = f_{\lambda_1(x_\mathrm{c},\tilde{F})}(I_1) \cdot f_{\lambda_2(x_\mathrm{c},\tilde{F})}(I_2) \cdots f_{\lambda_n(x_\mathrm{c},\tilde{F})}(I_n), \quad \forall I^n \in \mathbb{N}^n, \end{aligned}$ (5)

where $f_{\lambda}(x)=\frac{e^{-\lambda}\cdot\lambda^x}{x!}$ denotes the probability mass function (PMF) of the Poisson law (Gray & Davisson 2004).

Finally, if F̃ is assumed to be known⁶, the astrometric estimation is the task of defining a decision rule τ_n(·) : ℕⁿ → Θ, with Θ = ℝ being the parameter space, where given an observation Iⁿ the estimated position is given by x̂_c(Iⁿ) = τ_n(Iⁿ).
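To make the notion of a decision rule concrete, the sketch below evaluates the log of the likelihood in Eq. (5) and picks the position that maximizes it over a grid. It is only an illustration under stated assumptions: the PSF is taken to be Gaussian, the brute-force grid search is a stand-in for whatever optimizer one would use in practice, and the "data" are noise-free expected counts rounded to integers rather than real observations. All names and values are hypothetical.

```python
import math

def pixel_response(x_i, x_c, sigma, dx):
    """g_i(x_c) for an assumed Gaussian PSF."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((x_i + dx / 2 - x_c) / s)
                  - math.erf((x_i - dx / 2 - x_c) / s))

def log_likelihood(counts, pixels, x_c, flux, background, sigma, dx):
    """ln L of Eq. (5) for independent Poisson counts."""
    total = 0.0
    for n_i, x_i in zip(counts, pixels):
        lam = flux * pixel_response(x_i, x_c, sigma, dx) + background
        total += n_i * math.log(lam) - lam - math.lgamma(n_i + 1)
    return total

def grid_ml_estimate(counts, pixels, flux, background, sigma, dx, lo, hi, step):
    """A decision rule tau_n: brute-force maximiser of ln L over a grid."""
    n_steps = int(round((hi - lo) / step))
    candidates = [lo + k * step for k in range(n_steps + 1)]
    return max(candidates, key=lambda a: log_likelihood(
        counts, pixels, a, flux, background, sigma, dx))

dx, sigma, flux, background = 1.0, 2.0, 5000.0, 10.0
pixels = [i * dx for i in range(41)]
x_true = 20.3
# Noise-free stand-in for real data: expected counts rounded to integers.
counts = [round(flux * pixel_response(x, x_true, sigma, dx) + background)
          for x in pixels]
x_hat = grid_ml_estimate(counts, pixels, flux, background, sigma, dx,
                         lo=18.0, hi=22.0, step=0.01)
print(f"true position {x_true}, grid estimate {x_hat:.2f}")
```

The estimate is an implicit function of the counts: nothing in the code writes x̂_c as a formula in I₁,…,I_n, which is the difficulty the later sections address.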

3.2. The Cramér–Rao bound

In astrometry the Cramér–Rao bound has been used to bound the variance (estimation error) of any unbiased estimator (Mendez et al. 2013, 2014). In general, let Iⁿ be a collection of independent observations that follow a parametric PMF $f_{\bar{\theta}}$ defined on ℕ. The parameters to be estimated from Iⁿ will be denoted in general by the vector $\bar{\theta}=(\theta_1,\theta_2,\ldots,\theta_m) \in \Theta = \mathbb{R}^m$. Let τ_n(Iⁿ) : ℕⁿ → Θ be an unbiased estimator⁷ of $\bar{\theta}$, and $L(I^n;\bar{\theta})= f_{\bar{\theta}}(I_1)\cdot f_{\bar{\theta}}(I_2)\cdots f_{\bar{\theta}}(I_n)$ be the likelihood of the observation $I^n \in \mathbb{N}^n$ given $\bar{\theta}\in\Theta$. Then, the Cramér–Rao bound (Rao 1945; Cramér 1946) establishes that if $\begin{aligned} \mathbb{E}_{I^n \sim f^n_{\bar{\theta}}}\left\{ \frac{\partial \ln L(I^n; \bar{\theta})}{\partial \theta_i} \right\} = 0, \quad \forall i \in \{1,\ldots,m\}, \end{aligned}$ (6)

then the variance (denoted by Var) satisfies $\begin{aligned} \mathrm{Var}(\tau_n(I^n)_i) \ge [\mathcal{I}_{\bar{\theta}}(n)^{-1}]_{i,i}, \end{aligned}$ (7)

where $\mathcal{I}_{\bar{\theta}}(n)$ is the Fisher information matrix given by, ∀ i, j ∈ {1,…,m}, $\begin{aligned} [\mathcal{I}_{\bar{\theta}}(n)]_{i,j} = \mathbb{E}_{I^n \sim f^n_{\bar{\theta}}} \left\{ \frac{\partial \ln L(I^n; \bar{\theta})}{\partial \theta_i} \cdot \frac{\partial \ln L(I^n; \bar{\theta})}{\partial \theta_j} \right\}. \end{aligned}$ (8)

In particular, for the scalar case (m = 1), we have that for all θ ∈ Θ, $\begin{aligned} \min_{\tau_n(\cdot)\in\mathcal{T}^n} \mathrm{Var}(\tau_n(I^n)) \ge \mathcal{I}_\theta(n)^{-1} = \mathbb{E}_{I^n \sim f^n_{\theta}} \left\{ \left[ \left(\frac{\mathrm{d} \ln L(I^n; \theta)}{\mathrm{d}\theta} \right)^2 \right]\right\}^{-1}, \end{aligned}$ (9)

where $\mathcal{T}^n$ is the collection of all unbiased estimators and $I^n \sim f^n_\theta$. For astrometry, Mendez et al. (2013, 2014) have characterized and analyzed the Cramér–Rao bound, leading to

Proposition 1

(Mendez et al. 2014, Sect. 2.4) If F̃ ∈ ℝ⁺ is fixed and known, and we want to estimate x_c from $I^n \sim f_{(x_\mathrm{c},\tilde{F})}=L(I^n; x_\mathrm{c},\tilde{F})$ in Eq. (5), then the Fisher information is given by $\begin{aligned} \mathcal{I}_{x_\mathrm{c}}(n) = \sum_{i=1}^n \frac{\left( \tilde{F}\frac{\mathrm{d} g_i(x_\mathrm{c})}{\mathrm{d} x_\mathrm{c}} \right)^2}{\tilde{F} g_i(x_\mathrm{c}) + \tilde{B}_i}, \end{aligned}$ (10)

which from Eq. (9) induces a MVB for the astrometric estimation problem, and where $\sigma_\mathrm{CR}^2(n) \equiv \mathcal{I}_{x_\mathrm{c}}(n)^{-1}$ denotes the (astrometric) CR bound.
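Equation (10) is straightforward to evaluate numerically once a PSF is chosen. The sketch below does so for an assumed Gaussian PSF and a constant background per pixel. Both choices, along with the function names and numerical values, are illustrative assumptions rather than the setup used in the paper.

```python
import math

def gauss(u, sigma):
    """Assumed Gaussian PSF evaluated at offset u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def g(x_i, x_c, sigma, dx):
    """Pixel-integrated PSF g_i(x_c) of Eq. (3)."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((x_i + dx / 2 - x_c) / s)
                  - math.erf((x_i - dx / 2 - x_c) / s))

def dg_dxc(x_i, x_c, sigma, dx):
    """d g_i / d x_c: PSF at the lower pixel edge minus PSF at the upper edge."""
    return gauss(x_i - dx / 2 - x_c, sigma) - gauss(x_i + dx / 2 - x_c, sigma)

def fisher_information(pixels, x_c, flux, background, sigma, dx):
    """Eq. (10): sum over pixels of (F g_i')^2 / (F g_i + B_i)."""
    return sum((flux * dg_dxc(x, x_c, sigma, dx)) ** 2
               / (flux * g(x, x_c, sigma, dx) + background) for x in pixels)

dx, sigma, background, x_c = 1.0, 2.0, 10.0, 20.3
pixels = [i * dx for i in range(41)]
cr_sigma = {}
for flux in (1e3, 1e4, 1e5, 1e6):
    info = fisher_information(pixels, x_c, flux, background, sigma, dx)
    cr_sigma[flux] = math.sqrt(1.0 / info)  # sigma_CR in pixel units
    print(f"F = {flux:.0e}: sigma_CR = {cr_sigma[flux]:.5f} pixel")
```

In this setup the bound tightens monotonically as the flux grows, since each term of Eq. (10) increases with F̃ for a fixed background.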

3.3. Achievability and performance of the LS estimator

Concerning the achievability of the CR bound with a practical estimator, Lobos et al. (2015, Proposition 2) have demonstrated that this bound cannot be attained, meaning that for any unbiased estimator τ_n(·) we have that $\begin{aligned} \mathrm{Var}(\tau_n(I^n)) > \sigma_\mathrm{CR}^2, \end{aligned}$ (11)

where Iⁿ follows the Poisson PMF $f_{(x_\mathrm{c},\tilde{F})}$ in Eq. (5).

This finding should be interpreted with caution, considering its purely theoretical meaning. This is because Eq. (11) does not exclude the possibility that the CR bound could be approximated arbitrarily closely by a practical estimation scheme. Motivated by this refined conjecture, Lobos et al. (2015) proposed to study the performance of the widely adopted LS estimator⁸ with the goal of deriving operational upper and lower bounds on its performance that could be used to determine how far this scheme could depart from the CR limit. Then, from this result, it was possible to evaluate the goodness of the LS estimator for concrete observational regimes. To bound the performance of the LS estimator, the challenge was that τ_LS(Iⁿ) is an implicit function of the data (where no closed-form expression is available) and, consequently, Lobos et al. (2015) derived a result to bound the estimation error and the variance of τ_LS(Iⁿ). We can briefly summarize the main result presented in Lobos et al. (2015, Theorem 1) by saying that under certain mild sufficient conditions (that were shown to be realistic for astrometry), there is a constant δ > 0 (that depends on the observational regime, in particular the S/N) and a nominal variance $\sigma^2_\mathrm{LS}$, which is determined in closed form in the result, from which it is possible to bound Var(τ_LS(Iⁿ)) by the simple expression $\begin{aligned} \mathrm{Var}(\tau_\mathrm{LS}(I^n)) \in \left( \frac{\sigma^2_\mathrm{LS}(n)}{(1+\delta)^2}, \frac{\sigma^2_\mathrm{LS}(n)}{(1-\delta)^2} \right), \end{aligned}$ (12)

where $\begin{aligned} \sigma^2_\mathrm{LS}(n) = \frac{\sum_{i=1}^n (\tilde{F}g_i(x_\mathrm{c})+\tilde{B}_i) \cdot (g_i'(x_\mathrm{c}))^2}{\left(\tilde{F}\sum_{i=1}^n (g_i'(x_\mathrm{c}))^2\right)^2}. \end{aligned}$ (13)

We note that when δ is small, $\sigma^2_\mathrm{LS}(n)$ tightly determines the performance of the LS estimator, and its comparison with $\sigma^2_\mathrm{CR}$ can be used to evaluate the goodness of the LS estimator for astrometry. Based on a careful comparison, it was shown in Lobos et al. (2015, Sect. 4) that, in general, $\sigma^2_\mathrm{LS}(n)$ is close to $\sigma^2_\mathrm{CR}(n)$ for the small S/N regime of the problem. However, for moderate to high S/N regimes, the gap between $\sigma^2_\mathrm{LS}(n)$ and $\sigma^2_\mathrm{CR}(n)$ becomes quite significant⁹.
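The comparison between Eqs. (13) and (10) can be reproduced qualitatively with a few lines of code. The sketch below computes the ratio σ²_LS(n)/σ²_CR(n) for a background-dominated (faint) source and a source-dominated (bright) one. It assumes a Gaussian PSF and a constant background, and the two flux and background pairs are hypothetical values chosen only to contrast the regimes, not the configurations analyzed in Lobos et al. (2015).

```python
import math

def gauss(u, sigma):
    """Assumed Gaussian PSF evaluated at offset u."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def g(x_i, x_c, sigma, dx):
    """Pixel-integrated PSF g_i(x_c) of Eq. (3)."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((x_i + dx / 2 - x_c) / s)
                  - math.erf((x_i - dx / 2 - x_c) / s))

def dg(x_i, x_c, sigma, dx):
    """Derivative of g_i with respect to x_c."""
    return gauss(x_i - dx / 2 - x_c, sigma) - gauss(x_i + dx / 2 - x_c, sigma)

def variance_ratio(pixels, x_c, flux, background, sigma, dx):
    """sigma_LS^2 / sigma_CR^2 built from Eq. (13) and Eq. (10)."""
    gi = [g(x, x_c, sigma, dx) for x in pixels]
    dgi = [dg(x, x_c, sigma, dx) for x in pixels]
    lam = [flux * a + background for a in gi]
    var_ls = (sum(l * d * d for l, d in zip(lam, dgi))
              / (flux * sum(d * d for d in dgi)) ** 2)
    fisher = sum((flux * d) ** 2 / l for l, d in zip(lam, dgi))
    return var_ls * fisher  # equals var_ls / (1 / fisher)

dx, sigma, x_c = 1.0, 2.0, 20.3
pixels = [i * dx for i in range(41)]
ratio_faint = variance_ratio(pixels, x_c, flux=100.0, background=1000.0,
                             sigma=sigma, dx=dx)
ratio_bright = variance_ratio(pixels, x_c, flux=1e6, background=10.0,
                              sigma=sigma, dx=dx)
print(f"faint source:  sigma_LS^2 / sigma_CR^2 = {ratio_faint:.4f}")
print(f"bright source: sigma_LS^2 / sigma_CR^2 = {ratio_bright:.4f}")
```

By the Cauchy–Schwarz inequality the ratio can never fall below one. It stays close to one when the expected counts λ_i are nearly constant across pixels (background-dominated case) and grows when they vary strongly (source-dominated case).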

These unfavorable findings for the LS method have motivated us to study alternative schemes that could come closer to the Cramér–Rao bound in the rich observational context of medium to high S/N regimes. This will be the focus of the following sections, where in particular we explore the performance of the ML and WLS estimators, thus extending and generalizing the analysis done for the LS estimator by our group presented in Lobos et al. (2015).

4. Bounding the performance of an implicit estimator

Before we go to the case of the WLS and the ML estimators, we present a general result that bounds the performance of any estimator that is the solution of a generic optimization problem. Let us consider a vector of observations Iⁿ = (I₁,…,I_n) ∈ ℝⁿ and a general so-called cost function J(α, Iⁿ). Then the estimation of x_c from the data is the solution of the optimization problem $\begin{aligned} \tau_J(I^n) \equiv \arg\min_{\alpha \in \mathbb{R}} J(\alpha, I^n), \end{aligned}$ (14)

where α represents the position of the object in the context of astrometry. As in our previous work (Lobos et al. 2015), the challenge here is that this estimator is implicit, because no closed-form expression in terms of the data is assumed for the solution of Eq. (14). In particular, this implies that both the variance and the estimation error of τ_J(Iⁿ) cannot be determined directly. To address this technical issue, we extend the approach proposed by Fessler (1996) to approximate the variance and the mean of an implicit estimator solution of a problem described by Eq. (14) through the use of a Taylor approximation around the mean measurement, that is, $\bar{I}^n=\mathbb{E}_{I^n\sim f_{(x_\mathrm{c},\tilde{F})}}(I^n)$.

More precisely, we assume that J(α, Iⁿ) has a unique global minimum at τ_J(Iⁿ), and that it has a regular behavior, so that its partial derivative with respect to α vanishes there, that is, $\begin{aligned} 0=\left.\frac{\partial}{\partial\alpha}J(\alpha,I^n)\right|_{\alpha=\tau_J(I^n)}\equiv \frac{\partial}{\partial\alpha}J(\tau_J(I^n),I^n). \end{aligned}$ (15)

Then we can express τ_J(Iⁿ) through a first-order Taylor expansion around the mean $\bar{I}^n$. With the residual term $\begin{aligned} e(\bar{I},I-\bar{I}) \equiv \frac{1}{2}\sum_{i=1}^n\sum_{j=1}^n\frac{\partial^2}{\partial I_i\,\partial I_j}\tau_J(\bar{I}^n-t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j), \end{aligned}$ (16) the expansion reads $\begin{aligned} \tau_J(I^n)=\tau_J(\bar{I}^n)+\sum_{i=1}^n\frac{\partial}{\partial I_i}\tau_J(\bar{I}^n)(I_i-\bar{I}_i)+e(\bar{I},I-\bar{I}), \end{aligned}$ (17)

where t ∈ [0,1] is fixed but unknown¹⁰. For simplicity, Eq. (17) can be written in matrix form as $\begin{aligned} \tau_J(I^n)= \tau_J(\bar{I}^n)+\nabla\tau_J(\bar{I}^n)\cdot(I^n-\bar{I}^n)+ e(\bar{I}^n,I^n-\bar{I}^n), \end{aligned}$ (18)

where $\nabla=\left[\frac{\partial}{\partial I_1}\ldots\frac{\partial}{\partial I_n}\right]$ denotes the row gradient operator and $e(\bar{I}^n,I^n-\bar{I}^n)$ is the residual error of the Taylor expansion. From Eq. (18) we can readily obtain the expression for its variance. Defining $\begin{aligned} \sigma^2_J(n)\equiv \nabla\tau_J(\bar{I}^n)\,\mathrm{Cov}\{I^n\}\,\nabla\tau_J(\bar{I}^n)^T \end{aligned}$ (19) and $\begin{aligned} \gamma_J(n)\equiv \mathrm{Var}\{e(\bar{I}^n,I^n-\bar{I}^n)\}+2\,\mathrm{Cov}\{\nabla\tau_J(\bar{I}^n)(I^n-\bar{I}^n),\,e(\bar{I}^n,I^n-\bar{I}^n)\}, \end{aligned}$ (20) we have $\begin{aligned} \mathrm{Var}\{\tau_J(I^n)\} = \sigma^2_J(n)+\gamma_J(n). \end{aligned}$ (21)

In Eq. (21) we recognize two terms: $\sigma^2_J(n)$, which captures the linear behavior of τ_J(·) around $\bar{I}^n$, and γ_J(n), which reflects the deviation from this linear trend. It should be noted that the above expression does not depend on τ_J(Iⁿ) itself, but on its partial derivatives evaluated at the mean vector of observations. Then in the adoption of this approach to estimate Var{τ_J(Iⁿ)}, a key task is to determine $\nabla\tau_J(\bar{I}^n)$.
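One standard route to $\nabla\tau_J(\bar{I}^n)$ is implicit differentiation of the stationarity condition in Eq. (15). It is sketched here as an assumption about how the computation can proceed, not as a reproduction of the derivation in Appendix A. It assumes that J is twice continuously differentiable and that $\partial^2 J/\partial\alpha^2$ is non-zero at $(\tau_J(\bar{I}^n), \bar{I}^n)$.

```latex
\begin{aligned}
0 &= \frac{\partial}{\partial I_i}
     \left[\frac{\partial J}{\partial\alpha}\big(\tau_J(I^n), I^n\big)\right]
   = \frac{\partial^2 J}{\partial\alpha^2}\big(\tau_J(I^n), I^n\big)\,
     \frac{\partial\tau_J(I^n)}{\partial I_i}
   + \frac{\partial^2 J}{\partial\alpha\,\partial I_i}\big(\tau_J(I^n), I^n\big), \\
\frac{\partial\tau_J}{\partial I_i}(\bar{I}^n) &=
   -\left.\frac{\partial^2 J/\partial\alpha\,\partial I_i}
               {\partial^2 J/\partial\alpha^2}
    \right|_{\alpha=\tau_J(\bar{I}^n),\; I^n=\bar{I}^n}.
\end{aligned}
```

For the independent Poisson counts of Sect. 3.1, Cov{Iⁿ} is diagonal with entries λ_i, so Eq. (19) reduces to the sum of $(\partial\tau_J/\partial I_i)^2\,\lambda_i$ over the pixels.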

Remark 1

It is meaningful to note that Fessler (1996) only considered the linear term in his approximate analysis, omitting the residual term γ_J(n) in Eq. (21). This first-order reduction is not realistic for our problem because the solution of a problem like the one posed by Eq. (14) in astrometry has important non-linear components that need to be considered in the analysis of Eq. (21).

In an effort to analyze both the linear and non-linear aspects of a general intrinsic estimator solution to Eq. (14), the following result offers sufficient conditions to determine $\sigma^2_J(n)$ in closed form, and to bound the magnitude of the residual term γ_J(n) in Eq. (21).

Theorem 1

Let us consider a fixed and unknown parameter x _c ∈ ℝ, the observations Iⁿ = (I₁,…,I_n)^T where $I_{i} \sim f_{x_{c}}$ $I_\mathrm{i} \sim f_{x_\mathrm{c}}$ , and τ_J (Iⁿ ) the estimator solution of Eq. (14). If we satisfy the following two rather general conditions

(a)
the cost function J(α, Iⁿ ) is twice differentiable with respect to Iⁿ and x _c, and the gradient of τ_J (·) evaluated in the mean data ${\bar{I}}^{n}$ $\bar{I}^n$ offers the following decomposition $\begin{matrix} \nabla τ_{J} ({\bar{I}}^{n}) \cdot (I^{n} - {\bar{I}}^{n}) = a \sum_{i = 1}^{N} b_{i} (I_{i} - {\bar{I}}_{i}) \end{matrix}$ $\nabla {{\tau }_{J}}({{\bar{I}}^{n}})\cdot ({{I}^{n}}-{{\bar{I}}^{n}})=a\sum\limits_{i=1}^{N}{{{b}_{\text{i}}}({{I}_{\text{i}}}-{{{\bar{I}}}_{\text{i}}})}$ (22) with a and {b _i : i ∈ {1,…,N}} constants, and,
(b)
the estimator evaluated in the mean data equals the true parameter x_c ∈ ℝ, that is, $\begin{aligned} \tau _{J}(\bar{I}^n) = x_\mathrm{c}, \end{aligned}$ (23)

then we can define two new quantities ϵ_J (n) and β_J (n) (both > 0) and $\sigma ^2_{J}(n)$ in Eq. (21) with analytical expressions (details presented in Appendix A) such that $\begin{aligned} \big| \underbrace{\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _{J}(I^n)\} - x_\mathrm{c}}_{\text{bias}} \big| \le \epsilon _{J}(n) \end{aligned}$ (24)

and $\begin{aligned} \mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _{J}(I^n)\} \in \left( \sigma ^2_{J}(n) - \beta _{J}(n), \sigma ^2_{J}(n) + \beta _{J}(n) \right). \end{aligned}$ (25)

The proof of this result and the expressions for $(\epsilon _{J}(n), \sigma ^2_{J}(n), \beta _{J}(n))$ in Eqs. (24) and (25) are presented in detail in Appendix A.

Revisiting the equality in Eq. (21), Theorem 1 provides general sufficient conditions to bound the residual term γ_J (n) and, by doing that, a way of bounding the variance of τ_J (Iⁿ), which is the solution of Eq. (14). In particular, it is worth noting that if the ratio $\frac{\beta _{J}(n)}{\sigma ^2_{J}(n)} \ll 1$, then Eq. (25) offers a tight bound for $\mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _{J}(I^n)\}$. In this last context, $\sigma ^2_{J}(n)$ (called the nominal value of the result) provides a very good approximation for $\mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _{J}(I^n)\}$.

On the application of this result to the WLS and ML estimators, we will see that the main assumption in Eq. (22) is satisfied in both cases (see Eqs. (B.8) and (C.6)), and from that $\sigma ^2_{J}(n)$ plays an important role to approximate the performance of ML and WLS in a wide range of observational regimes. In addition, the analysis of the bias in Eq. (24) shows that these estimators are unbiased for any practical purpose and, consequently, contrasting their performance (estimation error $\sim \sqrt{\text{variance}}$) with the CR bound is a meaningful way to evaluate optimality.

5. Application to astrometry

In this section we apply Theorem 1 to bound the variances of the ML and WLS estimators in the context of astrometry. Following the model presented in Sect. 3.1, Iⁿ = (I₁,...,I_n)^T denotes the measurements acquired by each pixel of the array, where each of them follows a Poisson distribution given by $\begin{aligned} I_\mathrm{i}\sim \mathrm{Poisson} (\lambda _\mathrm{i}(x_\mathrm{c})),~~ i=1,\ldots , n, \end{aligned}$ (26)

as expressed by Eqs. (2) and (3).

5.1. Bounding the variance of the WLS estimator

The WLS estimator, denoted by τ_WLS(Iⁿ) in Eq. (28), is implicitly defined through a cost function given by $\begin{aligned} J_\mathrm{WLS}(\alpha ,I^n)=\sum _{i=1}^n w_\mathrm{i}(I_\mathrm{i}-\lambda _\mathrm{i}(\alpha ))^2, \end{aligned}$ (27)

where $(w_1,\ldots ,w_n)^T \in \mathbb{R}^n_+$ is a weight vector, and α is a general source position parameter. Specifically we have that $\begin{aligned} \tau _\mathrm{WLS}(I^n) = \arg \min _{\alpha \in \mathbb{R}}J_\mathrm{WLS}(\alpha ,I^n). \end{aligned}$ (28)

Applying Theorem 1 we obtain the following result:

Theorem 2

If we consider the WLS estimator solution of Eq. (28), then we have that $\begin{aligned} \big| \underbrace{\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{WLS}(I^n)\} - x_\mathrm{c}}_{\text{bias}} \big| \le \epsilon _\mathrm{WLS}(n) \end{aligned}$ (29)

and $\begin{aligned} \mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{WLS}(I^n)\} \in \left( \sigma ^2_\mathrm{WLS}(n) - \beta _\mathrm{WLS}(n), \sigma ^2_\mathrm{WLS}(n) + \beta _\mathrm{WLS}(n) \right), \end{aligned}$ (30)

where $\sigma ^2_\mathrm{WLS}(n)$ is given by $\begin{aligned} \sigma ^2_\mathrm{WLS}(n) = \frac{\sum _{i=1}^n w_\mathrm{i}^2\lambda _\mathrm{i}(x_\mathrm{c})\left.\left(\frac{\partial \lambda _\mathrm{i}(\alpha )}{\partial \alpha }\right)^2\right|_{\alpha =x_\mathrm{c}}}{\left(\sum _{i=1}^n w_\mathrm{i}\left.\left(\frac{\partial \lambda _\mathrm{i}(\alpha )}{\partial \alpha }\right)^2\right|_{\alpha =x_\mathrm{c}}\right)^2} \end{aligned}$ (31)

and β_WLS(n) and ϵ_WLS(n) are well-defined analytical expressions of the problem (presented in Appendix B).

The proof of this result and the expressions for ϵ_WLS(n) and β_WLS(n) in Eqs. (29) and (30), respectively, are elaborated in Appendix B. This result offers concrete expressions to bound the bias as well as the variance of the WLS estimator. For the bias bound in Eq. (29), it will be shown that ϵ_WLS(n) is very small (orders of magnitude smaller than x_c) for all the observational regimes explored in this work and, consequently, the WLS can be considered an unbiased estimator in astrometry, as would be expected. Concerning the bounds for the variance in Eq. (30), we will show that for high and moderate S/N regimes the ratio $\beta _\mathrm{WLS}(n)/\sigma ^2_\mathrm{WLS}(n) \ll 1$, and consequently in this context $\sigma ^2_\mathrm{WLS}(n)$ is a precise estimator of $\mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{WLS}(I^n)\}$. For very small S/N, the result offers an admissible interval $\sigma ^2_\mathrm{WLS}(n) \pm \beta _\mathrm{WLS}(n)$ around the nominal value $\sigma ^2_\mathrm{WLS}(n)$. Therefore, in any context $\sigma ^2_\mathrm{WLS}(n)$ proves to be a meaningful approximation for the performance of the WLS.
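The nominal value in Eq. (31) can be evaluated numerically once λ_i and its derivative are available. The following is a minimal sketch, assuming λ_i(x_c) = F̃ g_i(x_c) + B̃ with g_i the integral of a Gaussian PSF over pixel i (consistent with Eq. (37)); the function names, the 41-pixel array and the chosen flux and background values are illustrative assumptions, not taken from the original analysis.

```python
import math

def pixel_model(x_c, centers, dx, sigma, flux, bkg):
    """Expected counts lam_i = flux*g_i(x_c) + bkg and d(lam_i)/d(x_c) for a
    Gaussian PSF integrated over pixels of width dx (assumed pixel model)."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    def pdf(u):
        return math.exp(-0.5 * (u / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
    lam, dlam = [], []
    for x_i in centers:
        lo, hi = x_i - dx / 2.0 - x_c, x_i + dx / 2.0 - x_c
        lam.append(flux * (cdf(hi / sigma) - cdf(lo / sigma)) + bkg)
        dlam.append(flux * (pdf(lo) - pdf(hi)))  # derivative of flux*g_i w.r.t. x_c
    return lam, dlam

def sigma2_wls(weights, lam, dlam):
    """Nominal WLS variance of Eq. (31)."""
    num = sum(w * w * l * d * d for w, l, d in zip(weights, lam, dlam))
    den = sum(w * d * d for w, d in zip(weights, dlam)) ** 2
    return num / den

dx = 0.2                                              # pixel size [arcsec]
sigma = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # FWHM = 1 arcsec
centers = [(i - 20) * dx for i in range(41)]          # hypothetical 41-pixel array
lam, dlam = pixel_model(0.0, centers, dx, sigma, flux=20004.0, bkg=626.0)

uniform = [1.0] * len(centers)                        # uniform weights = plain LS
var_ls = sigma2_wls(uniform, lam, dlam)
print(f"nominal LS standard deviation: {1e3 * math.sqrt(var_ls):.2f} mas")
```

Since the numerator and denominator of Eq. (31) both scale with the square of a common factor in the weights, only the relative weights matter.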

Remark 2

If we focus on the analysis of the closed-form expression $\sigma ^2_\mathrm{WLS}(n)$ as an approximation of $\mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{WLS}(I^n)\}$ and we compare it with the CR bound $\sigma ^2_\mathrm{CR}(n)$ in Eqs. (10) and (11), we note that they are very similar in their structure. In particular, it follows that $\sigma ^2_\mathrm{WLS}(n)=\sigma ^2_\mathrm{CR}(n)$ if and only if the weights of the WLS estimator are selected in the following way: $\begin{aligned} w_\mathrm{i}=K \cdot \frac{1}{\lambda _\mathrm{i}(x_\mathrm{c})},~~\forall i\in \{1,\ldots ,n\}, \end{aligned}$ (32)

where K is an arbitrary constant (K > 0). In other words, the only way in which the performance of the WLS approximates the CR limit is if we select the weights as in Eq. (32). However, this selection uses the information of the true position x_c, which is unfeasible as it contradicts the very essence of the inference task (indeed, x_c is unknown, and we are trying to estimate it from the data). Another interpretation is that, no matter how we choose the weights of the WLS estimator, it is not possible for the WLS to be close to the CR bound for every position x_c, telling us that the WLS is intrinsically not optimal from the perspective of being close to the CR limit in all the possible astrometric scenarios. In particular, this impossibility result is very strong in the high S/N regimes where $\sigma ^2_\mathrm{WLS}(n)\approx \mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{WLS}(I^n)\}$. This implication is consistent with the analysis presented by Lobos et al. (2015, Fig. 4), where it was shown that the variance of the LS estimator is significantly higher than the CR bound in the high S/N regime. This justifies the study of the ML estimator.
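The "if and only if" statement of Remark 2 can be recovered from the Cauchy–Schwarz inequality. The following short derivation is a sketch under the assumption that λ_i(x_c) = F̃ g_i(x_c) + B̃_i, so that Eq. (37) reads $\sigma ^2_\mathrm{CR}(n) = \left(\sum _{i=1}^n d_\mathrm{i}^2/\lambda _\mathrm{i}\right)^{-1}$; the shorthand $\lambda _\mathrm{i} \equiv \lambda _\mathrm{i}(x_\mathrm{c})$ and $d_\mathrm{i} \equiv \left.\partial \lambda _\mathrm{i}(\alpha )/\partial \alpha \right|_{\alpha =x_\mathrm{c}}$ is introduced here only for brevity.

$\begin{aligned} \left(\sum _{i=1}^n w_\mathrm{i} d_\mathrm{i}^2\right)^2 = \left(\sum _{i=1}^n \left(w_\mathrm{i}\sqrt{\lambda _\mathrm{i}}\, d_\mathrm{i}\right)\left(\frac{d_\mathrm{i}}{\sqrt{\lambda _\mathrm{i}}}\right)\right)^2 \le \left(\sum _{i=1}^n w_\mathrm{i}^2 \lambda _\mathrm{i} d_\mathrm{i}^2\right)\left(\sum _{i=1}^n \frac{d_\mathrm{i}^2}{\lambda _\mathrm{i}}\right) \end{aligned}$

Dividing both sides by $\left(\sum _{i} w_\mathrm{i} d_\mathrm{i}^2\right)^2 \left(\sum _{i} d_\mathrm{i}^2/\lambda _\mathrm{i}\right)$ gives $\sigma ^2_\mathrm{CR}(n) \le \sigma ^2_\mathrm{WLS}(n)$ for any choice of positive weights. Equality holds only when the two vectors in the inner product are proportional, that is, $w_\mathrm{i}\sqrt{\lambda _\mathrm{i}}\, d_\mathrm{i} = K d_\mathrm{i}/\sqrt{\lambda _\mathrm{i}}$, which is the weight selection $w_\mathrm{i} = K/\lambda _\mathrm{i}(x_\mathrm{c})$ of Eq. (32) on every pixel with $d_\mathrm{i} \ne 0$.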

5.2. Bounding the variance of the ML estimator

The ML estimator, denoted by τ_ML(Iⁿ) in Eq. (34), is implicitly defined through a cost function $\begin{aligned} J(\alpha ,I^n)=\sum _{i=1}^n \left[ I_\mathrm{i} \ln (\lambda _\mathrm{i}(\alpha ))-\lambda _\mathrm{i}(\alpha ) \right], \end{aligned}$ (33)

where α is a general source position parameter. Specifically, given an observation Iⁿ we have that $\begin{aligned} \tau _\mathrm{ML}(I^n)&= \arg \max _{\alpha \in \mathbb{R}}J(\alpha ,I^n), \\ &= \arg \min _{\alpha \in \mathbb{R}}\sum _{i=1}^n \left[ -I_\mathrm{i} \ln (\lambda _\mathrm{i}(\alpha ))+\lambda _\mathrm{i}(\alpha ) \right]. \end{aligned}$ (34)
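To make Eq. (34) concrete, the sketch below minimizes the negative Poisson log-likelihood over α with a golden-section search. The search method, the bracket, the pixel model (Gaussian PSF integrated over each pixel, λ_i = F̃ g_i + B̃) and all names are assumptions made for illustration; this is not the implementation used by the authors. Feeding the estimator the noise-free mean counts also illustrates condition (b) of Theorem 1, Eq. (23).

```python
import math

def expected_counts(alpha, centers, dx, sigma, flux, bkg):
    """lam_i(alpha) = flux*g_i(alpha) + bkg for a pixel-integrated Gaussian PSF."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return [flux * (cdf((x + dx / 2.0 - alpha) / sigma)
                    - cdf((x - dx / 2.0 - alpha) / sigma)) + bkg
            for x in centers]

def neg_log_likelihood(alpha, counts, centers, dx, sigma, flux, bkg):
    """Objective minimized in the second line of Eq. (34)."""
    lam = expected_counts(alpha, centers, dx, sigma, flux, bkg)
    return sum(l - c * math.log(l) for c, l in zip(counts, lam))

def ml_estimate(counts, centers, dx, sigma, flux, bkg, lo, hi, tol=1e-9):
    """Golden-section search; assumes the objective is unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    f = lambda a: neg_log_likelihood(a, counts, centers, dx, sigma, flux, bkg)
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:            # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                  # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

dx = 0.2
sigma = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
centers = [(i - 20) * dx for i in range(41)]   # hypothetical 41-pixel array
x_true = 0.03                                  # hypothetical source position [arcsec]
mean_counts = expected_counts(x_true, centers, dx, sigma, 20004.0, 626.0)
x_hat = ml_estimate(mean_counts, centers, dx, sigma, 20004.0, 626.0, lo=-0.5, hi=0.5)
print(f"estimate from noise-free data: {x_hat:.6f} arcsec (true {x_true})")
```

With real (noisy) counts the same function applies unchanged; only the bracket [lo, hi] has to contain the source.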

Applying Theorem 1 we obtain the following result:

Theorem 3

If we consider the ML estimator solution of Eq. (34), then we have that $\begin{aligned} \big| \underbrace{\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{ML}(I^n)\} - x_\mathrm{c}}_{\text{bias}} \big| \le \epsilon _\mathrm{ML}(n) \end{aligned}$ (35)

and $\begin{aligned} \mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{ML}(I^n)\} \in \left( \sigma ^2_\mathrm{ML}(n) - \beta _\mathrm{ML}(n), \sigma ^2_\mathrm{ML}(n) + \beta _\mathrm{ML}(n) \right), \end{aligned}$ (36)

where $\begin{aligned} \sigma ^2_\mathrm{ML}(n) = \sigma ^2_\mathrm{CR}(n)= \left( \sum _{i=1}^n \frac{ \left( \tilde{F}\frac{\mathrm{d} g_\mathrm{i}(x_\mathrm{c})}{\mathrm{d} x_\mathrm{c}} \right)^2 }{\tilde{F} g_\mathrm{i}(x_\mathrm{c}) + \tilde{B}_\mathrm{i}} \right)^{-1}, \end{aligned}$ (37)

and β_ML(n) and ϵ_ML(n) are well-defined analytical expressions of the problem (presented in Appendix C).

The proof of this result and the expressions for ϵ_ML(n) and β_ML(n) in Eqs. (35) and (36), respectively, are elaborated in Appendix C.
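The nominal value in Eq. (37) is straightforward to evaluate. The sketch below does so for a Gaussian PSF integrated over each pixel (an assumed form for g_i) and a uniform background; the 41-pixel array, the background of 626 e⁻ per pixel and the function name are illustrative assumptions.

```python
import math

def cr_bound(x_c, centers, dx, sigma, flux, bkg):
    """sigma^2_CR of Eq. (37): inverse of the Fisher information about x_c."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    def pdf(u):
        return math.exp(-0.5 * (u / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
    fisher = 0.0
    for x_i in centers:
        lo, hi = x_i - dx / 2.0 - x_c, x_i + dx / 2.0 - x_c
        g = cdf(hi / sigma) - cdf(lo / sigma)   # fraction of the flux in pixel i
        dg = pdf(lo) - pdf(hi)                  # d g_i / d x_c
        fisher += (flux * dg) ** 2 / (flux * g + bkg)
    return 1.0 / fisher

dx = 0.2
sigma = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))   # FWHM = 1 arcsec
centers = [(i - 20) * dx for i in range(41)]           # hypothetical 41-pixel array
fluxes = [1080.0, 3224.0, 20004.0, 60160.0]            # source fluxes [e-]
cr_mas = [1e3 * math.sqrt(cr_bound(0.0, centers, dx, sigma, f, 626.0)) for f in fluxes]
for f, s in zip(fluxes, cr_mas):
    print(f"F = {f:7.0f} e-  ->  sigma_CR = {s:.2f} mas")
```

As a sanity check, each value should lie above the idealized photon-noise limit σ/√F̃, which ignores both the background and the pixelation.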

Remark 3

It is important to mention that the magnitude of ϵ_ML(n) is orders of magnitude smaller than x_c in all the observational regimes studied in this work (see this analysis in Sect. 6) and, consequently, for any practical purpose the ML is an unbiased estimator. This implies that the comparison with the CR bound is a meaningful indicator when evaluating the optimality of the ML estimator.

Remark 4

We observe that if the ratio $\beta _\mathrm{ML}(n)/\sigma ^2_\mathrm{ML}(n)$ is significantly smaller than one, which is shown in Sect. 6 from medium to high S/N regimes, then $\mathrm{Var}_{I^n \sim f_{x_\mathrm{c}}}\{\tau _\mathrm{ML}(I^n)\} \approx \sigma ^2_\mathrm{ML}(n)$. This is a very interesting result because we can approximate the performance of the ML estimator with $\sigma ^2_\mathrm{ML}(n)$. In this context, it is remarkable that the nominal value $\sigma ^2_\mathrm{ML}(n)$ is precisely the CR bound (see Eq. (37)), because this means that the ML estimator closely approximates this MVB in the interesting regime from moderate to very high S/N. This medium-high S/N regime is precisely the context where the LS estimator shows significant deficiencies as presented in Lobos et al. (2015). Therefore, ML offers optimal performance in the regime where LS-type methods are not able to match the CR bound, which satisfactorily resolves the question posed by Lobos et al. (2015) on the study of schemes that could closely approach the CR bound in the high S/N regime.

6. Numerical analysis

In this section we evaluate numerically the performance bounds obtained in Sect. 5 for the WLS and ML estimators, and compare them with the astrometric CR bound in Proposition 1. The idea is to consider some realistic astrometric conditions to evaluate the expressions developed in Theorems 2 and 3 and their dependency on important observational conditions and regimes. As we shall see, key variables in this analysis are the trade-off between the intensity of the object and the noise represented by the S/N value, and the pixel resolution of the CCD.

6.1. Experimental setting

We adopt some realistic design and observing variables to model the problem (Mendez et al. 2013, 2014). For the PSF, analytical and semi-empirical forms have been introduced (see for instance the ground-based model in King 1971 and the space-based models by King 1983 or Bendinelli et al. 1987). In this work we will adopt a Gaussian PSF, that is, $\phi (x,\sigma )= \frac{1}{\sqrt{2\pi } \sigma } \mathrm{e}^{- \frac{x^2}{2 \sigma ^2}}$ in Eq. (3), where σ is the width of the PSF and is assumed to be known. This PSF has been found to be a good representation for typical astrometric-quality ground-based data (Méndez et al. 2010). In terms of nomenclature, $\mathrm{FWHM} \equiv 2 \sqrt{2 \ln 2}\, \sigma$, measured in arcsec, denotes the full-width at half-maximum parameter, which is an overall indicator of the image quality at the observing site (Chromey, 2016).

The background profile, represented by {B̃_i, i = 1,…,n} in Eq. (2), is a function of several variables, like the wavelength of the observations, the moon phase (which contributes significantly to the diffuse sky background), the quality of the observing site, and the specifications of the instrument itself. We will consider a uniform background across pixels underneath the PSF, that is, B̃_i = B̃ for all i. To characterize the magnitude of B̃, it is important to first mention that the detector does not measure photon counts [e⁻] directly, but a discrete variable in “Analog to Digital Units (ADUs)” of the instrument, which is a linear proportion of the photon counts (McLean 2008). This linear proportion is characterized by the gain of the instrument G (scaling value) in units of [e⁻/ADU]. We can define F ≡ F̃/G and B ≡ B̃/G as the intensity of the object and noise, respectively, in the specific ADUs of the instrument. Then, the background (in ADUs) depends on the pixel size Δx arcsec as follows: $\begin{aligned} B=f_\mathrm{s}\Delta x+ \frac{D+\mathrm{RON}^2}{G}~ [\mathrm{ADU}], \end{aligned}$ (38)

where f_s is the (diffuse) sky background in ADU arcsec⁻¹, while D and RON², both measured in e⁻, model the dark-current and read-out-noise of the detector on each pixel, respectively. The first component in Eq. (38) is attributed to the site, and its effect is proportional to the pixel size. On the other hand, the second component is attributed to errors of the PID (detector), and it is pixel-size independent. This distinction is central when analyzing the performance as a function of the pixel resolution of the array (see details in Mendez et al. 2013, Sect. 4). More important is the fact that in typical ground-based astronomical observations, long exposure times are considered, which implies that the background is dominated by diffuse light coming from the sky, and not from the detector (Mendez et al. 2013, Sect. 4).

For the experimental conditions, we consider the scenario of a ground-based station located at a good site with clear atmospheric conditions and the specification of current science-grade CCDs, where f_s = 1502.5 ADU arcsec⁻¹, D = 0, RON = 5 e⁻, FWHM = 1 arcsec (equivalent to $\sigma = 1/(2\sqrt{2\ln 2})$ arcsec), and G = 2 e⁻ ADU⁻¹ (with these values we will have B = 313 ADU for Δx = 0.2 arcsec using Eq. (38)). In terms of scenarios of analysis, we explore different pixel resolutions for the CCD array Δx ∈ [0.1,0.65] measured in arcsec, and different signal strengths F̃ ∈ {1080; 3224; 20 004; 60 160}, measured in e⁻, which correspond to S/N ∈ ∼ {12,32,120,230}. Increasing F̃ implies increasing the S/N of the problem, which can be approximately measured by the ratio F̃/B̃. On a given detector plus telescope setting, these different S/N scenarios can be obtained by appropriately changing the exposure time (open shutter) that generates the image.
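As a quick arithmetic check of Eq. (38) with the values quoted above, the following minimal sketch recomputes the per-pixel background and the PSF width. The function and variable names are illustrative, and FWHM = 1 arcsec is taken as implied by the quoted σ = 1/(2√(2 ln 2)) arcsec.

```python
import math

def background_adu(f_s, dx, dark, ron, gain):
    """Eq. (38): per-pixel background in ADU."""
    return f_s * dx + (dark + ron ** 2) / gain

def fwhm_to_sigma(fwhm):
    """Gaussian PSF width from FWHM = 2*sqrt(2 ln 2)*sigma."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

GAIN = 2.0                  # e-/ADU
B = background_adu(f_s=1502.5, dx=0.2, dark=0.0, ron=5.0, gain=GAIN)
B_tilde = GAIN * B          # background in e-, from B = B_tilde / G
sigma = fwhm_to_sigma(1.0)  # arcsec

print(f"B = {B:.1f} ADU, B_tilde = {B_tilde:.1f} e-, sigma = {sigma:.4f} arcsec")
```

The sky term contributes 1502.5 × 0.2 = 300.5 ADU and the detector term 25/2 = 12.5 ADU, which recovers the B = 313 ADU quoted for Δx = 0.2 arcsec and confirms that the background is sky-dominated in this setting.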

Fig. 1.

Relative performance of the bias (as measured by $\log \left( 100 \times \frac{\epsilon _{J}(n)}{x_\mathrm{c}} \right)$) stipulated by Theorem 1 for the WLS estimator (left side, Eq. (29)) and the ML estimator (right side, Eq. (35)). Results are reported for different values of the source flux F̃ ∈ {1080, 3224, 20 004, 60 160}, all in e⁻ (top to bottom symbols respectively), as a function of the detector pixel size. The 0% level corresponds to having achieved no bias.

6.2. Bias analysis

Considering the upper bound terms ϵ_WLS(n) and ϵ_ML(n) for the bias error obtained from Theorems 2 and 3 for the WLS and ML, respectively, Fig. 1 presents the relative bias error for different S/N regimes and pixel resolutions. In the case of the ML estimator, the bounds for relative bias error are very small in all the explored S/N regimes and pixel resolutions meaning that for any practical purposes this estimator is unbiased as expected from theory (Gray & Davisson, 2004). For the case of the WLS, we observe that from medium to high S/N the relative error bound obtained is very small, meanwhile at low S/N unbiasedness cannot be fully guaranteed from the bound in Eq. (29). In general, our results show that both WLS and ML are unbiased estimators for astrometry in a wide range of relevant observational regimes (in particular from medium to high S/N) and, consequently, it is meaningful to analyze the optimality of these estimators in comparison with the CR bound in those regimes.

In the following sections, we move to the analysis of the variance of the WLS and ML with particular focus on the medium to high S/N regimes across all pixel resolutions using the performance bounds derived in Eqs. (30) and (36), respectively.

6.3. Performance analysis of the WLS estimator

In this section, we evaluate numerically the expression derived in Theorem 2 to bound the variance of the WLS estimator in Eq. (30). For that we characterize the admissible regime predicted for the variance of the WLS estimator, that is, the interval $\begin{aligned} \left( \sigma ^2_\mathrm{WLS}(n) - \beta _\mathrm{WLS}(n), \sigma ^2_\mathrm{WLS}(n) + \beta _\mathrm{WLS}(n) \right), \end{aligned}$ (39)

for S/N ∈ ∼ {12,32,120,230} and Δx ∈ [0.01,0.65] arcsec. In these bounds, we recognize the central value (or nominal value) $\sigma ^2_\mathrm{WLS}(n)$ in Eq. (31) and the length of the interval 2β_WLS(n), which is determined in closed form for its numerical evaluation in Eqs. (A.2) and (B.11). We note that 2β_WLS(n) can be considered an indicator of the precision of our result to approximate the variance of the WLS in astrometry.

Fig. 2.

Range of the square root of the variance performance (in milliarcseconds = mas) for the WLS method in astrometry using uniform weights (equivalent to the LS method) predicted by Theorems 1 and 2, Eq. (30). Results are reported for different representative values of F̃ and across different pixel sizes (top-left to bottom-right): F̃ ∈ {1080; 3224; 20 004; 60 160} e⁻.

6.3.1. Revisiting the uniform weight case

To begin the analysis, we consider the setting of uniform weights across pixels, that is, the case of the LS estimator and, without loss of generality, we locate the object in the center of the field of view¹¹, which can be considered a reasonable scenario to represent the complexity of the astrometry task. At this point, it is important to remind the reader that from the analysis of $\sigma ^2_\mathrm{WLS}(n)$ in Remark 2, the nominal value $\sigma ^2_\mathrm{WLS}(n)$ is equivalent to the CR bound when the w_i are selected as a function of the true position in Eq. (32). In view of this observation, the selection of non-uniform weights can be interpreted as biasing the estimation to a particular area of the angular space, which contradicts the essence of the inference problem that estimates the position with no prior information, and only relies on the measured counts. From this interpretation, revisiting the LS estimator is an important first step in the analysis of the WLS framework.

On the specifics, the boundaries of the interval in Eq. (39) and its nominal values are illustrated in Fig. 2 for the different observational regimes. In addition, Fig. 2 shows the CR bound as a reference to evaluate the optimality of the LS scheme across settings. We observe that for the low S/N ∼ 12 regime, the nominal values precisely match the CR bound; however, our result is not conclusive regarding the performance of the LS estimator in this regime, as the interval around the nominal performance is significantly large. In the regime S/N ∈ (30,50) (top right panel in Fig. 2), we notice an important reduction in the range of admissible performance, and our result becomes more informative and meaningful. In this context, the nominal values are very close to the CR bound, and we could assert that the LS estimator offers sufficiently good performance in the sense that it is very competitive with the MVB. Importantly, when we move to the regime of relatively high S/N and very high S/N (from 100 to 300), our results are very accurate in predicting the performance of the LS method, and we find that the gap between the CR and the nominal value is very significant (the deviation from the MVB is 16% and 30% for S/N 120 and 230 at Δx = 0.2 arcsec, respectively). This last result confirms one of the main findings presented in Lobos et al. (2015), who showed that for medium to high S/N the LS estimator is suboptimal with respect to the MVB. In Fig. 2 we also show the square root of the empirical variance ($\sqrt{\mathrm{Var}(\hat{\tau }_\mathrm{WLS})}$) with respect to the empirically-determined mean position x̂_c (using the WLS estimator), all as derived from the simulations, showing good consistency with our predictions.
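The trend described above (a negligible gap at low S/N that grows with S/N) can be explored with the nominal values alone. The sketch below compares the nominal LS standard deviation (Eq. (31) with uniform weights) against the CR bound (Eq. (37)), assuming a pixel-integrated Gaussian PSF, a uniform background of 626 e⁻ per pixel and a 41-pixel array; these settings and the function name are assumptions, so the printed percentages are indicative only and are not claimed to reproduce the figures quoted in the text.

```python
import math

def nominal_ls_and_cr(flux, bkg, dx, sigma, n_pix=41, x_c=0.0):
    """Return (sigma_LS, sigma_CR): Eq. (31) with w_i = 1 and Eq. (37)."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    def pdf(u):
        return math.exp(-0.5 * (u / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
    s_d2 = s_lam_d2 = s_d2_over_lam = 0.0
    for i in range(n_pix):
        x_i = (i - (n_pix - 1) / 2.0) * dx          # pixel center
        lo, hi = x_i - dx / 2.0 - x_c, x_i + dx / 2.0 - x_c
        lam = flux * (cdf(hi / sigma) - cdf(lo / sigma)) + bkg
        d = flux * (pdf(lo) - pdf(hi))              # d(lam_i)/d(x_c)
        s_d2 += d * d
        s_lam_d2 += lam * d * d
        s_d2_over_lam += d * d / lam
    return math.sqrt(s_lam_d2) / s_d2, 1.0 / math.sqrt(s_d2_over_lam)

sigma = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))   # FWHM = 1 arcsec
gaps = []
for flux in (1080.0, 3224.0, 20004.0, 60160.0):
    s_ls, s_cr = nominal_ls_and_cr(flux, bkg=626.0, dx=0.2, sigma=sigma)
    gaps.append(100.0 * (s_ls / s_cr - 1.0))
    print(f"F = {flux:7.0f} e-: LS {1e3 * s_ls:.2f} mas, CR {1e3 * s_cr:.2f} mas, "
          f"gap {gaps[-1]:.1f}%")
```

The gap depends only on the ratio F̃/B̃ in this model: when the background dominates, λ_i is nearly constant across pixels and uniform weights are close to the optimal choice of Eq. (32).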

Fig. 3.

Worst-case discrepancies in Eq. (40) for the WLS estimator using the weights set indexed by the positions $\Theta = \{x_\mathrm{o}^{*}-\sigma , x_\mathrm{o}^{*}-0.8\sigma , x_\mathrm{o}^{*}-0.6\sigma , x_\mathrm{o}^{*}-0.4\sigma , x_\mathrm{o}^{*}-0.2\sigma , x_\mathrm{o}^{*} \}$. Results are reported for two S/N scenarios, namely F̃ = 20 004 e⁻ (left) and F̃ = 60 160 e⁻ (right), and across different pixel sizes.

6.3.2. Non-uniform weight case

The sub-optimality of the WLS scheme from moderate to high S/N seen in Fig. 2 can be extended to any arbitrary selection of a fixed set of weights, as would be expected. Given that the space of possible weight selections is unlimited, we use the insight obtained from Remark 2, which states that a selection of weights can be interpreted as a specific prior on the position of the object, where the optimum, but unfortunately unknown, selection of weights in Eq. (32) (achieving the CR bound) is an explicit function of the unknown position of the object. Then, we consider a finite set of positions {x_c,1,…,x_c,M} that uniformly partition the field of view, and their corresponding weight sets {w_i(x_c,1) : i = 1,…,n}, …, {w_i(x_c,M) : i = 1,…,n} using Eq. (32) to cover a representative collection of weights for the problem of astrometry.

Then, for a specific selection of weights in our admissible collection (attributed to a prior belief regarding the position of the object in the field of view), we evaluate the worst discrepancy between $\sigma ^2_\mathrm{WLS}(n) - \beta _\mathrm{WLS}(n)$ (which is the most favorable expression for the variance predicted from Theorem 2), and the CR bound in Proposition 1, across a collection of presumed object positions in the range of positions $\Theta = \{x_\mathrm{o}^{*}-\sigma , x_\mathrm{o}^{*}-0.8\sigma , x_\mathrm{o}^{*}-0.6\sigma , x_\mathrm{o}^{*}-0.4\sigma , x_\mathrm{o}^{*}-0.2\sigma , x_\mathrm{o}^{*} \},$

where $x_\mathrm{o}^{*}$ denotes the center of the array (which, as indicated at the beginning of Sect. 6.3.1, is equal to the true object position x_c) and $\sigma =\mathrm{FWHM}/(2\sqrt{2\ln 2})$ is the dispersion parameter of the PSF. The idea of using this worst-case difference is justified by the fact that in this parameter estimation problem we do not know the position of the object and, consequently, the optimality of any WLS estimator should be evaluated in the worst-case situation, as the scheme should be able to estimate the position of the object in any scenario (position). More precisely, for a given weight selection {w_i, i=1,…,n}, we use the worst-case discrepancy $\begin{aligned} \sup _{x_\mathrm{c} \in \Theta } \frac{(\sigma ^2_\mathrm{WLS}(n) - \beta _\mathrm{WLS}(n)) - \sigma ^2_\mathrm{CR}(n)}{\sigma ^2_\mathrm{CR}(n)}\cdot \end{aligned}$ (40)

For this analysis, we note that both $(\sigma ^2_\mathrm{WLS}(n) - \beta _\mathrm{WLS}(n))$ and $\sigma ^2_\mathrm{CR}(n)$ are functions of the position x_c, Δx, and S/N.
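The structure of Eq. (40) can be sketched as follows: the weights are fixed from a presumed position through Eq. (32), and the relative excess over the CR bound is maximized over the candidate true positions in Θ. The closed form of β_WLS(n) is given in Appendix B and is not reproduced here, so the `beta` argument below is a hypothetical hook that defaults to zero (which yields the discrepancy of the nominal value only). The pixel model, the 41-pixel array and all names are likewise assumptions.

```python
import math

def model(x_c, centers, dx, sigma, flux, bkg):
    """lam_i and d(lam_i)/d(x_c) for a pixel-integrated Gaussian PSF (assumed)."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    def pdf(u):
        return math.exp(-0.5 * (u / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)
    lam, dlam = [], []
    for x_i in centers:
        lo, hi = x_i - dx / 2.0 - x_c, x_i + dx / 2.0 - x_c
        lam.append(flux * (cdf(hi / sigma) - cdf(lo / sigma)) + bkg)
        dlam.append(flux * (pdf(lo) - pdf(hi)))
    return lam, dlam

def worst_case_discrepancy(x_prior, thetas, centers, dx, sigma, flux, bkg,
                           beta=lambda x_c: 0.0):
    """Eq. (40) for the weights of Eq. (32) (K = 1) built at x_prior.
    beta is a placeholder for beta_WLS(n); zero gives the nominal discrepancy."""
    lam_prior, _ = model(x_prior, centers, dx, sigma, flux, bkg)
    w = [1.0 / l for l in lam_prior]
    worst = -math.inf
    for x_c in thetas:
        lam, d = model(x_c, centers, dx, sigma, flux, bkg)
        num = sum(wi * wi * li * di * di for wi, li, di in zip(w, lam, d))
        den = sum(wi * di * di for wi, di in zip(w, d)) ** 2
        s2_cr = 1.0 / sum(di * di / li for li, di in zip(lam, d))
        worst = max(worst, (num / den - beta(x_c) - s2_cr) / s2_cr)
    return worst

dx, x_o = 0.2, 0.0
sigma = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
centers = [(i - 20) * dx for i in range(41)]                   # hypothetical array
thetas = [x_o - k * 0.2 * sigma for k in range(5, -1, -1)]     # x_o - sigma, ..., x_o
worst = worst_case_discrepancy(x_o, thetas, centers, dx, sigma, 20004.0, 626.0)
matched = worst_case_discrepancy(x_o, [x_o], centers, dx, sigma, 20004.0, 626.0)
print(f"nominal worst-case discrepancy: {100.0 * worst:.2f}%")
```

When the true position coincides with the presumed one, the discrepancy vanishes (up to rounding), in line with Remark 2; any mismatch within Θ can only increase it.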

Figure 3 illustrates the worst-case discrepancy in Eq. (40) for the medium and high S/N regimes where Theorem 2 provides an accurate and meaningful prediction of the performance of the WLS method, that is, for S/N ∈ {120,230}, and across Δx ∈ [0.05,0.7] arcsec. The discrepancy is quite significant, on the order of 37% and 60% in the range for Δx ∈ [0.1,0.3] arcsec for S/N 120 and 230, respectively.

To refine the worst-case analysis presented in Fig. 3, and to evaluate in more detail the sensitivity of the discrepancy indicator given by Eq. (40), we evaluate the discrepancy of WLS using the weights associated to $x_\mathrm{o}^{*}$ (the center position of the array) with respect to the CR bound associated to the positions $\{x_\mathrm{o}^{*}-\sigma , x_\mathrm{o}^{*}-0.8\sigma , x_\mathrm{o}^{*}-0.6\sigma , x_\mathrm{o}^{*}-0.4\sigma , x_\mathrm{o}^{*}-0.2\sigma , x_\mathrm{o}^{*} \}$, to study how the discrepancy (measuring the non-optimality of the method) evolves when the adopted position moves far from the prior imposed by the WLS in the center of the array. Figure 4 illustrates this behavior: the discrepancy is very sensitive and proportional to the misassumption of the object position, and the worst-case discrepancy in the maximum achievable location precision is on the order of 40% for pixel sizes in the range [0.1, 0.6] arcsec for S/N ∼ 120, and about 60% for pixel sizes in the same range for S/N ∼ 230. These worst-case scenarios happen in both cases when the object is located the farthest from the prior assumption, as would be expected.

The main conclusion derived from this CR bound analysis is that, independent of the weight selection adopted, as long as the weights are fixed, the WLS estimator is not able to achieve the CR bound in all observational regimes. More precisely, the discrepancy (measuring the non-optimality) in the less favorable case of a hypothetical and feasible position of the object is very significant, in the range of 40%–60% for the important regime of high and very high S/N.

Fig. 4.

Performance discrepancies (measuring the non-optimality) of the WLS estimator using the center position as a prior for the weight selection with respect to the CR bound obtained for the true object positions $\{x_\mathrm{o}^{*}-\sigma , x_\mathrm{o}^{*}-0.8\sigma , x_\mathrm{o}^{*}-0.6\sigma , x_\mathrm{o}^{*}-0.4\sigma , x_\mathrm{o}^{*}-0.2\sigma , x_\mathrm{o}^{*} \}$. Results are reported for two S/N scenarios, namely F̃ = 20 004 e⁻ (left) and F̃ = 60 160 e⁻ (right), and across different pixel sizes.

6.4. Performance analysis of the ML estimator

In this section we perform the same analysis done for the WLS in Sect. 6.3, but using the result in Theorem 3. In particular, we consider the admissible regime for the variance of the ML estimator given by $\begin{aligned} \left( \sigma ^2_\mathrm{ML}(n) - \beta _\mathrm{ML}(n), \sigma ^2_\mathrm{ML}(n) + \beta _\mathrm{ML}(n) \right) \end{aligned}$

in Eq. (36), where the nominal value in this case, $\sigma ^2_\mathrm{ML}(n)$ in Eq. (37), is precisely the CR bound, while the length of the interval, 2β_ML(n), is given by Eqs. (A.2) and (C.9) in closed form.

Considering an object located in the center of the array, that is, x_c = x₀, the performance curves are presented in Fig. 5 for S/N ∈ {12, 32, 120, 230} and Δx ∈ [0.1, 0.65] arcsec. First, we note that there is a significant difference in the predictions of our methodology for the ML estimator in comparison with what we predict in the WLS case. In fact, the results of our approach are very precise for the determination of the variance of the ML estimator in all the regimes, from small to high S/N, which is remarkable. More important is the fact that, from these findings, we observe that the performance deviation from the MVB in the worst case (small S/N) is very small (see Table 1, first row), while for any practical purposes the variance of the ML estimator achieves the CR limit for all the other regimes, from medium to high S/N, which is a numerical corroboration of the optimality of the ML estimator in astrometry, as predicted theoretically by Theorems 1 and 3. In Fig. 5 we also show the square root of the empirical variance ($\sqrt{\mathrm{Var}(\hat{\tau }_\mathrm{ML})}$) with respect to the empirically-determined mean position x̂_c (using the ML estimator), all as derived from the simulations, showing good consistency with our predictions.

Complementing this analysis, we conducted the same comparison for different positions of the object within the array, obtaining the same trends and conclusions. To summarize these results, Fig. 6 shows the value $100\times \frac{\sqrt{\sigma ^2_\mathrm{ML}(n)+\beta _\mathrm{ML} (n)}-\sigma _\mathrm{ML}(n)}{\sigma _\mathrm{ML}(n)}$, which is an indicator of the quality of the estimator (the smaller the better), for different scenarios of the position of the object $x_\mathrm{c}\in \Theta = \{x_\mathrm{o}^{*}-\sigma , x_\mathrm{o}^{*}-0.8*\sigma ,x_\mathrm{o}^{*}-0.6*\sigma ,x_\mathrm{o}^{*}-0.4*\sigma , x_\mathrm{o}^{*}-0.2*\sigma , x_\mathrm{o}^{*} \}$. In particular, for all the evaluated positions, the relative discrepancy (relative to the CR bound) is bounded by 0.025% and 0.012% for pixel resolutions in the range Δx ∈ [0.1, 0.2] arcsec, for S/N = 120 and S/N = 230, respectively. Finally, Table 1 presents the relative discrepancy over the full range of S/N values considered in this study for the case Δx = 0.2 arcsec.
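To make the indicator concrete, the following minimal Python sketch evaluates it for a given nominal variance and residual bound. The function name `optimality_indicator` and the numerical inputs are hypothetical, chosen only for illustration; they are not values taken from Table 1 or Fig. 6.

```python
import math

def optimality_indicator(sigma2_nominal, beta):
    """Percent gap 100 * (sqrt(sigma^2 + beta) - sigma) / sigma."""
    if sigma2_nominal <= 0 or beta < 0:
        raise ValueError("need sigma2_nominal > 0 and beta >= 0")
    sigma = math.sqrt(sigma2_nominal)
    return 100.0 * (math.sqrt(sigma2_nominal + beta) - sigma) / sigma

# Hypothetical inputs (not from the paper): a nominal variance of 4 mas^2
# and a residual bound beta equal to 0.05% of that variance.
gap = optimality_indicator(sigma2_nominal=4.0, beta=0.002)
print(f"relative gap: {gap:.4f} %")
```

For β much smaller than the nominal variance, a first-order expansion of the square root gives an indicator of approximately 50 β/σ², so a gap of 0.025% corresponds to β of roughly 5 × 10⁻⁴ times the nominal variance.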

We conclude from this analysis that the ML estimator is nearly optimal for the medium, high, and very high S/N regimes across pixel resolutions, achieving the MVB for astrometry. This is an interesting result, since it lends further support to the adoption of these types of estimators for very demanding astrometric applications, as has been done in the case of Gaia (Lindegren 2008). We note that Vakili & Hogg (2016) reach the same conclusion regarding the optimality of the ML method in comparison with the MVB, through simulations of two-dimensional (2D) CCD images using a broad set of Moffat PSF stellar profiles. While their results are purely empirical, it is interesting that they test the ML using a different PSF from ours, and in a 2D scenario, and yet they reach the same conclusions. More recently, Gai et al. (2017) have tested (also empirically) the ML method in a 1D scenario (similar to ours), but in the context of a Gaia-like PSF. They find that the ML is unbiased (in agreement with our results, see Fig. 1) and, by comparing two implementations of the ML, they conclude that both give self-consistent and reliable results over a broad range of flux, background, and instrument response variations. It would still be interesting to compare the performance of those implementations against the CR MVB in order to further test our theoretical predictions.
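The kind of numerical check discussed in this section can be sketched in a few lines of Python. This is a minimal illustration, not the simulation code used for Fig. 5: it assumes a 1D Gaussian PSF with FWHM = 1 arcsec integrated over pixels of Δx = 0.2 arcsec, independent Poisson counts, a known flux, and a hypothetical uniform background of 300 e⁻ per pixel. The helper names (`expected_counts`, `golden_section_min`, and so on) and the golden-section optimizer are choices made here for self-containment.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def expected_counts(xc, centers, dx, sigma, flux, bkg):
    """lambda_i(xc): Gaussian PSF integrated over each pixel, plus background."""
    s = sigma * math.sqrt(2.0)
    g = 0.5 * (_erf((centers + dx / 2 - xc) / s) - _erf((centers - dx / 2 - xc) / s))
    return flux * g + bkg

def d_expected_counts(xc, centers, dx, sigma, flux):
    """Analytic derivative d lambda_i / d xc of the pixel-integrated profile."""
    def pdf(u):
        return np.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return flux * (pdf(centers - dx / 2 - xc) - pdf(centers + dx / 2 - xc))

def golden_section_min(f, lo, hi, tol=1e-7):
    """Minimise a scalar function assumed unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

# Assumed observing setup (background and true position are hypothetical).
rng = np.random.default_rng(0)
dx, fwhm = 0.2, 1.0                                   # arcsec
sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
centers = (np.arange(51) - 25) * dx
flux, bkg, xc_true = 20004.0, 300.0, 0.03
args = (centers, dx, sigma, flux, bkg)

lam_true = expected_counts(xc_true, *args)
dlam = d_expected_counts(xc_true, centers, dx, sigma, flux)
# Cramer-Rao bound for independent Poisson pixels: 1 / sum(lambda'^2 / lambda).
sigma_cr = 1.0 / math.sqrt(np.sum(dlam ** 2 / lam_true))

def ml_estimate(counts):
    """Position minimising the Poisson negative log-likelihood."""
    def nll(alpha):
        lam = expected_counts(alpha, *args)
        return float(np.sum(lam - counts * np.log(lam)))
    return golden_section_min(nll, -1.0, 1.0)

estimates = np.array([ml_estimate(rng.poisson(lam_true)) for _ in range(400)])
ratio = float(np.std(estimates, ddof=1) / sigma_cr)
print(f"sigma_CR = {1e3 * sigma_cr:.2f} mas, empirical std / sigma_CR = {ratio:.3f}")
```

With a few hundred realizations the empirical standard deviation is itself uncertain at the few-percent level, so a sketch like this can only show consistency with the CR bound, not the sub-percent gaps quantified by the indicator in Table 1.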

Fig. 5.

Range of the square root of the variance performance (in milliarcseconds, mas) for the ML method in astrometry as predicted by Theorems 1 and 3, Eq. (36). Results are reported for different representative values of F̃ and across different pixel sizes (top-left to bottom-right): F̃ ∈ {1080, 3224, 20 004, 60 160} e⁻.

Table 1.

Performance quality of the ML estimator relative to the Cramér–Rao bound expressed in terms of the indicator $100\times \frac{\sqrt{\sigma ^2_\mathrm{ML}(n)+\beta _\mathrm{ML} (n)}-\sigma _\mathrm{ML}(n)}{\sigma _\mathrm{ML}(n)}$ from the result in Theorem 3.

6.5. Comments on an adaptive WLS estimator for astrometry

In Sect. 5.1 we presented results that offer a nominal prediction for the variance of the WLS method through Eq. (30), which turns out to be very accurate in the regime from medium to high S/N, as shown in Sect. 6.3. Interestingly, Remark 2 tells us that these nominal values precisely match the CR limit for the optimal selection of weights given in Eq. (32), namely w_i ∼ 1/λ_i(x_c) for all i = 1, ..., n (compare Eqs. (31) and (37)). As this selection is infeasible, because it requires knowledge of x_c (see the expression in Eq. (2)), we can approximate it by a noisy version, using the fact that the expected value of the observation I_i measured at pixel i is λ_i(x_c) under our model in Eq. (4). Therefore I_i can be interpreted as a noisy version of λ_i(x_c) and $\begin{aligned} \hat{w}_i(I_i)=\frac{1}{I_i} \end{aligned}$ (41)

can be seen as a noisy version of the ideal weight $\frac{1}{\lambda _i(x_\mathrm{c})}$. Adopting this data-driven weighting approach, we obtain an adaptive WLS method, as the weights are not fixed but are instead a function of the data {I_i : i = 1, ..., n}. This selection of weights can be interpreted as an empirical version of the optimal weights that achieve the CR bound. The problem then reduces to solving $\begin{aligned} \tau _\mathrm{AWLS}(I^n) = \arg \min _{\alpha \in \mathbb R }J_\mathrm{AWLS}(\alpha ,I^n) , \end{aligned}$ (42)

where $\begin{aligned} J_\mathrm{AWLS}(\alpha ,I^n)=\sum _{i=1}^n \hat{w}_i(I_i)(I_i-\lambda _i(\alpha ))^2. \end{aligned}$ (43)
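The minimization in Eqs. (42) and (43) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the code behind Fig. 7: a 1D Gaussian PSF with FWHM = 1 arcsec, Δx = 0.2 arcsec, known flux, a hypothetical background of 300 e⁻ per pixel, and a simple golden-section search. The guard that clips counts below 1 before taking the reciprocal is an addition made here to avoid division by zero; it is not part of Eq. (41).

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def expected_counts(xc, centers, dx, sigma, flux, bkg):
    """lambda_i(xc): Gaussian PSF integrated over each pixel, plus background."""
    s = sigma * math.sqrt(2.0)
    g = 0.5 * (_erf((centers + dx / 2 - xc) / s) - _erf((centers - dx / 2 - xc) / s))
    return flux * g + bkg

def d_expected_counts(xc, centers, dx, sigma, flux):
    """Analytic derivative d lambda_i / d xc of the pixel-integrated profile."""
    def pdf(u):
        return np.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return flux * (pdf(centers - dx / 2 - xc) - pdf(centers + dx / 2 - xc))

def golden_section_min(f, lo, hi, tol=1e-7):
    """Minimise a scalar function assumed unimodal on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

def awls_estimate(counts, centers, dx, sigma, flux, bkg):
    """Adaptive WLS: weights 1/I_i taken from the data, as in Eq. (41)."""
    weights = 1.0 / np.maximum(counts, 1.0)   # clipping is an assumption made here
    def cost(alpha):
        resid = counts - expected_counts(alpha, centers, dx, sigma, flux, bkg)
        return float(np.sum(weights * resid ** 2))
    return golden_section_min(cost, -1.0, 1.0)

# Assumed observing setup (background and true position are hypothetical).
rng = np.random.default_rng(1)
dx, fwhm = 0.2, 1.0
sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
centers = (np.arange(51) - 25) * dx
flux, bkg, xc_true = 20004.0, 300.0, 0.03

lam_true = expected_counts(xc_true, centers, dx, sigma, flux, bkg)
dlam = d_expected_counts(xc_true, centers, dx, sigma, flux)
sigma_cr = 1.0 / math.sqrt(np.sum(dlam ** 2 / lam_true))

errors = np.array([
    awls_estimate(rng.poisson(lam_true), centers, dx, sigma, flux, bkg) - xc_true
    for _ in range(400)
])
rmse_ratio = float(math.sqrt(np.mean(errors ** 2)) / sigma_cr)
print(f"sqrt(MSE) / sigma_CR = {rmse_ratio:.3f}")
```

Using the mean squared error rather than the variance in the comparison accounts for any bias introduced by the noisy weights.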

Figure 7 presents the performance of this scheme for the same regimes we have been exploring in this work, supporting the conjecture that this selection of weights resembles the optimal weight selection and in fact achieves MSE performances that are surprisingly close to the CR bound in all the observational regimes.

Our results in this sub-section show that some commonly adopted weighting schemes, which use an idea analogous to Eq. (41), are a very good data-driven choice for methods that employ a WLS scheme. Such is the case, for example, of the well-known PSF stellar fitting program (including astrometry) Dominion Astrophysical Observatory Photometry (DAOPHOT), described in Stetson (1987, Eq. (10)).

Fig. 6.

Performance optimality of the ML estimator (computed as $100\times \frac{\sqrt{\sigma ^2_\mathrm{ML}(n)+\beta _\mathrm{ML} (n)}-\sigma _\mathrm{ML}(n)}{\sigma _\mathrm{ML}(n)}$) for different positions of the target object $x_\mathrm{c}\in \Theta = \{x_\mathrm{o}^{*}-\sigma , x_\mathrm{o}^{*}-0.8*\sigma ,x_\mathrm{o}^{*}-0.6*\sigma ,x_\mathrm{o}^{*}-0.4*\sigma , x_\mathrm{o}^{*}-0.2*\sigma , x_\mathrm{o}^{*} \}$ in the array, as a function of pixel resolution. The left panel shows the case F̃ = 20 004 e⁻, the right panel the case F̃ = 60 160 e⁻.

Fig. 7.

Performance comparison between the $\sqrt{\mathrm{MSE}}$ of the adaptive WLS estimator and σ_CR(n), both in mas. Results are reported for different F̃ and across different pixel sizes: (top-left to bottom-right) F̃ ∈ {1080, 3224, 60 160} e⁻.

6.6. Undersampled case

So far we have analyzed the performance of the ML and WLS estimators in regimes where the PSF (assumed to have a FWHM = 1 arcsec) is well sampled by the detector. In this section we analyze the undersampled case, where only a few pixels capture most of the flux of the source. This scenario would be typical of nearly diffraction-limited images taken, for example, from space, or with adaptive optics (AO) systems from the ground.

Table 2 shows the performance of the bounds presented in Theorems 2 and 3 in the undersampled regime, where the PSF is concentrated in only a few pixels (of the order of five to nine pixels), and where the flux is relatively large. For the space-based case, the background is reduced almost to the pure instrument noise (left numbers in the table), while in the ground-based case the background would be larger, but we would still have a very large Strehl ratio. From the table it can be seen that the performance is nearly optimal with the ML estimator, even in the severely undersampled regime (small FWHM values). On the other hand, the performance of the WLS deteriorates very quickly as the image size decreases, with the indicator reaching nearly 45% when the FWHM is on the order of the detector pixel size.
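As a rough illustration of what "undersampled" means in terms of pixel counts, the Python sketch below counts how many pixels are needed to enclose a given fraction of the source flux. It assumes a 1D Gaussian PSF centred on the middle pixel, which is a simplification made here; the FWHM values are hypothetical and are not the ones used in Table 2.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def pixels_enclosing(fraction, fwhm, dx, n_pix=101):
    """Smallest number of pixels whose summed PSF fraction reaches `fraction`.

    Assumes a Gaussian PSF centred on the middle pixel of a 1D array.
    """
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    centers = (np.arange(n_pix) - n_pix // 2) * dx
    s = sigma * math.sqrt(2.0)
    g = 0.5 * (_erf((centers + dx / 2) / s) - _erf((centers - dx / 2) / s))
    cumulative = np.cumsum(np.sort(g)[::-1])          # brightest pixels first
    return int(np.searchsorted(cumulative, fraction) + 1)

# FWHM expressed in units of the pixel size (dx = 1); values are illustrative.
counts_needed = {f: pixels_enclosing(0.99, f, dx=1.0) for f in (0.5, 1.0, 2.0, 5.0)}
for fwhm_in_pixels, n_needed in counts_needed.items():
    print(f"FWHM = {fwhm_in_pixels} pixels -> {n_needed} pixels hold 99% of the flux")
```

The count depends on where the source falls relative to the pixel grid, so a source centred on a pixel boundary would give slightly different numbers.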

We emphasize that the above results are computed in the ideal situation of a uniform pixel response function (and a perfect flat-field correction), whereas in reality a non-uniform pixel response has a critical impact on the Cramér–Rao bound, as demonstrated by Adorf (1996, Fig. 1). In addition, in the case of stare-mode space-based images acquired under very small background, charge transfer inefficiency effects due to time-varying charge traps in the detector would also blur (and systematically shift) the images, deteriorating the ultimate astrometric precision above the Cramér–Rao limit (for details see, e.g., Bristow et al. 2006, especially their Fig. 10).

7. Summary, conclusions, and outlook

In this paper we study the performance of the WLS and ML estimators for relative astrometry on digital detectors subject to Poisson noise, in comparison with the best possible attainable precision given by the CR bound. Our study includes analytical results and numerical simulations under realistic observational conditions to help us to corroborate our theoretical findings.

By generalizing the proposal by Fessler (1996) we are able to obtain, for the first time, closed-form expressions for the variance and the mean of implicit estimators (as is the particular case of the WLS and ML schemes), which can be computed directly from the data (see Theorem 1, in particular Eqs. (24) and (25), and Appendix A). When applying this result to astrometry with digital detectors, we are able to bound both the bias and the variance of the relative position of a celestial source on a CCD array as a function of all the relevant parameters of the problem (see Eqs. (29) and (30) or Eqs. (35) and (36) for the WLS and ML estimators, respectively). We verified that the bias of the WLS and ML methods is negligible in all the observational regimes explored in this paper (see Fig. 1).

A careful analysis of our predictions confirms earlier results by Lobos et al. (2015; for the LS method) in that the WLS method is, in general, sub-optimal (in comparison with the MVB given by the CR result), especially at high and very high S/N (see the two bottom panels of Fig. 2). However, a judicious data-driven selection of weights (which we call the “adaptive” WLS method, Sect. 6.5) improves the performance of the WLS substantially (see Fig. 7). This is an interesting result, given the widespread use and simple numerical implementation of the WLS method.

The ML method is found to have both a smaller bias than the WLS method (compare left and right panels of Fig. 1), although the bias on both methods is already quite small, and a tight correspondence to the MVB throughout the entire range of S/N regimes explored in this paper (Fig. 5). Therefore, the ML estimator for astrometry is consistently optimal, and should be the estimator of choice for high-precision applications. This paper, along with Lobos et al. (2015), completes an in-depth study of the performance of commonly used estimators in astrometry using PIDs, and sets the stage for the development of codes that could efficiently implement astrometric ML estimators on 2D detectors, incorporating also the simultaneous measurement of fluxes, as explored by Gai et al. (2017).

Table 2.

Performance quality of the ML and WLS estimators relative to the nominal bound expressed in terms of the indicator $100\times \frac{\sqrt{\sigma ^2_\mathrm{ML}(n)+\beta _\mathrm{ML} (n)}-\sigma _\mathrm{ML}(n)}{\sigma _\mathrm{ML}(n)}$ and $100\times \frac{\sqrt{\sigma ^2_\mathrm{WLS}(n)+\beta _\mathrm{WLS} (n)}-\sigma _\mathrm{WLS}(n)}{\sigma _\mathrm{WLS}(n)}$, respectively, in the undersampled regime (space-based and ground-based are on the left and right, respectively).

¹

See, e.g., Reffert (2009), Fig. 1a in Høg (2017) for an overview spanning more than 2000 years of astrometry, and Benedict et al. (2016) for a summary of the contributions from the Hubble Space Telescope (HST) (fine guide sensors), and, of course, the exquisite prospects from Prusti et al. (2016), with applications ranging from fundamental astrophysics (Van Altena, 2013; Cacciari et al. 2016), to cosmology (Lattanzi, 2012).

²

As the sample size increases, the sampling distribution of the estimator becomes increasingly concentrated at the true parameter value.

³

More precisely, the distribution around the true parameter approaches a normal distribution as the sample size grows.

⁴

The analysis can be extended to the two-dimensional case as presented in Mendez et al. (2013).

⁵

This captures the angular position in the sky and it is measured in seconds of arc (arcsec thereafter), through the “plate-scale”, which is an optical design feature of the instrument plus telescope configuration.

⁶

The joint estimation of photometry and astrometry is the task of estimating both $(x_\mathrm{c},\tilde{F})$ from the observations (see Mendez et al. 2014).

⁷

In the sense that, for all $\bar{\theta }\in \Theta$, $\mathbb{E}_{I^n \sim f^n_{\bar{\theta }}} \left\{ \tau _n(I^n)\right\} =\bar{\theta }$.

⁸

This is the solution of $\tau _\mathrm{LS}(I^n)= \arg \min _{\alpha \in \mathbb{R}} \sum _{i=1}^{n} \left( I_i -\lambda _i(\alpha ) \right)^2$, with $\lambda _i(\alpha ) = \tilde{F} g_i(\alpha ) + \tilde{B}_i$, where α is a generic variable representing the astrometric position, g_i(·) is given by Eq. (3), and arg min represents the argument that minimizes the expression. More details are presented in Lobos et al. (2015).

⁹

In particular, for the very high S/N regime and assuming $\Delta x/\sigma \ll 1$, Lobos et al. (2015, Proposition 3) show that this gap reaches the condition $\frac{\sigma ^2_\mathrm{LS}(n)}{\sigma _\mathrm{CR}^2(n)} \approx \frac{8}{3\sqrt{3}}$.

¹⁰

It follows that $\lim _{I\to \bar{I}} \frac{e(\bar{I},I-\bar{I})}{\Vert I-\bar{I}\Vert _2}=0$.

¹¹

It is important to remember that the CR bound is a function of the value of the parameter to be estimated, in this case the position x_c (see the Fisher information in Eq. (10)).

¹²

Considering that $\bar{I}_i=E\{I_i\}=\lambda _i(x_\mathrm{c})$.

¹³

The derivation of this identity is presented in Appendix B.2.

¹⁴

Considering that $\bar{I}_i=E\{I_i\}=\lambda _i(x_\mathrm{c})$.

¹⁵

The derivation of this result is presented in Appendix C.2.

Acknowledgments

SE acknowledges support from CONICYT-PCHA/MagisterNacional/2016 – 22160840. The authors want to thank the support of the Advanced Center for Electrical and Electronic Engineering, AC3E, Basal Project FB0008, PIA ACT1405, and the Chilean Centro de Excelencia en Astrofisica y Tecnologias Afines (CATA) BASAL PFB/06 from CONICYT. The authors also acknowledge support by CONICYT/FONDECYT grants # 1151213, 1170044, and 1170854. RAM also acknowledges support from the Project IC120009 Millennium Institute of Astrophysics (MAS) of the Iniciativa Cientifica Milenio del Ministerio de Economia, Fomento y Turismo de Chile. Finally, we thank the anonymous referee for his or her careful reading of our manuscript, and for pointing out the exploration of the relevant space-based undersampled regime that led to the discussion in Sect. 6.6.

References

Adorf, H.-M. 1996, in Astronomical Data Analysis Software and Systems V, 101, 13
Alard, C., & Lupton, R. H. 1998, ApJ, 503, 325
Altmann, M., Bouquillon, S., & Taris, F. 2014, Proc. SPIE, 9149, 91490P
Auer, L., & Van Altena, W. 1978, AJ, 83, 531
Bastian, U. 2004, GAIA Technical Note, 2004 BASNOCODE
Bendinelli, O., Parmeggiani, G., Piccioni, A., & Zavatti, F. 1987, AJ, 94, 1095
Benedict, G. F., McArthur, B. E., Nelan, E. P., & Harrison, T. E. 2016, PASP, 129, 012001
Bouquillon, S., Mendez, R., Altmann, M., et al. 2017, A&A, 606, A27
Bradley, R. A., & Gart, J. J. 1962, Biometrika, 49, 205
Bristow, P., Kerber, F., & Rosa, M. 2006, in The 2005 HST Calibration Workshop: Hubble After the Transition to Two-Gyro Mode, 299
Cacciari, C., Pancino, E., & Bellazzini, M. 2016, Astron. Nachr., 337, 899
Chromey, F. R. 2016, To Measure the Sky: An Introduction to Observational Astronomy (Cambridge: Cambridge University Press)
Chun-Lin, L. 1993, IAU Symp., 156, 113
Cover, T. M., & Thomas, J. A. 2012, Elements of Information Theory (New York: Wiley)
Cramér, H. 1946, Scand. Actuar. J., 1946, 85
Echeverria, A., Silva, J. F., Mendez, R. A., & Orchard, M. 2016, A&A, 594, A111
Fessler, J. A. 1996, IEEE Trans. Image Process., 5, 493
Gai, M., Busonero, D., & Cancelliere, R. 2017, PASP, 129, 054502
Gray, R. M., & Davisson, L. D. 2004, An Introduction to Statistical Signal Processing (Cambridge: Cambridge University Press)
Hoadley, B. 1971, Ann. Math. Stat., 1977
Høg, E. 2017, ArXiv e-prints [arXiv:1707.01020]
Howell, S. B. 2006, Handbook of CCD Astronomy (Cambridge: Cambridge University Press), 5
Jakobsen, P., Greenfield, P., & Jedrzejewski, R. 1992, A&A, 253, 329
Janesick, J. R. 2001, Scientific Charge-Coupled Devices (Bellingham: SPIE Press), 83
Janesick, J. R. 2007, Photon Transfer (San Jose: SPIE Press)
Kay, S. M. 1993, Fundamentals of Statistical Signal Processing. Vol 1, Estimation Theory (Englewood Cliffs: Prentice-Hall)
Kendall, M., Stuart, A., Ord, J., & Arnold, S. 1999, Kendall’s Advanced Theory of Statistics. Vol. 2A (London: Hodder Arnold Publication)
King, I. R. 1971, PASP, 83, 199
King, I. R. 1983, PASP, 95, 163
Lattanzi, M. 2012, Mem. Soc. Astron. It., 83, 1033
Lee, J.-F., & Van Altena, W. 1983, AJ, 88, 1683
Lemon, C. A., Auger, M. W., McMahon, R. G., & Koposov, S. E. 2017, MNRAS, 472, 5023
Lindegren, L. 1978, IAU Colloq., 48, 197
Lindegren, L. 2008, Gaia DPAC Public Document GAIA-C3-TN-LU-LL-078
Lindegren, L. 2010, ISSI Scientific Reports Series, 9, 279
Lobos, R. A., Silva, J. F., Mendez, R. A., & Orchard, M. 2015, PASP, 127, 1166
McLean, I. S. 2008, Electronic Imaging in Astronomy: Detectors and Instrumentation (New York: Springer Science & Business Media)
Méndez, R. A., Costa, E., Pedreros, M. H., et al. 2010, PASP, 122, 853
Mendez, R. A., Silva, J. F., & Lobos, R. 2013, PASP, 125, 580
Mendez, R. A., Silva, J. F., Orostica, R., & Lobos, R. 2014, PASP, 126, 798
Michalik, D., & Lindegren, L. 2016, A&A, 586, A26
Michalik, D., Lindegren, L., Hobbs, D., & Butkevich, A. G. 2015, A&A, 583, A68
Prusti, T., De Bruijne, J., Brown, A. G., et al. 2016, A&A, 595, A1
Rao, C. R. 1945, Bull. Calcutta Math. Soc., 37, 81
Reffert, S. 2009, New Astron. Rev., 53, 329
So, H. C., Chan, Y. T., Ho, K., & Chen, Y. 2013, IEEE Signal Process. Mag., 30, 162
Stetson, P. B. 1987, PASP, 99, 191
Stone, R. C. 1989, AJ, 97, 1227
Vakili, M., & Hogg, D. W. 2016, ArXiv e-prints [arXiv:1610.05873]
Van Altena, W. F. 2013, Astrometry for Astrophysics: Methods, Models, and Applications (Cambridge: Cambridge University Press)
Van Altena, W. F., & Auer, L. 1975, in Image Processing Techniques in Astronomy (Berlin: Springer), 411
Van Trees, H. L. 2004, Detection, Estimation, and Modulation Theory, Part I: Detection, Estimation, and Linear Modulation Theory (New York: Wiley)
Zaccheo, T., Gonsalves, R., Ebstein, S., & Nisenson, P. 1995, ApJ, 439, L43
Zhang, J., Hao, Y. C., Wang, L., & Long, Y. 2016, PASP, 128, 035003

Appendix A

Proof of Theorem 1

We begin by presenting the expressions for $(\epsilon _J(n), \beta _J(n), \sigma ^2_J(n))$ to complete the statement of the result. $\begin{aligned} \epsilon _J(n) = \max _{t \in [0,1]}\left|\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ \frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2}{\partial I_i\partial I_j}\tau _J(\bar{I}^n-t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j) \right\} \right|, \end{aligned}$ (A.1) $\begin{aligned} \beta _J(n) =\epsilon ^{\prime }_J(n) + 2\delta ^{\prime }_J(n), \end{aligned}$ (A.2)

where $\begin{aligned} \epsilon ^{\prime }_J(n) = \max _{t \in [0,1]}\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ \left( \frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2}{\partial I_i\partial I_j}\tau _J(\bar{I}^n-t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j) \right)^2 \right\} , \end{aligned}$ (A.3) $\begin{aligned} \delta ^{\prime }_J(n) = \max _{t \in [0,1]}\left|\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ (\nabla \tau _J(\bar{I}^n)\cdot (I^n-\bar{I}^n))\cdot \frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2}{\partial I_i\partial I_j}\tau _J(\bar{I}^n-t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j) \right\} \right|, \end{aligned}$ (A.4)

and, finally, $\begin{aligned} \sigma ^2_J(n) = \left[\frac{\partial ^2J(\tau _J(\bar{I}^n),\bar{I}^n)}{\partial \alpha ^2}\right]^{-1}\left[\frac{\partial ^2J(\tau _J(\bar{I}^n),\bar{I}^n)}{\partial \alpha \partial I_i}\right]\mathrm{Cov}\{I^n\}\left[\frac{\partial ^2J(\tau _J(\bar{I}^n),\bar{I}^n)}{\partial \alpha \partial I_i}\right]^T\left[\frac{\partial ^2J(\tau _J(\bar{I}^n),\bar{I}^n)}{\partial \alpha ^2}\right]^{-1}\cdot \end{aligned}$ (A.5)

Proof of Theorem 1: using the chain rule in the cost function J(α, Iⁿ) and taking the partial derivative $\frac{\partial }{\partial I_i}$ of both sides in Eq. (15), we have that $\begin{aligned} 0=\frac{\partial ^2}{\partial \alpha ^2}J(\tau (I^n),I^n)\frac{\partial }{\partial I_i}\tau (I^n)+\frac{\partial ^2}{\partial \alpha \partial I_i}J(\tau (I^n),I^n),~ i=1,\ldots ,n. \end{aligned}$ (A.6)

Thus, we have n equations, one for each unknown $\frac{\partial }{\partial I_i}\tau (I^n)$, and they hold for any Iⁿ. Defining the operators $\begin{aligned} \nabla ^{20}(\cdot )=\frac{\partial ^2}{\partial \alpha ^2}, \quad \nabla ^{11}(\cdot )=\frac{\partial ^2}{\partial \alpha \partial I_i}, \end{aligned}$ (A.7)

of dimensions 1 × 1 and 1 × n, respectively, we can express Eq. (A.6) in matrix form as $\begin{aligned} 0=\nabla ^{20}J(\tau (I^n),I^n)\nabla \tau (I^n)+\nabla ^{11}J(\tau (I^n),I^n). \end{aligned}$ (A.8)

Assuming that the matrix ∇²⁰ J(τ(Iⁿ), Iⁿ) is nonsingular, we can calculate ∇τ(Iⁿ) from Eq. (A.8), $\begin{aligned} \nabla \tau (I^n)=-[\nabla ^{20}J(\tau (I^n),I^n)]^{-1}\nabla ^{11}J(\tau (I^n),I^n). \end{aligned}$ (A.9)
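Equation (A.9) can be checked numerically for a concrete cost function. The Python sketch below does this for the Poisson negative log-likelihood, J(α, Iⁿ) = Σ_i [λ_i(α) − I_i ln λ_i(α)], comparing the gradient from Eq. (A.9), evaluated at the noise-free data Īⁿ, with central finite differences of the estimator itself. The Gaussian PSF, the background of 300 e⁻ per pixel, the bisection solver, and all function names are assumptions made here for illustration.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def expected_counts(xc, centers, dx, sigma, flux, bkg):
    """lambda_i(xc): Gaussian PSF integrated over each pixel, plus background."""
    s = sigma * math.sqrt(2.0)
    g = 0.5 * (_erf((centers + dx / 2 - xc) / s) - _erf((centers - dx / 2 - xc) / s))
    return flux * g + bkg

def d_expected_counts(xc, centers, dx, sigma, flux):
    """Analytic derivative d lambda_i / d xc of the pixel-integrated profile."""
    def pdf(u):
        return np.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return flux * (pdf(centers - dx / 2 - xc) - pdf(centers + dx / 2 - xc))

def ml_estimate(counts, centers, dx, sigma, flux, bkg, lo=-0.5, hi=0.5):
    """Root of dJ/d(alpha) by bisection; J is the Poisson negative log-likelihood."""
    def score(alpha):
        lam = expected_counts(alpha, centers, dx, sigma, flux, bkg)
        dlam = d_expected_counts(alpha, centers, dx, sigma, flux)
        return float(np.sum(dlam * (1.0 - counts / lam)))
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:      # still to the left of the minimum of J
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed setup (background and source position are hypothetical).
dx, fwhm = 0.2, 1.0
sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
centers = (np.arange(51) - 25) * dx
flux, bkg, xc = 20004.0, 300.0, 0.03
args = (centers, dx, sigma, flux, bkg)

lam_bar = expected_counts(xc, *args)                  # noise-free data I_bar
dlam = d_expected_counts(xc, centers, dx, sigma, flux)

# Eq. (A.9) at I_bar: the lambda'' term of d2J/dalpha2 cancels since I_bar = lambda.
hess = np.sum(dlam ** 2 / lam_bar)                    # nabla^20 J
cross = -dlam / lam_bar                               # nabla^11 J
analytic = -cross / hess

# Central finite differences of the estimator with respect to selected pixels.
pixels = [20, 23, 25, 27, 30]
h = 1.0
numeric = []
for k in pixels:
    bump = np.zeros_like(lam_bar)
    bump[k] = h
    numeric.append((ml_estimate(lam_bar + bump, *args)
                    - ml_estimate(lam_bar - bump, *args)) / (2.0 * h))
numeric = np.array(numeric)
print(np.column_stack([analytic[pixels], numeric]))
```

The two columns printed at the end should agree closely if the implicit differentiation is implemented consistently with the cost function.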

Finally, using Eq. (A.9), evaluating at Īⁿ, and then replacing in Eq. (21), we have that $\begin{aligned} \sigma ^2_J(n)&= -[\nabla ^{20}J(\tau (\bar{I}^n),\bar{I}^n)]^{-1}\nabla ^{11}J(\tau (\bar{I}^n),\bar{I}^n)\,\mathrm{Cov}\{I^n\}\left(-[\nabla ^{20}J(\tau (\bar{I}^n),\bar{I}^n)]^{-1}\nabla ^{11}J(\tau (\bar{I}^n),\bar{I}^n)\right)^T \\&= [\nabla ^{20}J(\tau (\bar{I}^n),\bar{I}^n)]^{-1}\nabla ^{11}J(\tau (\bar{I}^n),\bar{I}^n)\,\mathrm{Cov}\{I^n\}[\nabla ^{11}J(\tau (\bar{I}^n),\bar{I}^n)]^T[\nabla ^{20}J(\tau (\bar{I}^n),\bar{I}^n)]^{-1} \\&= \left[\frac{\partial ^2J(\tau (\bar{I}^n),\bar{I}^n)}{\partial \alpha ^2}\right]^{-1}\left[\frac{\partial ^2J(\tau (\bar{I}^n),\bar{I}^n)}{\partial \alpha \partial I_i}\right]\mathrm{Cov}\{I^n\}\left[\frac{\partial ^2J(\tau (\bar{I}^n),\bar{I}^n)}{\partial \alpha \partial I_i}\right]^T\left[\frac{\partial ^2J(\tau (\bar{I}^n),\bar{I}^n)}{\partial \alpha ^2}\right]^{-1}\cdot \end{aligned}$ (A.10)
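The nominal variance in Eq. (A.10) is straightforward to evaluate once the two second derivatives of the cost are known. The Python sketch below does so for three costs, assuming independent Poisson pixels so that Cov{Iⁿ} = diag(λ_i(x_c)): the Poisson negative log-likelihood, a WLS cost with the ideal weights 1/λ_i(x_c), and a plain LS cost with unit weights. The derivative expressions are worked out here directly from each cost rather than copied from Eqs. (30) or (37), and the Gaussian PSF, the very high flux, the small background, and the fine pixel size are assumptions chosen to approach the regime of footnote 9.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def expected_counts(xc, centers, dx, sigma, flux, bkg):
    """lambda_i(xc): Gaussian PSF integrated over each pixel, plus background."""
    s = sigma * math.sqrt(2.0)
    g = 0.5 * (_erf((centers + dx / 2 - xc) / s) - _erf((centers - dx / 2 - xc) / s))
    return flux * g + bkg

def d_expected_counts(xc, centers, dx, sigma, flux):
    """Analytic derivative d lambda_i / d xc of the pixel-integrated profile."""
    def pdf(u):
        return np.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    return flux * (pdf(centers - dx / 2 - xc) - pdf(centers + dx / 2 - xc))

def sandwich_variance(hess, cross, cov):
    """Eq. (A.10) for a scalar parameter: hess^-1 (cross Cov cross^T) hess^-1."""
    return float(cross @ cov @ cross) / float(hess) ** 2

# Assumed very high S/N, finely sampled setup (all values hypothetical).
dx, fwhm = 0.05, 1.0
sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
centers = (np.arange(201) - 100) * dx
flux, bkg, xc = 1.0e6, 1.0e-3, 0.01

lam = expected_counts(xc, centers, dx, sigma, flux, bkg)
dlam = d_expected_counts(xc, centers, dx, sigma, flux)
cov = np.diag(lam)                                    # independent Poisson pixels
var_cr = 1.0 / float(np.sum(dlam ** 2 / lam))         # Cramer-Rao bound

# ML cost J = sum(lambda_i - I_i ln lambda_i), derivatives at (x_c, I_bar = lambda).
var_ml = sandwich_variance(np.sum(dlam ** 2 / lam), -dlam / lam, cov)

def wls_variance(w):
    """Fixed-weight cost J = sum w_i (I_i - lambda_i)^2, derivatives at (x_c, I_bar)."""
    return sandwich_variance(2.0 * np.sum(w * dlam ** 2), -2.0 * w * dlam, cov)

var_wls_ideal = wls_variance(1.0 / lam)
var_ls = wls_variance(np.ones_like(lam))

print(f"ML / CR        = {var_ml / var_cr:.6f}")
print(f"ideal WLS / CR = {var_wls_ideal / var_cr:.6f}")
print(f"LS / CR        = {var_ls / var_cr:.4f}  (8 / (3 sqrt 3) = {8 / (3 * math.sqrt(3)):.4f})")
```

Under these assumptions the ML and ideal-weight WLS nominal variances coincide with the CR bound, while the unit-weight LS ratio can be compared with the limiting value 8/(3√3) quoted from Lobos et al. (2015, Proposition 3).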

Moving on to the residual term γ_J(n) in Eq. (21), captured by β_J(n), we must consider the variance of the error function $\mathrm{Var}\{e(\bar{I}^n,I^n-\bar{I}^n)\}$ and the covariance $\mathrm{Cov}\{\nabla \tau _J(\bar{I}^n) (I^n-\bar{I}^n),e(\bar{I}^n,I^n-\bar{I}^n)\}$. For the first, we have that $\begin{aligned} \mathrm{Var}\{ e(\bar{I}^n,I^n-\bar{I}^n)\}&= \mathrm{Var}\left\{ \frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2\tau }{\partial I_i \partial I_j} (\bar{I}^n+t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j)\right\} \\& \le \mathbb{E}\left\{ \left(\frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2\tau }{\partial I_i \partial I_j} (\bar{I}^n+t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j) \right)^2\right\} \\&\le \underbrace{\max _{t \in [0,1]}\mathbb{E}\left\{ \left(\frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2\tau }{\partial I_i \partial I_j} (\bar{I}^n+t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j) \right)^2\right\} }_{ = \epsilon ^{\prime }_J(n)}\cdot \end{aligned}$ (A.11)

On the other hand, for the covariance, using the main assumption in Eq. (22), it is clear that $\begin{aligned} \mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ \nabla \tau _\mathrm{ML}(\bar{I}^n)\cdot (I^n-\bar{I}^n) \right\} = \mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ a\sum _{i=1}^n b_i (I_i-\bar{I}_i) \right\} = 0. \end{aligned}$ (A.12)

From this, $\begin{aligned}&|\mathrm{Cov}\{\nabla \tau (\bar{I}^n) (I^n-\bar{I}^n),e(\bar{I}^n,I^n-\bar{I}^n)\}| \\&= \left|\mathbb{E}\left\{ \nabla \tau (\bar{I}^n) (I^n-\bar{I}^n) \left( e(\bar{I}^n,I^n-\bar{I}^n)-\mathbb{E}\left( e(\bar{I}^n,I^n-\bar{I}^n)\right)\right) \right\} \right| \\&= \left|\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ (\nabla \tau (\bar{I}^n) \cdot (I^n-\bar{I}^n))\cdot \frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2}{\partial I_i\partial I_j}\tau (\bar{I}^n-t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j) \right\} \right| \\&\le \underbrace{ \max _{t \in [0,1]} \left|\mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ (\nabla \tau (\bar{I}^n) \cdot (I^n-\bar{I}^n))\cdot \frac{1}{2}\sum _{i=1}^n\sum _{j=1}^n\frac{\partial ^2}{\partial I_i\partial I_j}\tau (\bar{I}^n-t(I^n-\bar{I}^n))(I_i-\bar{I}_i)(I_j-\bar{I}_j) \right\} \right|}_{= \delta ^{\prime }_J(n)}\cdot \end{aligned}$ (A.13)

Finally, replacing Eqs. (A.11) and (A.13) in the definition of γ_J(n), we have that $\begin{aligned} |\gamma_J(n)| &\le \mathrm{Var}\{e(\bar{I}^n, I^n - \bar{I}^n)\} + 2 \left| \mathrm{Cov}\{\nabla \tau(\bar{I}^n) \cdot (I^n - \bar{I}^n),\, e(\bar{I}^n, I^n - \bar{I}^n)\} \right| \\ &\le \epsilon'_J(n) + 2\delta'_J(n) = \beta_J(n). \end{aligned}$ (A.14)

For the bias expression of the result in Eq. (24), using the hypothesis in Eq. (23), we can take the expectation on both sides of Eq. (18) to obtain $\begin{aligned} \left| \mathbb{E}_{I^n \sim f_{x_\mathrm{c}}}\{\tau(I^n)\} - x_\mathrm{c} \right| &= \left| \mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ a \sum_{i=1}^n b_i (I_i - \bar{I}_i) + e(\bar{I}^n, I^n - \bar{I}^n) \right\} \right| \\ &= \left| \mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ e(\bar{I}^n, I^n - \bar{I}^n) \right\} \right| \\ &= \left| \mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{\partial^2}{\partial I_i \partial I_j} \tau(\bar{I}^n + t(I^n - \bar{I}^n)) (I_i - \bar{I}_i)(I_j - \bar{I}_j) \right\} \right| \\ &\le \underbrace{\max_{t \in [0,1]} \left| \mathbb{E}_{I^n \sim f_{x_\mathrm{c}}} \left\{ \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{\partial^2}{\partial I_i \partial I_j} \tau(\bar{I}^n + t(I^n - \bar{I}^n)) (I_i - \bar{I}_i)(I_j - \bar{I}_j) \right\} \right|}_{= \epsilon_J(n)}. \end{aligned}$ (A.15)

Appendix B

Proof of Theorem 2

Proof: the proof and, in particular, the derivation of $\sigma^2_\mathrm{WLS}(n)$, β_WLS(n) and ϵ_WLS(n) reduce to a straightforward application of Theorem 1. For that, we first need to validate the assumptions of Theorem 1. If we begin with Eq. (A.9), $\begin{aligned} \nabla \tau(I^n) = -[\nabla^{20} J(\tau_J(I^n), I^n)]^{-1} \nabla^{11} J(\tau_J(I^n), I^n), \end{aligned}$ (B.1)

and then we calculate the gradient terms on the right-hand side (RHS) of Eq. (B.1) for our WLS context, it follows that $\begin{aligned} \nabla^{20} J_\mathrm{WLS}(\alpha, I^n) &= \frac{\partial^2}{\partial \alpha^2} J_\mathrm{WLS}(\alpha, I^n) \\ &= 2 \sum_{i=1}^n w_i \left( \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 + (\lambda_i(\alpha) - I_i) \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \right), \end{aligned}$ (B.2) $\begin{aligned} \nabla^{11} J_\mathrm{WLS}(\alpha, I^n) &= \left( \frac{\partial^2}{\partial \alpha \partial I_1} J_\mathrm{WLS}(\alpha, I^n), \ldots, \frac{\partial^2}{\partial \alpha \partial I_n} J_\mathrm{WLS}(\alpha, I^n) \right)^T \end{aligned}$ (B.3) $\begin{aligned} &= -2 \left( w_1 \frac{\partial \lambda_1(\alpha)}{\partial \alpha}, \ldots, w_n \frac{\partial \lambda_n(\alpha)}{\partial \alpha} \right)^T. \end{aligned}$ (B.4)

Following Eq. (B.1), we need to evaluate Eqs. (B.3) and (B.4) at α = τ_WLS(I̅ⁿ). For that, we have the following¹²: $\begin{aligned} \tau_\mathrm{WLS}(\bar{I}^n) = \arg\min_{\alpha \in \mathbb{R}} \sum_{i=1}^n w_i (\lambda_i(x_\mathrm{c}) - \lambda_i(\alpha))^2. \end{aligned}$ (B.5)

Then we will use the following result.

Proposition 2

Under the assumption of a Gaussian PSF, τ_WLS(I̅ⁿ) = x_c.

We note that this proposition is the second assumption used in Theorem 1. Using this proposition, we obtain that $\begin{aligned} \nabla^{20} J_\mathrm{WLS}(\tau_\mathrm{WLS}(\bar{I}^n), \bar{I}^n) &= 2 \sum_{i=1}^n w_i \left( \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}} + (\lambda_i(x_\mathrm{c}) - \lambda_i(x_\mathrm{c})) \left. \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \right|_{\alpha = x_\mathrm{c}} \right) \\ &= \left. 2 \sum_{i=1}^n w_i \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}}, \end{aligned}$ (B.6) $\begin{aligned} \nabla^{11} J_\mathrm{WLS}(\tau_\mathrm{WLS}(\bar{I}^n), \bar{I}^n) &= -2 \left. \left( w_1 \frac{\partial \lambda_1(\alpha)}{\partial \alpha}, \ldots, w_n \frac{\partial \lambda_n(\alpha)}{\partial \alpha} \right)^T \right|_{\alpha = \tau_\mathrm{WLS}(\bar{I}^n)} \\ &= -2 \left. \left( w_1 \frac{\partial \lambda_1(\alpha)}{\partial \alpha}, \ldots, w_n \frac{\partial \lambda_n(\alpha)}{\partial \alpha} \right)^T \right|_{\alpha = x_\mathrm{c}}. \end{aligned}$ (B.7)

Finally, applying Eqs. (B.6) and (B.7) in Eq. (A.9), we have that $\begin{aligned} \nabla \tau_\mathrm{WLS}(\bar{I}^n) \cdot (I^n - \bar{I}^n) &= -[\nabla^{20} J(\tau_\mathrm{WLS}(\bar{I}^n), \bar{I}^n)]^{-1} [\nabla^{11} J(\tau_\mathrm{WLS}(\bar{I}^n), \bar{I}^n)] (I^n - \bar{I}^n) \\ &= \underbrace{\left[ \sum_{i=1}^n w_i \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}} \right]^{-1}}_{a} \cdot \sum_{j=1}^n \underbrace{w_j \left. \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right|_{\alpha = x_\mathrm{c}}}_{b_j} (I_j - \mathbb{E}(I_j)), \end{aligned}$ (B.8)

which offers the decomposition needed for the application of Theorem 1 (Eq. (22)). For the value of $\sigma^2_\mathrm{WLS}(n)$ in Eq. (A.5), since the observations are independent and follow a Poisson distribution, we have that $\begin{aligned} \mathrm{Cov}\{I_i, I_j\} = \begin{cases} \mathrm{Var}\{I_i\} = \lambda_i(x_\mathrm{c}), & \text{if } i = j, \\ 0, & \text{otherwise}. \end{cases} \end{aligned}$ (B.9)
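The linear decomposition in Eq. (B.8) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a one-dimensional Gaussian PSF integrated over unit-width pixels plus a flat background, uniform weights, and a simple bisection solver for the WLS estimate. The function names (`lam`, `dlam`, `tau_wls`, `finite_difference`) and all parameter values are hypothetical choices made for illustration.

```python
import math

def lam(alpha, edges, flux=2000.0, sigma=1.0, background=10.0):
    """Expected counts lambda_i(alpha): Gaussian PSF integrated over each pixel
    plus a flat background (an assumed model, chosen for illustration)."""
    s = sigma * math.sqrt(2.0)
    return [0.5 * flux * (math.erf((hi - alpha) / s) - math.erf((lo - alpha) / s)) + background
            for lo, hi in zip(edges[:-1], edges[1:])]

def dlam(alpha, edges, flux=2000.0, sigma=1.0):
    """Analytic derivative d lambda_i / d alpha of the model above."""
    norm = flux / (sigma * math.sqrt(2.0 * math.pi))
    def g(x):
        return norm * math.exp(-0.5 * (x / sigma) ** 2)
    return [g(lo - alpha) - g(hi - alpha) for lo, hi in zip(edges[:-1], edges[1:])]

def tau_wls(image, weights, edges, lo, hi, iters=100):
    """WLS estimate as the root of dJ_WLS/d(alpha) on [lo, hi], by bisection.
    Assumes the derivative changes sign exactly once inside the bracket."""
    def dJ(alpha):
        return sum(w * (l - i) * d for w, l, i, d in
                   zip(weights, lam(alpha, edges), image, dlam(alpha, edges)))
    f_lo = dJ(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = dJ(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

edges = [k - 6.0 for k in range(13)]   # 12 unit-width pixels (assumed layout)
x_c = 0.2                              # true position (assumed)
mean_image = lam(x_c, edges)           # noiseless image, I-bar
weights = [1.0] * len(mean_image)      # uniform weights (one possible choice)

# Coefficients a and b_j of Eq. (B.8), evaluated at alpha = x_c.
d = dlam(x_c, edges)
a = 1.0 / sum(w * di * di for w, di in zip(weights, d))
b = [w * di for w, di in zip(weights, d)]

def finite_difference(j, delta=1e-3):
    """Central finite difference of tau_WLS with respect to pixel j, around I-bar."""
    plus, minus = list(mean_image), list(mean_image)
    plus[j] += delta
    minus[j] -= delta
    return (tau_wls(plus, weights, edges, x_c - 0.5, x_c + 0.5)
            - tau_wls(minus, weights, edges, x_c - 0.5, x_c + 0.5)) / (2.0 * delta)

for j in (4, 5, 6):
    print(j, finite_difference(j), a * b[j])
```

For each pixel, the finite-difference sensitivity of the estimate should agree with the product a·b_j, which is what the decomposition asserts.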

Then if we replace Eqs. (B.6), (B.7), and (B.9) in Eq. (A.5), we have that $\begin{aligned} \sigma^2_\mathrm{WLS}(n) &= \frac{\sum_{i=1}^n w_i^2 \lambda_i(x_\mathrm{c}) \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}}}{\left( \sum_{i=1}^n w_i \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}} \right)^2}. \end{aligned}$ (B.10)
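Eq. (B.10) is straightforward to evaluate once a PSF model is fixed. The sketch below is an illustration under stated assumptions: a Gaussian PSF integrated over unit-width pixels with a flat background, and hypothetical helper names (`lam`, `dlam`, `sigma2_wls`) and parameter values that do not come from the text.

```python
import math

def lam(alpha, edges, flux=1000.0, sigma=1.0, background=5.0):
    """Expected counts lambda_i(alpha) for an assumed pixel-integrated Gaussian PSF."""
    s = sigma * math.sqrt(2.0)
    return [0.5 * flux * (math.erf((hi - alpha) / s) - math.erf((lo - alpha) / s)) + background
            for lo, hi in zip(edges[:-1], edges[1:])]

def dlam(alpha, edges, flux=1000.0, sigma=1.0):
    """Analytic derivative d lambda_i / d alpha of the model above."""
    norm = flux / (sigma * math.sqrt(2.0 * math.pi))
    def g(x):
        return norm * math.exp(-0.5 * (x / sigma) ** 2)
    return [g(lo - alpha) - g(hi - alpha) for lo, hi in zip(edges[:-1], edges[1:])]

def sigma2_wls(weights, lam_c, dlam_c):
    """Right-hand side of Eq. (B.10), with lambda_i and its derivative taken at x_c."""
    num = sum(w * w * l * d * d for w, l, d in zip(weights, lam_c, dlam_c))
    den = sum(w * d * d for w, d in zip(weights, dlam_c)) ** 2
    return num / den

edges = [k - 5.0 for k in range(11)]   # 10 unit-width pixels (assumed layout)
x_c = 0.3                              # true position (assumed)
lam_c, dlam_c = lam(x_c, edges), dlam(x_c, edges)

uniform = sigma2_wls([1.0] * len(lam_c), lam_c, dlam_c)
inverse_variance = sigma2_wls([1.0 / l for l in lam_c], lam_c, dlam_c)
print(uniform, inverse_variance)
```

Substituting w_i = 1/λ_i(x_c) into Eq. (B.10) collapses the expression algebraically to the reciprocal of Σ_i (1/λ_i)(∂λ_i/∂α)², and by the Cauchy–Schwarz inequality no other choice of weights gives a smaller value; the two printed numbers illustrate that gap for uniform weights.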

On the other hand, the expressions for β_WLS(n) and ϵ_WLS(n) can be determined from the evaluation of Eqs. (A.2) and (A.1), respectively. Looking at them, the problem reduces to determining the key term $\frac{\partial^2 \tau_\mathrm{WLS}}{\partial I_i \partial I_j}(\bar{I}^n + t(I^n - \bar{I}^n))$. For that, if we use Fessler (1996), Eq. (17), we can obtain the following identity¹³ (with k denoting the summation index), $\begin{aligned} \frac{\partial^2 \tau_\mathrm{WLS}}{\partial I_i \partial I_j}(\bar{I}^n + t(I^n - \bar{I}^n)) &= \frac{-1}{\left[ \sum_{k=1}^n \left( \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \cdot (\lambda_k(\alpha) - (\bar{I}_k + t(I_k - \bar{I}_k)))\, 2w_k + 2w_k \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^2 \right) \right]^2} \cdot \\ &\quad \left[ \left[ \left[ \sum_{k=1}^n \left( \frac{\partial^3 \lambda_k(\alpha)}{\partial \alpha^3} \cdot (\lambda_k(\alpha) - (\bar{I}_k + t(I_k - \bar{I}_k)))\, 2w_k + 6w_k \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right) \right] \cdot \right. \right. \\ &\quad \left. \frac{\left( 2w_j \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right)}{\left[ \sum_{k=1}^n \left( \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \cdot (\lambda_k(\alpha) - (\bar{I}_k + t(I_k - \bar{I}_k)))\, 2w_k + 2w_k \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^2 \right) \right]} - \left( 2w_j \frac{\partial^2 \lambda_j(\alpha)}{\partial \alpha^2} \right) \right] \cdot \\ &\quad \left. \left. \left( 2w_i \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right) - \left( 2w_i \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \right) \cdot \left( 2w_j \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right) \right] \right|_{\alpha = \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}, \end{aligned}$ (B.11)

which concludes the result.

B.1. Proof of Proposition 2

Proof: using the function $h(\alpha) = \sum_{i=1}^n w_i (\lambda_i(x_\mathrm{c}) - \lambda_i(\alpha))^2$, we need to show that the minimum is reached only at α = x_c. Since h(α) ≥ 0 for every α and h(x_c) = 0, the function achieves its minimum at x_c. To prove uniqueness, let us assume that there is another position $x_\mathrm{c}^* \ne x_\mathrm{c}$ at which h is zero. Then $\begin{aligned} h(x_\mathrm{c}^*) &= \sum_{i=1}^n w_i (\lambda_i(x_\mathrm{c}) - \lambda_i(x_\mathrm{c}^*))^2 = 0 \\ &\Leftrightarrow \lambda_i(x_\mathrm{c}) = \lambda_i(x_\mathrm{c}^*), \quad \forall i \in \{1, \ldots, n\}. \end{aligned}$ (B.12)

The last identity is not possible, because if we use a Gaussian PSF there is at least one i ∈ {1, . . . , n} such that $\lambda_i(x_\mathrm{c}) \ne \lambda_i(x_\mathrm{c}^*)$.
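The uniqueness argument above can be illustrated on a grid. This is only a sketch under assumptions that are not taken from the text: a Gaussian PSF integrated over unit-width pixels with a flat background, uniform weights, and a hypothetical grid of trial positions around x_c.

```python
import math

def lam(alpha, edges, flux=1000.0, sigma=1.0, background=5.0):
    """Expected counts lambda_i(alpha) for an assumed pixel-integrated Gaussian PSF."""
    s = sigma * math.sqrt(2.0)
    return [0.5 * flux * (math.erf((hi - alpha) / s) - math.erf((lo - alpha) / s)) + background
            for lo, hi in zip(edges[:-1], edges[1:])]

def h(alpha, x_c, weights, edges):
    """h(alpha) = sum_i w_i (lambda_i(x_c) - lambda_i(alpha))^2, as in the proof."""
    target = lam(x_c, edges)
    model = lam(alpha, edges)
    return sum(w * (t - m) ** 2 for w, t, m in zip(weights, target, model))

edges = [k - 6.0 for k in range(13)]   # 12 unit-width pixels (assumed layout)
x_c = 0.25                             # true position (assumed)
weights = [1.0] * 12                   # uniform weights (one possible choice)

# Trial positions x_c + 0.05 * k; k = 0 reproduces x_c exactly.
grid = [x_c + 0.05 * k for k in range(-40, 41)]
values = [h(alpha, x_c, weights, edges) for alpha in grid]
best = grid[values.index(min(values))]
print(best, min(values))
```

On this grid h vanishes only at the trial position equal to x_c and is strictly positive elsewhere, in line with Proposition 2 for this particular PSF model.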

Proof of Eq. (B.11)

Proof: if we recall Fessler (1996), Eq. (17) and consider J_WLS(α, Iⁿ) as the cost function, we have that $\begin{aligned} \frac{\partial^2 \tau_\mathrm{WLS}}{\partial I_i \partial I_j}(\bar{I}^n + t(I^n - \bar{I}^n)) &= \left[ -\frac{\partial^2 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^2} \right]^{-1} \left( \left[ \frac{\partial^3 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^3} \cdot \frac{\partial \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}{\partial I_j} \right. \right. \\ &\quad \left. \left. + \frac{\partial^3 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^2 \partial I_j} \right] \cdot \frac{\partial \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}{\partial I_i} \right. \\ &\quad \left. \left. + \frac{\partial^3 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^2 \partial I_i} \cdot \frac{\partial \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}{\partial I_j} + \frac{\partial^3 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha \partial I_i \partial I_j} \right) \right|_{\alpha = \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}, \end{aligned}$ (B.13)

where from the definition of J_WLS(α, Iⁿ) we have that $\begin{aligned} \frac{\partial^2 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^2} = \sum_{i=1}^n \left( \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \cdot (\lambda_i(\alpha) - (\bar{I}_i + t(I_i - \bar{I}_i)))\, 2w_i + 2w_i \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right), \end{aligned}$ (B.14) $\begin{aligned} \frac{\partial^3 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^3} = \sum_{i=1}^n \left( \frac{\partial^3 \lambda_i(\alpha)}{\partial \alpha^3} \cdot (\lambda_i(\alpha) - (\bar{I}_i + t(I_i - \bar{I}_i)))\, 2w_i + 6w_i \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right), \end{aligned}$ (B.15) $\begin{aligned} \frac{\partial^3 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^2 \partial I_i} = -\left( 2w_i \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \right), \end{aligned}$ (B.16) $\begin{aligned} \frac{\partial^3 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha \partial I_i \partial I_j} = 0. \end{aligned}$ (B.17)

The term $\frac{\partial \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}{\partial I_i}$ is just the i-th component of the gradient in Eq. (B.1), so using Eqs. (B.3) and (B.4) we obtain (with k denoting the summation index) $\begin{aligned} \frac{\partial \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}{\partial I_i} &= \frac{-\frac{\partial^2 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha \partial I_i}}{\frac{\partial^2 J_\mathrm{WLS}(\alpha, \bar{I}^n + t(I^n - \bar{I}^n))}{\partial \alpha^2}} \\ &= \frac{\left( 2w_i \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)}{\sum_{k=1}^n \left( \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \cdot (\lambda_k(\alpha) - (\bar{I}_k + t(I_k - \bar{I}_k)))\, 2w_k + 2w_k \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^2 \right)}. \end{aligned}$ (B.18)

Finally, replacing Eqs. (B.14)–(B.18) in Eq. (B.13), and evaluating at α = τ_WLS(I̅ⁿ + t(Iⁿ ‒ I̅ⁿ)), we obtain the desired result (with k denoting the summation index) $\begin{aligned} \frac{\partial^2 \tau_\mathrm{WLS}}{\partial I_i \partial I_j}(\bar{I}^n + t(I^n - \bar{I}^n)) &= \frac{-1}{\left[ \sum_{k=1}^n \left( \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \cdot (\lambda_k(\alpha) - (\bar{I}_k + t(I_k - \bar{I}_k)))\, 2w_k + 2w_k \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^2 \right) \right]^2} \cdot \\ &\quad \left[ \left[ \left[ \sum_{k=1}^n \left( \frac{\partial^3 \lambda_k(\alpha)}{\partial \alpha^3} \cdot (\lambda_k(\alpha) - (\bar{I}_k + t(I_k - \bar{I}_k)))\, 2w_k + 6w_k \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right) \right] \cdot \right. \right. \\ &\quad \left. \frac{\left( 2w_j \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right)}{\left[ \sum_{k=1}^n \left( \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \cdot (\lambda_k(\alpha) - (\bar{I}_k + t(I_k - \bar{I}_k)))\, 2w_k + 2w_k \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^2 \right) \right]} - \left( 2w_j \frac{\partial^2 \lambda_j(\alpha)}{\partial \alpha^2} \right) \right] \cdot \\ &\quad \left. \left. \left( 2w_i \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right) - \left( 2w_i \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \right) \cdot \left( 2w_j \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right) \right] \right|_{\alpha = \tau_\mathrm{WLS}(\bar{I}^n + t(I^n - \bar{I}^n))}. \end{aligned}$ (B.19)

Appendix C

Proof of Theorem 3

Proof: again the proof and the derivation of $\sigma^2_\mathrm{ML}(n)$, β_ML(n) and ϵ_ML(n) reduce to applying Theorem 1. First, we need to validate the assumptions of Theorem 1. Beginning with the equality in Eq. (B.1), it follows that $\begin{aligned} \nabla^{20} J_\mathrm{ML}(\alpha, I^n) &= \frac{\partial^2}{\partial \alpha^2} J_\mathrm{ML}(\alpha, I^n) \\ &= -\sum_{i=1}^n I_i \frac{1}{\lambda_i^2(\alpha)} \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 + \sum_{i=1}^n \left( I_i \frac{1}{\lambda_i(\alpha)} - 1 \right) \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2}, \end{aligned}$ (C.1) $\begin{aligned} \nabla^{11} J_\mathrm{ML}(\alpha, I^n) &= \left( \frac{\partial^2}{\partial \alpha \partial I_1} J_\mathrm{ML}(\alpha, I^n), \ldots, \frac{\partial^2}{\partial \alpha \partial I_n} J_\mathrm{ML}(\alpha, I^n) \right)^T \\ &= \left( \frac{1}{\lambda_1(\alpha)} \frac{\partial \lambda_1(\alpha)}{\partial \alpha}, \ldots, \frac{1}{\lambda_n(\alpha)} \frac{\partial \lambda_n(\alpha)}{\partial \alpha} \right)^T. \end{aligned}$ (C.2)

To evaluate these two expressions at α = τ_ML(I̅ⁿ) as required in Eq. (B.1), we use that¹⁴ $\begin{aligned} \tau_\mathrm{ML}(\bar{I}^n) = \arg\min_{\alpha \in \mathbb{R}} \sum_{i=1}^n \left( -\lambda_i(x_\mathrm{c}) \ln(\lambda_i(\alpha)) + \lambda_i(\alpha) \right). \end{aligned}$ (C.3)

Then we will use the following result.

Proposition 3

Under the assumption of a Gaussian PSF, τ_ML(I̅ⁿ) = x_c.

We note again that this proposition is the second assumption used in Theorem 1. From this proposition, it follows that $\begin{aligned} \nabla^{20} J_\mathrm{ML}(\tau_\mathrm{ML}(\bar{I}^n), \bar{I}^n) &= -\sum_{i=1}^n \lambda_i(x_\mathrm{c}) \frac{1}{\lambda_i^2(x_\mathrm{c})} \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}} + \sum_{i=1}^n \left( \frac{\lambda_i(x_\mathrm{c})}{\lambda_i(x_\mathrm{c})} - 1 \right) \left. \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \right|_{\alpha = x_\mathrm{c}} \\ &= -\sum_{i=1}^n \frac{1}{\lambda_i(x_\mathrm{c})} \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}}, \end{aligned}$ (C.4) $\begin{aligned} \nabla^{11} J_\mathrm{ML}(\tau_\mathrm{ML}(\bar{I}^n), \bar{I}^n) &= \left. \left( \frac{\partial^2}{\partial \alpha \partial I_1} J_\mathrm{ML}(\alpha, I^n), \ldots, \frac{\partial^2}{\partial \alpha \partial I_n} J_\mathrm{ML}(\alpha, I^n) \right)^T \right|_{\alpha = \tau_\mathrm{ML}(\bar{I}^n)} \\ &= \left. \left( \frac{1}{\lambda_1(\alpha)} \frac{\partial \lambda_1(\alpha)}{\partial \alpha}, \ldots, \frac{1}{\lambda_n(\alpha)} \frac{\partial \lambda_n(\alpha)}{\partial \alpha} \right)^T \right|_{\alpha = x_\mathrm{c}}. \end{aligned}$ (C.5)
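The curvature in Eq. (C.4) can be compared with a direct numerical second derivative. The sketch below is illustrative only and rests on several assumptions: J_ML is taken as the Poisson log-likelihood Σ_i (I_i ln λ_i(α) − λ_i(α)) up to a term independent of α (the sign convention implied by Eqs. (C.1) and (C.2)), the PSF is a Gaussian integrated over unit-width pixels with a flat background, and the names `lam`, `dlam` and `log_likelihood` are hypothetical.

```python
import math

def lam(alpha, edges, flux=1000.0, sigma=1.0, background=5.0):
    """Expected counts lambda_i(alpha) for an assumed pixel-integrated Gaussian PSF."""
    s = sigma * math.sqrt(2.0)
    return [0.5 * flux * (math.erf((hi - alpha) / s) - math.erf((lo - alpha) / s)) + background
            for lo, hi in zip(edges[:-1], edges[1:])]

def dlam(alpha, edges, flux=1000.0, sigma=1.0):
    """Analytic derivative d lambda_i / d alpha of the model above."""
    norm = flux / (sigma * math.sqrt(2.0 * math.pi))
    def g(x):
        return norm * math.exp(-0.5 * (x / sigma) ** 2)
    return [g(lo - alpha) - g(hi - alpha) for lo, hi in zip(edges[:-1], edges[1:])]

def log_likelihood(alpha, image, edges):
    """Poisson log-likelihood, dropping the term that does not depend on alpha."""
    return sum(i * math.log(l) - l for i, l in zip(image, lam(alpha, edges)))

edges = [k - 5.0 for k in range(11)]   # 10 unit-width pixels (assumed layout)
x_c = 0.3                              # true position (assumed)
mean_image = lam(x_c, edges)           # noiseless image, I-bar

# Second central difference of J_ML(alpha, I-bar) at alpha = x_c.
step = 1e-3
numeric = (log_likelihood(x_c + step, mean_image, edges)
           - 2.0 * log_likelihood(x_c, mean_image, edges)
           + log_likelihood(x_c - step, mean_image, edges)) / step ** 2

# Closed form of Eq. (C.4): minus the sum of (1 / lambda_i) (d lambda_i / d alpha)^2.
closed_form = -sum(d * d / l for l, d in zip(mean_image, dlam(x_c, edges)))
print(numeric, closed_form)
```

The two printed values should agree to within the finite-difference error; the negative of this curvature is the Fisher information of the Poisson model about x_c.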

Finally, we apply Eqs. (C.4) and (C.5) in Eq. (A.9) to obtain that $\begin{aligned} \nabla \tau_\mathrm{ML}(\bar{I}^n) \cdot (I^n - \bar{I}^n) &= -[\nabla^{20} J(\tau_\mathrm{ML}(\bar{I}^n), \bar{I}^n)]^{-1} [\nabla^{11} J(\tau_\mathrm{ML}(\bar{I}^n), \bar{I}^n)] (I^n - \bar{I}^n) \\ &= \underbrace{\left[ \sum_{i=1}^n \frac{1}{\lambda_i(x_\mathrm{c})} \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}} \right]^{-1}}_{a} \cdot \sum_{j=1}^n \underbrace{\frac{1}{\lambda_j(x_\mathrm{c})} \left. \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right|_{\alpha = x_\mathrm{c}}}_{b_j} \left( I_j - \mathbb{E}(I_j) \right), \end{aligned}$ (C.6)

which shows that the sufficient condition in Eq. (22) of Theorem 1 is satisfied. To compute the value $\sigma^2_\mathrm{ML}(n)$ in Eq. (A.5), we have that $\begin{aligned} \mathrm{Cov}\{I_i, I_j\} = \begin{cases} \mathrm{Var}\{I_i\} = \lambda_i(x_\mathrm{c}), & \text{if } i = j, \\ 0, & \text{otherwise}, \end{cases} \end{aligned}$ (C.7)

since the observations are independent and follow a Poisson distribution. Then, replacing Eqs. (C.4), (C.5), and (C.7) in Eq. (A.5), we have that $\begin{aligned} \sigma^2_\mathrm{ML}(n) &= \frac{1}{\sum_{i=1}^n \frac{1}{\lambda_i(x_\mathrm{c})} \left. \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right|_{\alpha = x_\mathrm{c}}}, \end{aligned}$ (C.8)

which resolves the identity in Eq. (37). Finally, β_ML(n) and ϵ_ML(n) come from evaluating Eqs. (A.2) and (A.1) in this ML context. For that we only need to determine $\frac{\partial^2 \tau_\mathrm{ML}}{\partial I_i \partial I_j}(\bar{I}^n + t(I^n - \bar{I}^n))$. Using Fessler (1996), Eq. (17), we can obtain the identity¹⁵ (with k denoting the summation index) $\begin{aligned} \frac{\partial^2 \tau_\mathrm{ML}}{\partial I_i \partial I_j}(\bar{I}^n + t(I^n - \bar{I}^n)) &= \frac{-1}{\left[ \sum_{k=1}^n \left( \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \cdot \frac{\bar{I}_k + t(I_k - \bar{I}_k)}{\lambda_k(\alpha)} - \frac{\bar{I}_k + t(I_k - \bar{I}_k)}{\lambda_k^2(\alpha)} \cdot \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^2 \right) \right]^2} \cdot \\ &\quad \left[ \left[ \left[ \sum_{k=1}^n \left( \frac{\partial^3 \lambda_k(\alpha)}{\partial \alpha^3} \cdot \frac{\bar{I}_k + t(I_k - \bar{I}_k)}{\lambda_k(\alpha)} - 3 \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \frac{\bar{I}_k + t(I_k - \bar{I}_k)}{\lambda_k^2(\alpha)} + 2 \frac{\bar{I}_k + t(I_k - \bar{I}_k)}{\lambda_k^3(\alpha)} \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^3 \right) \right] \cdot \right. \right. \\ &\quad \left. \frac{\left( -\frac{1}{\lambda_j(\alpha)} \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right)}{\left[ \sum_{k=1}^n \left( \frac{\partial^2 \lambda_k(\alpha)}{\partial \alpha^2} \cdot \frac{\bar{I}_k + t(I_k - \bar{I}_k)}{\lambda_k(\alpha)} - \frac{\bar{I}_k + t(I_k - \bar{I}_k)}{\lambda_k^2(\alpha)} \cdot \left( \frac{\partial \lambda_k(\alpha)}{\partial \alpha} \right)^2 \right) \right]} + \frac{\partial^2 \lambda_j(\alpha)}{\partial \alpha^2} \frac{1}{\lambda_j(\alpha)} - \frac{1}{\lambda_j^2(\alpha)} \left( \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right)^2 \right] \cdot \\ &\quad \left. \left. \left( -\frac{1}{\lambda_i(\alpha)} \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right) + \left( \frac{\partial^2 \lambda_i(\alpha)}{\partial \alpha^2} \frac{1}{\lambda_i(\alpha)} - \frac{1}{\lambda_i^2(\alpha)} \left( \frac{\partial \lambda_i(\alpha)}{\partial \alpha} \right)^2 \right) \cdot \left( -\frac{1}{\lambda_j(\alpha)} \frac{\partial \lambda_j(\alpha)}{\partial \alpha} \right) \right] \right|_{\alpha = \tau_\mathrm{ML}(\bar{I}^n + t(I^n - \bar{I}^n))}, \end{aligned}$ (C.9)

which concludes the result.

Proof of Proposition 3

Proof: let us consider the function $g_n:\mathbb{R}_+^n \rightarrow \mathbb{R}$ given by $\begin{aligned} g_n(y_1,\ldots,y_n)_{\lambda_i^n} = \sum_{i=1}^{n} \left(-\lambda_i \ln(y_i) + y_i\right). \end{aligned}$ (C.10)

We note that $\begin{aligned} \min_{y_1^n \in \mathbb{R}_+^n} g_n(y_1,\ldots,y_n)_{\lambda_i^n} = \sum_{i=1}^{n} \min_{y_i \in \mathbb{R}_+} g_1(y_i)_{\lambda_i}, \end{aligned}$ (C.11)

where, applying the first-order condition $\frac{\partial g_1(y_i)_{\lambda_i}}{\partial y_i} = -\frac{\lambda_i}{y_i} + 1 = 0$, each minimum is attained at y_i = λ_i, ∀i ∈ {1, . . . , n}. Returning to our problem in Eq. (C.3), where λ_i = I̅_i(x_c) and y_i = λ_i(α), it is clear, considering the Gaussian profile of the PSF, that $\begin{aligned} \lambda_i(\alpha) = \lambda_i(x_\mathrm{c}) \quad \forall i \in \{1,\ldots,n\} \ \text{ if } \ \alpha = x_\mathrm{c}, \end{aligned}$ (C.12)

which concludes the result.
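The separability argument in Eqs. (C.10) and (C.11) can be checked numerically. The following is a minimal sketch, not part of the original proof: the values in `lams` are hypothetical expected counts, and the golden-section search is only one convenient way to minimize the convex one-dimensional function g_1.

```python
import math

def g1(y, lam):
    """Single-term objective g_1(y)_lambda = -lambda*ln(y) + y, defined for y > 0."""
    return -lam * math.log(y) + y

def g_n(ys, lams):
    """Separable objective of Eq. (C.10): the sum of the single-term objectives."""
    return sum(g1(y, lam) for y, lam in zip(ys, lams))

def argmin_g1(lam, lo=1e-6, hi=1e4, iters=200):
    """Golden-section search for the minimizer of the convex g_1 on [lo, hi]."""
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if g1(c, lam) < g1(d, lam):
            b = d   # the minimum lies in [a, d]
        else:
            a = c   # the minimum lies in [c, b]
    return 0.5 * (a + b)

# Hypothetical expected counts lambda_i (illustrative values only).
lams = [3.0, 12.5, 140.0]

# Minimizing each term separately, as in Eq. (C.11).
minimizers = [argmin_g1(lam) for lam in lams]

for lam, m in zip(lams, minimizers):
    print(f"lambda = {lam:7.2f}   numerical minimizer = {m:.6f}")
```

Under these assumptions each numerical minimizer should agree with the corresponding λ_i, as the first-order condition predicts.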

Proof of Eq. (C.9)

Proof: if we recall Fessler (1996, Eq. (17)) and consider J_ML(α, Iⁿ) as the cost function, we have that $\begin{aligned} \frac{\partial^2\tau_\mathrm{ML}}{\partial I_i\,\partial I_j}(\bar{I}^n+t(I^n-\bar{I}^n)) = {} & \left[-\frac{\partial^2 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^2}\right]^{-1}\Biggl(\biggl[\frac{\partial^3 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^3}\cdot\frac{\partial\tau_\mathrm{ML}(\bar{I}^n+t(I^n-\bar{I}^n))}{\partial I_j} \\ & +\frac{\partial^3 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^2\,\partial I_j}\biggr]\cdot\frac{\partial\tau_\mathrm{ML}(\bar{I}^n+t(I^n-\bar{I}^n))}{\partial I_i} \\ & +\frac{\partial^3 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^2\,\partial I_i}\cdot\frac{\partial\tau_\mathrm{ML}(\bar{I}^n+t(I^n-\bar{I}^n))}{\partial I_j}+\frac{\partial^3 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha\,\partial I_i\,\partial I_j}\Biggr)\Biggr|_{\alpha=\tau_\mathrm{ML}(\bar{I}^n+t(I^n-\bar{I}^n))}, \end{aligned}$ (C.13)

where from the definition of J_ML(α, Iⁿ) we have that

$\begin{aligned} \frac{\partial^2 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^2} = \sum_{k=1}^{n}\left(\frac{\partial^2\lambda_k(\alpha)}{\partial\alpha^2}\cdot\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k(\alpha)}-\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^2(\alpha)}\cdot\left(\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\right)^2\right), \end{aligned}$ (C.14)

$\begin{aligned} \frac{\partial^3 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^3} = \sum_{k=1}^{n}\left(\frac{\partial^3\lambda_k(\alpha)}{\partial\alpha^3}\cdot\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k(\alpha)}-3\frac{\partial^2\lambda_k(\alpha)}{\partial\alpha^2}\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^2(\alpha)}+2\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^3(\alpha)}\left(\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\right)^3\right), \end{aligned}$ (C.15)

$\begin{aligned} \frac{\partial^3 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^2\,\partial I_i} = \frac{\partial^2\lambda_i(\alpha)}{\partial\alpha^2}\cdot\frac{1}{\lambda_i(\alpha)}-\frac{1}{\lambda_i^2(\alpha)}\cdot\left(\frac{\partial\lambda_i(\alpha)}{\partial\alpha}\right)^2, \end{aligned}$ (C.16)

$\begin{aligned} \frac{\partial^3 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha\,\partial I_i\,\partial I_j} = 0. \end{aligned}$ (C.17)

Concerning $\frac{\partial\tau_\mathrm{ML}(\bar{I}^n+t(I^n-\bar{I}^n))}{\partial I_i}$, this expression is the i-th component of the gradient in Eq. (B.1); using Eqs. (C.1) and (C.2), we obtain $\begin{aligned} \frac{\partial\tau_\mathrm{ML}(\bar{I}^n+t(I^n-\bar{I}^n))}{\partial I_i} &= \frac{-\frac{\partial^2 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha\,\partial I_i}}{\frac{\partial^2 J_\mathrm{ML}(\alpha,\bar{I}^n+t(I^n-\bar{I}^n))}{\partial\alpha^2}} \\ &= \frac{\left(-\frac{1}{\lambda_i(\alpha)}\frac{\partial\lambda_i(\alpha)}{\partial\alpha}\right)}{\sum_{k=1}^{n}\left(\frac{\partial^2\lambda_k(\alpha)}{\partial\alpha^2}\cdot\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k(\alpha)}-\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^2(\alpha)}\cdot\left(\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\right)^2\right)}. \end{aligned}$ (C.18)

Finally, substituting Eqs. (C.14)–(C.18) into Eq. (C.13) and evaluating at α = τ_ML(I̅ⁿ + t(Iⁿ ‒ I̅ⁿ)), we obtain the desired result $\begin{aligned} \frac{\partial^2\tau_\mathrm{ML}}{\partial I_i\,\partial I_j}(\bar{I}^n+t(I^n-\bar{I}^n)) = {} & \frac{-1}{\left[\sum_{k=1}^{n}\left(\frac{\partial^2\lambda_k(\alpha)}{\partial\alpha^2}\cdot\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k(\alpha)}-\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^2(\alpha)}\cdot\left(\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\right)^2\right)\right]^2}\cdot \\ & \Biggl[\biggl[\left[\sum_{k=1}^{n}\left(\frac{\partial^3\lambda_k(\alpha)}{\partial\alpha^3}\cdot\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k(\alpha)}-3\frac{\partial^2\lambda_k(\alpha)}{\partial\alpha^2}\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^2(\alpha)}+2\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^3(\alpha)}\left(\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\right)^3\right)\right]\cdot \\ & \frac{\left(-\frac{1}{\lambda_j(\alpha)}\frac{\partial\lambda_j(\alpha)}{\partial\alpha}\right)}{\left[\sum_{k=1}^{n}\left(\frac{\partial^2\lambda_k(\alpha)}{\partial\alpha^2}\cdot\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k(\alpha)}-\frac{\bar{I}_k+t(I_k-\bar{I}_k)}{\lambda_k^2(\alpha)}\cdot\left(\frac{\partial\lambda_k(\alpha)}{\partial\alpha}\right)^2\right)\right]}+\frac{\partial^2\lambda_j(\alpha)}{\partial\alpha^2}\frac{1}{\lambda_j(\alpha)}-\frac{1}{\lambda_j^2(\alpha)}\left(\frac{\partial\lambda_j(\alpha)}{\partial\alpha}\right)^2\biggr]\cdot \\ & \left(-\frac{1}{\lambda_i(\alpha)}\frac{\partial\lambda_i(\alpha)}{\partial\alpha}\right)+\left(\frac{\partial^2\lambda_i(\alpha)}{\partial\alpha^2}\frac{1}{\lambda_i(\alpha)}-\frac{1}{\lambda_i^2(\alpha)}\left(\frac{\partial\lambda_i(\alpha)}{\partial\alpha}\right)^2\right)\cdot\left(-\frac{1}{\lambda_j(\alpha)}\frac{\partial\lambda_j(\alpha)}{\partial\alpha}\right)\Biggr]\Biggr|_{\alpha=\tau_\mathrm{ML}(\bar{I}^n+t(I^n-\bar{I}^n))}. \end{aligned}$ (C.19)
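The first and second derivatives of τ_ML with respect to the data, Eqs. (C.18) and (C.9), can be compared against finite differences. The sketch below is illustrative only and is not the authors' code. It assumes the cost J_ML(α, Iⁿ) = Σ_k I_k ln λ_k(α), which is consistent with the derivatives in Eqs. (C.14)–(C.17), and a hypothetical sampled Gaussian profile with a flat background for λ_k(α). All constants and function names are made up for the example.

```python
import math

# Hypothetical setup (not from the paper): sampled Gaussian plus flat background.
F, B, SIGMA = 2000.0, 50.0, 1.5            # flux scale, background, PSF width
XS = [float(k) for k in range(-6, 7)]      # pixel centres
X_C = 0.3                                  # true source position

def lam_derivs(alpha, x):
    """Return lambda(alpha) and its first three alpha-derivatives for one pixel."""
    u, s2 = x - alpha, SIGMA ** 2
    g = F * math.exp(-u * u / (2.0 * s2))
    return (g + B, g * u / s2,
            g * (u * u / s2 ** 2 - 1.0 / s2),
            g * (u ** 3 / s2 ** 3 - 3.0 * u / s2 ** 2))

def tau_ml(I, iters=50):
    """Newton iteration on dJ/dalpha = sum_k I_k * lambda_k'/lambda_k = 0."""
    a = X_C
    for _ in range(iters):
        score = hess = 0.0
        for Ik, x in zip(I, XS):
            l0, l1, l2, _ = lam_derivs(a, x)
            score += Ik * l1 / l0
            hess += Ik * (l2 / l0 - (l1 / l0) ** 2)
        a -= score / hess
    return a

def terms(I):
    """D (Eq. C.14), T (Eq. C.15), a_m = -lambda_m'/lambda_m, b_m (Eq. C.16)."""
    alpha = tau_ml(I)
    D = T = 0.0
    a, b = [], []
    for Ik, x in zip(I, XS):
        l0, l1, l2, l3 = lam_derivs(alpha, x)
        D += Ik * (l2 / l0 - l1 ** 2 / l0 ** 2)
        T += Ik * (l3 / l0 - 3.0 * l2 * l1 / l0 ** 2 + 2.0 * l1 ** 3 / l0 ** 3)
        a.append(-l1 / l0)
        b.append(l2 / l0 - l1 ** 2 / l0 ** 2)
    return D, T, a, b

def first_derivative(I, i):
    """Closed form of Eq. (C.18)."""
    D, _, a, _ = terms(I)
    return a[i] / D

def second_derivative(I, i, j):
    """Closed form of Eq. (C.9), with the mixed terms taken from Eq. (C.16)."""
    D, T, a, b = terms(I)
    return -((T * a[j] / D + b[j]) * a[i] + b[i] * a[j]) / D ** 2

def shifted(I, steps):
    """Copy of I with I[m] += delta for each (m, delta) in steps."""
    out = list(I)
    for m, delta in steps:
        out[m] += delta
    return out

def fd_first(I, i, h=1.0):
    """Central finite difference of tau_ML with respect to I_i."""
    return (tau_ml(shifted(I, [(i, h)])) - tau_ml(shifted(I, [(i, -h)]))) / (2.0 * h)

def fd_second(I, i, j, h=1.0):
    """Central finite difference of the mixed second derivative."""
    tpp = tau_ml(shifted(I, [(i, h), (j, h)]))
    tpm = tau_ml(shifted(I, [(i, h), (j, -h)]))
    tmp = tau_ml(shifted(I, [(i, -h), (j, h)]))
    tmm = tau_ml(shifted(I, [(i, -h), (j, -h)]))
    return (tpp - tpm - tmp + tmm) / (4.0 * h * h)

# Deterministic stand-in for noisy counts around the expected values.
LAM_TRUE = [lam_derivs(X_C, x)[0] for x in XS]
I_DATA = [l + 0.5 * math.sqrt(l) * math.sin(k + 1.0) for k, l in enumerate(LAM_TRUE)]

i, j = 4, 8
print("first :", first_derivative(I_DATA, i), fd_first(I_DATA, i))
print("second:", second_derivative(I_DATA, i, j), fd_second(I_DATA, i, j))
```

Under these assumptions the closed-form and finite-difference values should agree closely. The agreement depends on the mixed derivative in Eq. (C.16) carrying the factor 1/λ_i²(α).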

All Tables

Table 1.

Performance quality of the ML estimator relative to the Cramér–Rao bound expressed in terms of the indicator $100\times \frac{\sqrt{\sigma^2_\mathrm{ML}(n)+\beta_\mathrm{ML}(n)}-\sigma_\mathrm{ML}(n)}{\sigma_\mathrm{ML}(n)}$ from the result in Theorem 3.

Table 2.

Performance quality of the ML and WLS estimators relative to the nominal bound expressed in terms of the indicators $100\times \frac{\sqrt{\sigma^2_\mathrm{ML}(n)+\beta_\mathrm{ML}(n)}-\sigma_\mathrm{ML}(n)}{\sigma_\mathrm{ML}(n)}$ and $100\times \frac{\sqrt{\sigma^2_\mathrm{WLS}(n)+\beta_\mathrm{WLS}(n)}-\sigma_\mathrm{WLS}(n)}{\sigma_\mathrm{WLS}(n)}$, respectively, in the undersampled regime (space-based and ground-based are on the left and right, respectively).

All Figures

Fig. 1.

Relative performance of the bias (as measured by $\log \left( 100 \times \frac{\epsilon_{J}(n)}{x_\mathrm{c}} \right)$) stipulated by Theorem 1 for the WLS estimator (left side, Eq. (29)) and the ML estimator (right side, Eq. (35)). Results are reported for different values of the source flux F̃ ∈ {1080, 3224, 20 004, 60 160}, all in e⁻ (top to bottom symbols respectively), as a function of the detector pixel size. The 0% level corresponds to having achieved no bias.

Fig. 2.

Range of the square root of the variance performance (in milliarcseconds, mas) for the WLS method in astrometry using uniform weights (equivalent to the LS method) predicted by Theorems 1 and 2, Eq. (30). Results are reported for different representative values of F̃ and across different pixel sizes (top-left to bottom-right): F̃ ∈ {1080, 3224, 20 004, 60 160} e⁻.

Fig. 3.

Worst-case discrepancies in Eq. (40) for the WLS estimator using the weights set indexed by the positions $\Theta = \{x_\mathrm{o}^{*}-\sigma, x_\mathrm{o}^{*}-0.8\sigma, x_\mathrm{o}^{*}-0.6\sigma, x_\mathrm{o}^{*}-0.4\sigma, x_\mathrm{o}^{*}-0.2\sigma, x_\mathrm{o}^{*}\}$. Results are reported for two S/N scenarios, namely F̃ = 20 004 e⁻ (left) and F̃ = 60 160 e⁻ (right), and across different pixel sizes.

Fig. 4.

Performance discrepancies (measuring the non-optimality) of the WLS estimator using the center position as a prior for the weight selection with respect to the CR bound obtained for the true object positions $\{x_\mathrm{o}^{*}-\sigma, x_\mathrm{o}^{*}-0.8\sigma, x_\mathrm{o}^{*}-0.6\sigma, x_\mathrm{o}^{*}-0.4\sigma, x_\mathrm{o}^{*}-0.2\sigma, x_\mathrm{o}^{*}\}$. Results are reported for two S/N scenarios, namely F̃ = 20 004 e⁻ (left) and F̃ = 60 160 e⁻ (right), and across different pixel sizes.

Fig. 5.

Range of the square root of the variance performance (in milliarcseconds, mas) for the ML method in astrometry as predicted by Theorems 1 and 3, Eq. (36). Results are reported for different representative values of F̃ and across different pixel sizes (top-left to bottom-right): F̃ ∈ {1080, 3224, 20 004, 60 160} e⁻.

Fig. 6.

Performance optimality of the ML estimator (computed as $100\times \frac{\sqrt{\sigma^2_\mathrm{ML}(n)+\beta_\mathrm{ML}(n)}-\sigma_\mathrm{ML}(n)}{\sigma_\mathrm{ML}(n)}$) for different positions of the target object $x_\mathrm{c}\in \Theta = \{x_\mathrm{o}^{*}-\sigma, x_\mathrm{o}^{*}-0.8\sigma, x_\mathrm{o}^{*}-0.6\sigma, x_\mathrm{o}^{*}-0.4\sigma, x_\mathrm{o}^{*}-0.2\sigma, x_\mathrm{o}^{*}\}$ in the array, as a function of pixel resolution. The left panel shows the case F̃ = 20 004 e⁻, the right panel the case F̃ = 60 160 e⁻.

Fig. 7.

Performance comparison between the $\sqrt{\mathrm{MSE}}$ of the adaptive WLS estimator and σ_CR(n), both in mas. Results are reported for different F̃ and across different pixel sizes (top-left to bottom-right): F̃ ∈ {1080, 3224, 60 160} e⁻.

[1] Adorf, H.-M. 1996, in Astronomical Data Analysis Software and Systems V, 101, 13

[2] Alard, C., & Lupton, R. H. 1998, ApJ, 503, 325

[3] Altmann, M., Bouquillon, S., & Taris, F. 2014, Proc. SPIE, 9149, 91490P

[4] Auer, L., & Van Altena, W. 1978, AJ, 83, 531

[5] Bastian, U. 2004, GAIA Technical Note, 2004 BASNOCODE

[6] Bendinelli, O., Parmeggiani, G., Piccioni, A., & Zavatti, F. 1987, AJ, 94, 1095

[7] Benedict, G. F., McArthur, B. E., Nelan, E. P., & Harrison, T. E. 2016, PASP, 129, 012001

[8] Bouquillon, S., Mendez, R., Altmann, M., et al. 2017, A&A, 606, A27

[9] Bradley, R. A., & Gart, J. J. 1962, Biometrika, 49, 205

[10] Bristow, P., Kerber, F., & Rosa, M. 2006, in The 2005 HST Calibration Workshop: Hubble After the Transition to Two-Gyro Mode, 299

[11] Cacciari, C., Pancino, E., & Bellazzini, M. 2016, Astron. Nachr., 337, 899

[12] Chromey, F. R. 2016, To Measure the Sky: An Introduction to Observational Astronomy (Cambridge: Cambridge University Press)

[13] Chun-Lin, L. 1993, IAU Symp., 156, 113

[14] Cover, T. M., & Thomas, J. A. 2012, Elements of Information Theory (New York: Wiley)

[15] Cramér, H. 1946, Scand. Actuar. J., 1946, 85

[16] Echeverria, A., Silva, J. F., Mendez, R. A., & Orchard, M. 2016, A&A, 594, A111

[17] Fessler, J. A. 1996, IEEE Trans. Image Process., 5, 493

[18] Gai, M., Busonero, D., & Cancelliere, R. 2017, PASP, 129, 054502

[19] Gray, R. M., & Davisson, L. D. 2004, An Introduction to Statistical Signal Processing (Cambridge: Cambridge University Press)

[20] Hoadley, B. 1971, Ann. Math. Stat., 1977

[21] Høg, E. 2017, ArXiv e-prints [arXiv:1707.01020]

[22] Howell, S. B. 2006, Handbook of CCD Astronomy (Cambridge: Cambridge University Press), 5

[23] Jakobsen, P., Greenfield, P., & Jedrzejewski, R. 1992, A&A, 253, 329

[24] Janesick, J. R. 2001, Scientific Charge-Coupled Devices (Bellingham: SPIE Press), 83

[25] Janesick, J. R. 2007, Photon Transfer (San Jose: SPIE Press)

[26] Kay, S. M. 1993, Fundamentals of Statistical Signal Processing. Vol 1, Estimation Theory (Englewood Cliffs: Prentice-Hall)

[27] Kendall, M., Stuart, A., Ord, J., & Arnold, S. 1999, Kendall’s Advanced Theory of Statistics. Vol. 2A (London: Hodder Arnold Publication)

[28] King, I. R. 1971, PASP, 83, 199

[29] King, I. R. 1983, PASP, 95, 163

[30] Lattanzi, M. 2012, Mem. Soc. Astron. It., 83, 1033

[31] Lee, J.-F., & Van Altena, W. 1983, AJ, 88, 1683

[32] Lemon, C. A., Auger, M. W., McMahon, R. G., & Koposov, S. E. 2017, MNRAS, 472, 5023

[33] Lindegren, L. 1978, IAU Colloq., 48, 197

[34] Lindegren, L. 2008, Gaia DPAC Public Document GAIA-C3-TN-LU-LL-078

[35] Lindegren, L. 2010, ISSI Scientific Reports Series, 9, 279

[36] Lobos, R. A., Silva, J. F., Mendez, R. A., & Orchard, M. 2015, PASP, 127, 1166

[37] McLean, I. S. 2008, Electronic Imaging in Astronomy: Detectors and Instrumentation (New York: Springer Science & Business Media)

[38] Méndez, R. A., Costa, E., Pedreros, M. H., et al. 2010, PASP, 122, 853

[39] Mendez, R. A., Silva, J. F., & Lobos, R. 2013, PASP, 125, 580

[40] Mendez, R. A., Silva, J. F., Orostica, R., & Lobos, R. 2014, PASP, 126, 798

[41] Michalik, D., & Lindegren, L. 2016, A&A, 586, A26

[42] Michalik, D., Lindegren, L., Hobbs, D., & Butkevich, A. G. 2015, A&A, 583, A68

[43] Prusti, T., De Bruijne, J., Brown, A. G., et al. 2016, A&A, 595, A1

[44] Rao, C. R. 1945, Bull. Calcutta Math. Soc., 37, 81

[45] Reffert, S. 2009, New Astron. Rev., 53, 329

[46] So, H. C., Chan, Y. T., Ho, K., & Chen, Y. 2013, IEEE Signal Process. Mag., 30, 162

[47] Stetson, P. B. 1987, PASP, 99, 191

[48] Stone, R. C. 1989, AJ, 97, 1227

[49] Vakili, M., & Hogg, D. W. 2016, ArXiv e-prints [arXiv:1610.05873]

[50] Van Altena, W. F. 2013, Astrometry for Astrophysics: Methods, Models, and Applications (Cambridge: Cambridge University Press)

[51] Van Altena, W. F., & Auer, L. 1975, in Image Processing Techniques in Astronomy (Berlin: Springer), 411

[52] Van Trees, H. L. 2004, Detection, Estimation, and Modulation Theory, Part I: Detection, Estimation, and Linear Modulation Theory (New York: Wiley)

[53] Zaccheo, T., Gonsalves, R., Ebstein, S., & Nisenson, P. 1995, ApJ, 439, L43

[54] Zhang, J., Hao, Y. C., Wang, L., & Long, Y. 2016, PASP, 128, 035003