Successive Approximation of Nonlinear Confidence Regions (SANCR)

. In parameter estimation problems an important issue is the approximation of the conﬁdence region of the estimated parameters. Especially for models based on diﬀerential equations, the needed computational costs require particular attention. For this reason, in many cases only linearized conﬁdence regions are used. However, despite the low computational cost of the linearized conﬁdence regions, their accuracy is often limited. To combine high accuracy and low computational costs, we have developed a method that uses only successive linearizations in the vicinity of an estimator. To accelerate the process, a principal axis decomposition of the covariance matrix of the parameters is employed. A numerical example illustrates the method.


Introduction
To simplify the notation, we consider a nonlinear model f (t, θ), with θ ∈ R n and t ∈ R, which does not depend on an additional (dynamical) system.We assume that f is differentiable with respect to θ and continuous with respect to t.
We consider the approximation of a confidence region about parameter values estimated by nonlinear least squares.The parameters are estimated by using experimental data y i in some given points t i with i = 1, . . ., m.The observed values contain unknown errors e i that we assume additive, so the response variable can be modeled by where θ true is the unknown true value of the parameters.Therefore, the least squares estimator θ is the value that solves the following problem thomas.carraro@iwr.uni-heidelberg.de where S(θ) is the residual sum of squares We assume that the model is correct and that the errors are normal, independent and identically distributed (iid) random variables with zero mean and variance σ 2 , i.e. e i ∼ N (0, σ 2 ).The confidence regions are here interpreted (from the frequentistic perspective [15]) as the regions in the parameter space covering the true value of the parameters θ true , in large samples, with probability approximately 1 − α.
The use of linearized confidence regions with nonlinear algebraic models has been extensively treated in literature, see for example [1,2,7,12,9,17].In particular, it has been shown that confidence regions derived for the linear case can be used in linearized form also for nonlinear models, but in many cases with limited accuracy [19].Furthermore, there are approximation techniques for nonlinear models that are not based on linearizations [11,3,20,18].
To simplify the exposition, in this work we consider an algebraic model, but the method can be used for more complex models.In fact, the problem to approximate nonlinear confidence regions for implicit models, i.e. models based on a system of (differential) equations has been considered from different points of view and for different kind of applications by several authors.To cite only few of them, see the work [19] and the references therein for the design under uncertainty, [21] for an application to ground water flow, [14] for ecological systems, and [16] for additional examples.Newly, it has been presented a method based on second-order sensitivity for the approximation of nonlinear confidence regions applied to ODE based models [13].It has been shown that higher order sensitivities give a higher accurate approximation of the confidence regions than methods using only the first order sensitivities.
With this work we show that the approximation using only linearized confidence regions can be substantially improved by a systematic successive application of linearizations, in the following called Successive Approximation of Nonlinear Confidence Regions (SANCR) method.We show results for the case with only two model parameters.An extension to more than two parameters is technically straightforward and could be partially parallelized, but the effect of successive linearizations in more than two (parameter space) dimensions has yet to be studied in this framework.
This paper is organized as follows i) In Section 2 we report the two methods on which our approach is based; ii) In Section 3 we describe the new method; iii) In Section 4 we show a numerical realization of the SANCR method.

Linearized confidence region and likelihood ratio test
As explained above, there are several methods to approximate (nonlinear) confidence regions.Our method is based on the following two approaches [20].
For a given estimator θ of the parameter θ, we consider: (i) The method derived from the likelihood ratio test (LR) from which it follows where L is the likelihood function and γ 2 is the confidence level.(ii) The method based on the Wald test that leads to the linearized confidence regions (CL): where Cov is the estimated covariance matrix of the parameters.There are several approximations of Cov [19], we use the one based on the Jacobian J of f : where The level γ 2 = χ 2 1−α,n is given by the 1 − α percentile of the chi-square distribution with n degrees of freedom in case σ 2 is known, and it is It has been proved [8] that these two confidence regions are asymptotically equivalent, but far from the asymptotic behavior, i.e. in case of a small number of data, they perform differently as presented in [19].Additionally, our method show the limitation of linearized confidence regions based only on (6).
One of the major goals in defining the confidence regions is the reduction of the costs associated to their computation.From the perspective of the computational costs, the method CL is cheap since it needs only one evaluation of the covariance matrix at the parameter value θ, while the method LR is much more expensive because it is based on the evaluation of the functional S in an adequately high number of points θ in the vicinity of θ to produce a contour.In addition, the extension of the confidence region is not known a priori.In practice, the number of function evaluations needed for the method LR is in the order of several thousands, for example in our case with two parameters we use a grid of 10 4 points for the method LR.
On the contrary, as indicated in the expression (7), the covariance matrix can be evaluated at the cost of building the Jacobian J. Therefore, the major computational costs for the method CL are given by the computation of the derivatives of the model f with respect to the parameters.Thus, we have few computations of a linearized model for the method CL while many thousand computations of a nonlinear model are needed for the method LR.Unfortunately, the accuracy of these two methods is inversely related to their computational costs, with the CL method being much more inaccurate if the model is highly nonlinear.We remind that both methods are only asymptotically exact for linear models and their quality decreases far from the asymptotic behavior.
Therefore, a compromise between computational costs and precision is highly required for many practical applications especially in case the model is based on differential equations.To this aim we established a new method combining low computational costs and high accuracy.

Successive linearizations of nonlinear confidence regions
The SANCR method is based on the use of successive linearizations of the confidence region, starting from the estimated parameter value θ (see expression ( 2)) combined with the likelihood ratio test (5) as explained below examplarily for a model with two parameters.
The likelihood ratio test is used to check whether a point belongs or not to the approximate nonlinear confidence region.Instead of testing all points in the vicinity of θ we use an educated guess, i.e. the likelihood ratio test is performed only on few points lying on the contour of the linearized confidence regions.In fact, linearized confidence regions are ellipsoids in the parameter space and the directions of the semi-axis are defined by the eigenvectors of the covariance matrix as can be deduced by the quadratic form (6). Note that the covariance matrix has dimension n × n, where n is the number of parameters to estimate.Therefore, starting from θ we determine the directions of the principal axes and their length which is given by i = γ λ i , where λ i is the eigenvalue corresponding to the i th eigenvector.We perform the likelihood ratio test for the extreme points of the semi-axes, see points θ A , θ B , θ C , θ D in Figure 1.
Let be θ A the first point to be processed.If this point passes the test, i.e. if the following condition is fulfilled it is considered for the construction of the confidence region and the procedure continues along the second axis.On the contrary, if the point θ A does not pass the test, it is discarded and a new candidate in the same direction θθ A is chosen.
A new point θ A along the selected semi-axis is taken by scaling 1 by a factor α < 1 as shown in Figure 2(a).This procedure is repeated with a new likelihood ratio test and possibly a rescaling (reducing α) until a point that satisfies the test Fig. 1: Definition of the points to perform the likelihood ratio test for two parameters.is found.Once this point, say θ new , has been found, we linearize the confidence region around this new point.To this aim we calculate the Jacobian J(θ new ) (see (8)) and the covariance Cov(θ new ) (see (7)).
After performing the eigendecomposition of the new covariance matrix, the principal axes might have changed direction due to the nonlinearity of the model, see Figure 2(b).Following the new principal directions, we can analogously find the next candidate points belonging to the confidence region, i.e. the points θ new,A , θ new,C and θ new,D , see Figure 2(b).The point θ new,B is not considered because it is the opposite extremal point of the same principal axis.In fact, instead of taking θ new,B , we perform the same procedure starting from θ B to approximate the confidence region in the direction θθ B .Therefore, this procedure is repeated along all principal axes considering both directions.

Stopping criterion
The search along one principal axis is stopped if the distance of the next accepted point, let's say θ new,A , to the previous one is less than a given tolerance then the point θ new,A is retained to define the nonlinear confidence region, see Figure 3.

Contour approximation
The countour of the nonlinear confidence region is approximated by connecting all retained points, in our case θ new,A , θ new,C , θ new,D , θ C , θ D and θ B .These points are linearly connected as shown in Figure 3(b).

Numerical results
As an example the following model is considered where the parameter θ 1 and θ 2 are estimated by the nonlinear least squares method.To simulate the parameter estimation process we have applied perturbed data generated using the "true" values of the parameters according to the following model response: where e i is a random variable distributed as N (0, σ 2 ).The Table 1 indicates the values θ true and σ 2 used in the calculations, and the least squares estimated values θ found by minimizing S(θ) for a realization of the observations y i .Additionally, the Table 2 includes the measurement positions t i .One stopping criterion of the SANCR method is that the distance of two successive candidates is smaller than a given tolerance T OL, see (9).We have used T OL = 0.15.
To evaluate the results of our approach we compare it with a Markov Chain Monte Carlo (MCMC) method described in [10] using the associated MCMC toolbox for Matlab.In fact, an alternative way to perform a statistical analysis of nonlinear models is the use of the Bayes's theorem [5].Bayesian inference is not the focus of our work, therefore we refer for example to [6] for a presentation of the Bayesian approach.Since the MCMC method does not allow to easily define a stopping criterion to assure convergence, we have set to 5 • 10 6 the number of model evaluations in the MCMC code.
In Figure 4 the approximations of the confidence region using the four methods can be qualitatively compared.The blue dots (for the colors see the electronic version) are the points of the MCMC method.The cyan ellipse is the linearized confidence region of the method CL.The green curve is the confidence region approximated by the method LR and the red curve is the confidence region approximated by the SANCR method.
One can observe that the linearized confidence region CL is much smaller than the MCMC approximation and that it is not centered in it.The SANCR method is an approximation of the confidence region defined by the method LR obtained at a much lower computational cost than the method LR itself.The computational costs are reported in tables 3 and 4. The method CL is very cheap with only one evaluation of the nonlinear model and the evaluations of the sensitivities with respect to the two parameters, but its quality is not satisfactory.The SANCR method uses 59 function evaluations and 42 ellipses.The latter correspond to 84 sensitivity evaluations according to the number of two parameters.The LR and the MCMC methods have been used here with 10 4 , respectively 5 • 10 6 , model evaluations.

Table 1 :
Parameters and variance

Table 4 :
Derivatives computations of the four methods SANCR CL LR MCMC Fig. 4: Confidence region approximated by the four methods.