About Optimality Tolerances
The optimality tolerance is an important setting for all optimization solvers. It is intended to represent the relative accuracy in the final scaled control variable values, but because solver implementations differ widely, uniform behavior cannot be guaranteed.
In particular, the optimality tolerance can play tricks on you if your objective function or your optimization variables are badly scaled. Therefore, take care to specify correct scales for your control variables, and make sure that objective functions and constraints are of order 1, or at least not too far from it, for reasonable values of the control variables.
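As a rough illustration of the kind of rescaling meant here (all magnitudes and function names below are made up for the example), dividing the objective by a typical value and expressing the control variable in units of a typical value brings both to order 1:

import numpy as np

# Illustrative rescaling of a badly scaled problem; the magnitudes are made up.
OBJ_SCALE = 1.0e6    # typical magnitude of the raw objective value
X_SCALE = 1.0e-4     # typical magnitude of the raw control variable

def raw_objective(x_raw):
    # Placeholder for the actual model evaluation (made up for this example).
    return 1.0e6 * (x_raw / 1.0e-4 - 1.0) ** 2

def scaled_objective(x_scaled):
    # The optimizer now works with a control variable and an objective of order 1.
    return raw_objective(x_scaled * X_SCALE) / OBJ_SCALE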
Tweaking the Optimality tolerance parameter might be necessary if you run into convergence problems. For example, if the optimization solver reports a converged solution after just a few iterations, try restarting it with a tighter tolerance to make sure that it has actually found the solution. If, on the contrary, it seems to iterate forever even though the value of the objective function has converged (check the output on the Log page in the Progress window), chances are that the tolerance value is too strict.
Optimality Tolerance for Derivative-Free Methods
For the derivative-free optimization methods, the optimality tolerance, with a default value of 0.01, is used to determine whether a stationary point has been reached. The BOBYQA, COBYLA, and Nelder–Mead methods stop iterating as soon as no improvement over the current best estimate can be found using steps in the scaled control variables of relative size larger than or equal to the optimality tolerance. The EGO method stops iterating as soon as no improvement larger than the optimality tolerance can be found over the weighted average of the ten most recent scaled estimates.
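The following toy coordinate search is not BOBYQA, COBYLA, or Nelder–Mead, but it illustrates the type of stopping rule described above: the loop terminates once no improving step of relative size at least the optimality tolerance can be found in the scaled variables.

import numpy as np

def toy_pattern_search(f, x0, tol=0.01, step0=0.5, max_iter=1000):
    # Toy derivative-free search illustrating the stopping rule only;
    # it is not any of the actual algorithms named above.
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    step = step0
    for _ in range(max_iter):
        improved = False
        for i in range(x.size):
            for s in (step, -step):
                trial = x.copy()
                trial[i] += s
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            if step <= tol:   # no improvement found with steps >= tol: stop
                break
            step /= 2.0       # otherwise refine the step and keep searching
    return x, fx

For example, toy_pattern_search(lambda x: ((x - 1.0) ** 2).sum(), [0.0, 0.0]) stops near (1, 1) once the step has shrunk to the tolerance.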
Compared to gradient-based optimization methods, which compute their updates from the gradient of the objective function with respect to the control variables, derivative-free methods explore the region around the current point using function evaluations only and rely on that information to determine convergence.
Optimality Tolerance for SNOPT
For SNOPT, the optimality tolerance parameter (corresponding to the major optimality tolerance in Ref. 6 and further explained together with parameter Opttol), with a default of 1.0·10⁻³, is used by the linear and quadratic solvers to determine, on the basis of the reduced-gradient size, whether optimality has been reached. More precisely, it regulates the accuracy to which the final iterate in SNOPT is required to fulfill the first-order conditions for optimality.
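Schematically (this is the generic first-order stationarity requirement, not SNOPT's exact scaled test, which is given in Ref. 6), the final iterate x* and its multipliers λ* are required to satisfy

\[
\big\| \nabla f(x^*) - A(x^*)^{T} \lambda^* \big\|_\infty \;\lesssim\; \mathrm{tol}\,\bigl(1 + \|\lambda^*\|\bigr),
\]

where A denotes the Jacobian of the constraints, so the Optimality tolerance controls how small the reduced gradient must become before SNOPT declares optimality.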
When SNOPT cannot achieve the requested tolerance level, the solver eventually returns a solution together with a warning message indicating that the requested accuracy could not be achieved.
Theoretically, the Optimality tolerance should not be set smaller than the square root of the function precision. The latter is the expected stability of the numerical model rather than its accuracy as a model of physical reality. When using a direct linear solver on a linear model, the function precision is generally of the same order as the inverse of the condition number. For a nonlinear or iterative solver, you can expect the precision to be of the same order as the solver tolerances, which is then also the numerical precision in the evaluation of the objective and constraints.
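As a numerical illustration (the numbers are hypothetical), suppose the objective can only be evaluated to a relative precision of about \( \varepsilon_f = 10^{-6} \). Then the rule above gives

\[
\mathrm{tol} \;\ge\; \sqrt{\varepsilon_f} \;=\; 10^{-3},
\]

so tightening the Optimality tolerance below 10⁻³ cannot be expected to produce a more accurate solution.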
Furthermore, even when you set the Optimality tolerance based on the function precision, the same exit condition might occur. At present, the only remedy is to increase the accuracy of the function calculation, using all available means.
Optimality Tolerance for IPOPT
There are three tolerances for IPOPT:
Dual infeasibility absolute tolerance bounds the sensitivity of the objective with respect to the controls for unconstrained problems. For constrained problems, the sensitivity of the objective can be arbitrarily large at optimality, so an expression involving the constraints is used instead; see The IPOPT Solver.
Constraint violation absolute tolerance can be decreased to reduce constraint violations. Note, however, that scaling the objective and constraint functions can often achieve the same effect without introducing other numerical issues.
Complementary conditions absolute tolerance is similar to the dual infeasibility tolerance, except that it also takes the bounds on the controls into account, because the sensitivity with respect to controls with active bounds can be arbitrarily large at optimality. The quantities bounded by these three tolerances are sketched below.
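Schematically, in the standard interior-point formulation on which IPOPT is based, the three tolerances bound quantities of the following form (the exact internal scaling may differ):

\[
\underbrace{\big\| \nabla f(x) + \nabla c(x)\,\lambda - z_L + z_U \big\|_\infty}_{\text{dual infeasibility}},
\qquad
\underbrace{\big\| c(x) \big\|_\infty}_{\text{constraint violation}},
\qquad
\underbrace{\max_i \bigl\{ (x_i - x_{L,i})\, z_{L,i},\; (x_{U,i} - x_i)\, z_{U,i} \bigr\}}_{\text{complementarity}},
\]

where c(x) collects the constraints, λ are their multipliers, and z_L and z_U are the multipliers associated with the lower and upper bounds on the controls.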
See also IPOPT-Specific Settings and Termination under The IPOPT Solver.
Optimality Tolerance for MMA
The MMA solver terminates when the relative change in all scaled control variables is less than the specified optimality tolerance parameter, which has a default of 1.0·10⁻³. The relative change of a variable is its change since the last outer iteration divided by the range of the variable, that is, the upper bound minus the lower bound. For unbounded variables, the MMA solver internally estimates bounds based on the previous iteration points.
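A minimal sketch of this termination check, assuming bounded variables (the function and variable names are made up; for unbounded variables the solver's internally estimated bounds would take the place of lower and upper):

import numpy as np

def mma_converged(x_new, x_old, lower, upper, tol=1.0e-3):
    # Relative change of each scaled control variable, measured against its range.
    rel_change = np.abs(x_new - x_old) / (upper - lower)
    # Terminate the outer iteration when every relative change is below the tolerance.
    return bool(np.all(rel_change < tol))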
Optimality Tolerance for Levenberg–Marquardt
Let tol be the specified optimality tolerance. Define told = γd·tol, where γd is the defect reduction tolerance factor, and tolx = γx·tol, where γx is the control variable tolerance factor. Moreover, let the defect vector d be defined in terms of ωl and fl from Equation 5-11 and Equation 5-12, where L is the total number of measurement evaluations. When the Levenberg–Marquardt solver is used, the following conditions are used to determine when optimality has been reached:

Terminate when the current defect vector dj has been sufficiently reduced relative to the initial defect vector d0, as measured by the defect reduction tolerance told.
Terminate when the relative increment of the scaled control variable x is below the control variable tolerance tolx. This quantity is referred to as the stepsize in the Convergence Plot and in the Log window.
Terminate when the gradient-based measure formed from the current defect vector dj and the Jacobian J is below the optimality tolerance. This quantity is referred to as errJ in the Convergence Plot and in the Log window.
The default values of the optimality tolerance, defect reduction tolerance factor, and control variable tolerance factor are 1.0·10⁻³, 1, and 1, respectively. The first termination condition above is not used by default and must be enabled in order to be included. The minimum of the latter two quantities, min(errJ, stepsize), is shown in the Convergence Plot.
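As an illustration only, the following sketch implements one plausible reading of these three checks; the specific norms and normalizations (relative defect, relative stepsize, and an errJ measure formed from Jᵀdj) are assumptions made for the sketch, not the solver's actual definitions.

import numpy as np

def lm_termination(d0, dj, x, dx, J, tol=1.0e-3, gamma_d=1.0, gamma_x=1.0):
    # Illustrative Levenberg-Marquardt termination checks; the exact formulas
    # used by the solver may differ from the assumed ones below.
    tol_d = gamma_d * tol   # defect reduction tolerance told
    tol_x = gamma_x * tol   # control variable tolerance tolx

    # Assumed defect reduction check (the condition that is off by default).
    defect_ok = np.linalg.norm(dj) <= tol_d * np.linalg.norm(d0)

    # Assumed stepsize check: relative increment of the scaled control variables.
    stepsize = np.linalg.norm(dx) / max(np.linalg.norm(x), 1.0)
    step_ok = stepsize <= tol_x

    # Assumed gradient check: errJ formed from the Jacobian and the defect vector.
    errJ = np.linalg.norm(J.T @ dj) / max(np.linalg.norm(dj), np.finfo(float).tiny)
    grad_ok = errJ <= tol

    # min(errJ, stepsize) corresponds to the quantity shown in the Convergence Plot.
    return (defect_ok or step_ok or grad_ok), min(errJ, stepsize)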