The Optimization Solver
The Optimization Solver node contains settings for selecting a gradient-based optimization method and specifying related solver properties.
This section describes Solver features available with the Optimization Module. See also Studies and Solvers in the COMSOL Multiphysics Reference Manual for more information about solvers in general.
General
The Optimality tolerance, Maximum number of model evaluations, and Method settings are fundamental and can be controlled from an Optimization study step.
Defined by Study Step
Select Defined by study step (the default) to let an Optimization study step control the fundamental optimization method settings. Select User defined to specify the settings directly in this node.
Optimality Tolerance
Specify the Optimality tolerance, which has the default value 1e-3. See About Gradient-Based Solvers. Note that this tolerance can be too strict, in particular if the forward multiphysics model is not solved accurately enough. See About Optimality Tolerances.
Maximum Number of Model Evaluations
Specify the Maximum number of model evaluations, which defaults to 1000. This number limits the number of times the objective function is evaluated, which in most cases is related to the number of times the multiphysics model is simulated for different values of the optimization control variable. Note, however, that it is not equal to the number of iterations taken by the optimizer because each iteration can invoke more than a single objective function evaluation. Furthermore, by setting this parameter to a smaller value and calling the optimization solver repeatedly, you can study the convergence rate and stop when further iterations with the optimization solver no longer have any significant impact on the value of the objective function.
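As an illustration of this restart strategy, the following sketch replaces the actual solver call with a toy gradient-descent stand-in; the function names are hypothetical and not part of any COMSOL API:

def run_optimization(x, max_model_evaluations):
    """Toy stand-in for one optimization-solver call with a small
    evaluation budget (here: gradient descent on f(x) = (x - 3)^2)."""
    for _ in range(max_model_evaluations):
        x -= 0.2 * (x - 3.0)           # one "model evaluation" per step
    return x, (x - 3.0) ** 2

def converge_by_restarts(x0, budget=20, reltol=1e-3, max_restarts=25):
    x, f_prev = x0, None
    for restart in range(max_restarts):
        x, f = run_optimization(x, max_model_evaluations=budget)
        if f_prev is not None and abs(f_prev - f) <= reltol * max(abs(f_prev), 1.0):
            break                      # negligible improvement: stop restarting
        f_prev = f
    return x, f, restart + 1

print(converge_by_restarts(0.0))       # x converges to ~3.0 after a few restarts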
Optimization Solver
This section contains settings related to the numerical methods that the solvers use.
Method
The four available choices are SNOPT (the default), IPOPT, MMA, and Levenberg-Marquardt. The Levenberg-Marquardt method can only be used for problems of least-squares type without constraints or bounds on the control variables, while SNOPT, IPOPT, and MMA can solve any type of optimization problem. See About Gradient-Based Solvers.
Solution
This setting controls the behavior when the solver node under the Optimization solver node returns a solution containing more than one solution vector (for example, a frequency response). The SNOPT, IPOPT, and Levenberg-Marquardt solvers only support the Auto setting, which in practice means the sum over frequencies and parameters, or the last time step. For MMA, the options are the same as for the derivative-free solvers: Auto, Use first, Use last, Sum of objectives, Minimum of objectives, and Maximum of objectives. The last two settings make the MMA algorithm handle maximin and minimax problems efficiently.
When optimizing over a Time Dependent study step using a gradient-based solver, the objective and its gradient are always evaluated only for the last time step. MMA still presents multiple options, but these are effectively ignored since there is only one objective value that can be used.
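As an illustration of these options, the following sketch (plain Python, not COMSOL code) shows how a list of per-solution objective values could be reduced to a single objective under each setting; the option names are paraphrased:

def aggregate(objectives, option):
    reducers = {
        "use_first": lambda v: v[0],
        "use_last": lambda v: v[-1],
        "sum": sum,
        "min": min,    # with MMA, enables maximin-type problems
        "max": max,    # with MMA, enables minimax-type problems
        "auto": sum,   # in practice: sum over frequencies/parameters,
                       # or the last time step for time-dependent solutions
    }
    return reducers[option](objectives)

print(aggregate([0.3, 0.1, 0.7], "max"))   # -> 0.7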
Objective Contributions
When SNOPT, IPOPT, or MMA is used, the expression used as the objective function can be controlled through this setting. The default is All, in which case the sum of all objective contributions not deactivated in an Optimization study step is used as the objective function.
By selecting Manual, you can enter an expression that is used as the objective function in the Objective expression field. The expression all_obj_contrib represents the sum of all objective contributions not deactivated in a controlling Optimization study step. Hence, this expression leads to the same optimization problem as selecting All. Note, however, that MMA treats least-squares objective contributions in a more efficient way when All is selected.
When you use Levenberg-Marquardt, the objective function is always the sum of all active least-squares objective contributions present in the model.
Gradient Method
SNOPT, IPOPT, MMA, and Levenberg-Marquardt are gradient-based methods. The gradient can be computed according to the choices Automatic, analytic (the default), Forward, Adjoint, Forward Numeric, and Numeric. Numeric is not supported by MMA. When Automatic, analytic is chosen, either the adjoint method or the forward method is used to compute the gradient analytically. The adjoint method is used when the number of optimization degrees of freedom is larger than the number of objective functions plus the number of global and integral constraints plus two; otherwise, the forward method is used.
It is also possible to explicitly choose either the adjoint or the forward method using the corresponding alternatives from the menu. The Forward Numeric option provides a semi-analytic approach, where the gradient of the PDE residual with respect to the control variables is computed by numerical perturbation and then substituted into the forward analytic method. When Numeric is chosen, finite differences are used to compute the gradient numerically.
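The selection rule for Automatic, analytic can be summarized in a short sketch (an illustration of the stated rule only, not COMSOL internals):

def choose_analytic_method(n_control_dofs, n_objectives, n_constraints):
    # n_constraints counts global and integral constraints, as stated above
    threshold = n_objectives + n_constraints + 2
    return "adjoint" if n_control_dofs > threshold else "forward"

print(choose_analytic_method(1000, 1, 3))   # many control DOFs -> 'adjoint'
print(choose_analytic_method(2, 1, 3))      # few control DOFs  -> 'forward'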
Gradient Method Parameters for Time-Dependent Problems
For time-dependent problems, all analytic gradient methods have options to adjust the default integration tolerances for the sensitivity solver.
For the Forward and Forward Numeric gradient methods, a Forward sensitivity rtol factor can be specified. This factor is multiplied by the relative tolerance of the forward problem to obtain the relative tolerance for the sensitivity solver. You can also specify a Forward sensitivity scaled atol, which is a global absolute tolerance that is scaled with the initial conditions. The absolute tolerance for the sensitivity solution is updated if scaled absolute tolerances are updated for the forward problem.
When using the Adjoint gradient method, an Adjoint rtol factor and an Adjoint scaled atol can be given, which control the accuracy of the adjoint solution similarly to the corresponding Forward sensitivity factors. In addition, an Adjoint quadrature rtol factor and an Adjoint quadrature atol can be given. These settings control the relative and absolute accuracy of the time integrals (quadratures) used to calculate objective function gradients. Note that this absolute tolerance is unscaled.
The Adjoint gradient method uses checkpointing to reduce the amount of data that needs to be stored from the forward to the backward (adjoint) solution stage. Optionally, set the number of Adjoint checkpointing steps to control the number of checkpoints stored.
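The following sketch illustrates the general checkpointing idea under simplified assumptions (a generic one-step forward update and evenly spaced checkpoints); it is not the actual solver implementation:

def forward_step(state, i):
    return state + 0.1 * i              # stand-in for one forward time step

def adjoint_sweep_with_checkpoints(x0, n_steps, n_checkpoints):
    stride = max(1, n_steps // n_checkpoints)
    checkpoints, state = {0: x0}, x0
    for i in range(n_steps):            # forward pass: store states sparsely
        state = forward_step(state, i)
        if (i + 1) % stride == 0:
            checkpoints[i + 1] = state
    recomputed = 0
    for i in reversed(range(n_steps)):  # backward (adjoint) pass
        base = max(c for c in checkpoints if c <= i)
        state = checkpoints[base]
        for j in range(base, i):        # recompute states within the segment
            state = forward_step(state, j)
            recomputed += 1
        # ... the state at step i is now available to advance the adjoint ...
    return recomputed                   # extra work traded for O(n_checkpoints) memory

print(adjoint_sweep_with_checkpoints(1.0, n_steps=100, n_checkpoints=10))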
Numeric Gradient Method Parameters
When the Numeric gradient method is selected, you can further specify a Difference interval (default 1.5E-8). This is the relative magnitude of the numerical perturbations used for first-order gradient approximation in SNOPT and for all numeric differentiation in the Levenberg-Marquardt solver. SNOPT automatically chooses between first- and second-order gradient approximations, using the specified relative Central difference interval (default 6.0E-6) for central differencing.
For the Levenberg-Marquardt method you can choose the Gradient approximation order explicitly. Selecting First gives a less accurate gradient, while selecting Second gives a better approximation of the gradient. However, Second requires twice as many evaluations of the objective function for each gradient compared to First. In many applications, the increased accuracy obtained by choosing Second is not worth this extra cost.
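As a sketch of the two approximation orders, using the default relative intervals quoted above (generic finite differences; the relative perturbation rule h = delta*max(1, |x|) is an assumption, not necessarily the solver's exact choice):

def forward_diff(f, x, delta=1.5e-8):
    h = delta * max(1.0, abs(x))            # relative perturbation size
    return (f(x + h) - f(x)) / h            # one extra evaluation per component

def central_diff(f, x, delta=6.0e-6):
    h = delta * max(1.0, abs(x))
    return (f(x + h) - f(x - h)) / (2 * h)  # two extra evaluations per component

f = lambda x: x ** 3
print(forward_diff(f, 2.0), central_diff(f, 2.0))   # both close to 12.0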
Store Functional Sensitivity
The sensitivity of the objective function is by default stored in the solution object so that it can be postprocessed after the solver has completed. To save memory by discarding this information, set Store functional sensitivity to Off. If you instead choose On for results while solving, sensitivity information is also computed continuously during the solution and made available for probing and plotting while solving. This is the most expensive option.
SNOPT-Specific Settings
When using SNOPT, you can specify which solver to use for solving linear systems containing a reduced Hessian approximation, which is in principle a full matrix. Solving a system involving this matrix is necessary in order to take a single step in the active-set algorithm used for solving the QP subproblems that are formed during each major SQP iteration. Select one of the following strategies from the QP Solver list:
Cholesky — This option computes the full Cholesky factor of the reduced Hessian at the start of each major iteration. As the QP iterations (minor iterations) proceed, the dimension of the Cholesky factor changes with the number of superbasic variables, and the factor is updated accordingly. If the number of superbasic variables increases beyond a preset limit (1000), the reduced Hessian cannot be stored and the solver switches to the conjugate gradient method.
Conjugate gradient — This method uses the conjugate-gradient method to solve all systems involving the reduced Hessian, which is accessed only implicitly, in the form of a black-box linear operator acting on the superbasic variables (see the sketch after this list). Since no data is stored between inner iterations, the method is most appropriate when the number of superbasics is large but each QP subproblem requires relatively few minor iterations. Selecting Conjugate gradient also triggers a limited-memory procedure that stores only a fixed number of BFGS update vectors, together with a diagonal Hessian approximation, between major iterations.
Quasi-Newton — This method uses a quasi-Newton strategy to update an approximation of the Cholesky factor of the reduced Hessian as the iterations proceed. It has the same memory requirement as the Cholesky option, but does not recompute the complete Cholesky factor at the beginning of each major iteration. It can be an appropriate choice when the number of superbasics is large but the nonlinear problem is well-scaled and well-behaved such that relatively few major iterations are needed for the approximate Hessian to stabilize.
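As a minimal illustration of the black-box operator idea behind the Conjugate gradient option, the following sketch solves a small system using only matrix-vector products (a generic CG method, not SNOPT code):

def cg(apply_H, b, tol=1e-12, max_iter=200):
    x = [0.0] * len(b)
    r = list(b)                          # residual r = b - H x, with x = 0
    p = list(r)
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Hp = apply_H(p)                  # the operator is only applied, never formed
        alpha = rs / sum(pi * hpi for pi, hpi in zip(p, Hp))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, Hp)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:                 # squared-norm convergence test
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x

# Example: H = diag(2, 5) applied as a black box; solves H x = b.
print(cg(lambda v: [2.0 * v[0], 5.0 * v[1]], [2.0, 10.0]))   # -> approx [1.0, 2.0]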
In the Use step condition field, you can enter an expression that tells the optimization solver when to reduce the step length in the line search that SNOPT uses to generate the next iterate.
The solver uses the condition to restrain the iterates from entering into areas in the control-variable space where the PDE problem is not well defined. A typical example is when a mesh element becomes inverted during geometry optimization using a Moving Mesh interface. A step limit condition that identifies this situation might be of the form minqual1_ale-0.05, where 0.05 is a threshold value for the mesh quality. This step limit condition has a direct analog in the stop condition for the time-dependent and parametric solvers.
When the step limit condition is violated, the solver reduces the line-search step until an acceptable point is found. However, because no Jacobian is computed for the step limit condition, there is no mechanism to prevent the solver from immediately attempting another step in the same infeasible direction. As a result, the solver might get stuck at the same point without converging until it reaches the maximum number of model evaluations or you stop the iteration manually.
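The following sketch illustrates how such a condition can interact with a line search, under the assumption that a step is acceptable when the condition expression is positive; the names used are hypothetical:

def backtrack_step(x, direction, step_condition, alpha=1.0, max_halvings=30):
    for _ in range(max_halvings):
        candidate = [xi + alpha * di for xi, di in zip(x, direction)]
        if step_condition(candidate) > 0:    # e.g. minqual1_ale - 0.05 stays positive
            return candidate, alpha
        alpha *= 0.5                         # reduce the line-search step
    raise RuntimeError("no acceptable step found in this direction")

# Example: keep x[0] above 0.05 while stepping toward the origin.
print(backtrack_step([1.0], [-2.0], lambda c: c[0] - 0.05))   # -> ([0.5], 0.25)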
You can specify a linesearch tolerance as a value between 0 and 1 in the Linesearch tolerance field (default value: 0.9). This controls the accuracy with which a step length is located along the direction of search in each iteration. At the start of each linesearch, a target directional derivative for the merit function is identified, and this parameter determines the accuracy to which the target value is approximated. With a loose tolerance (a value close to 1), each search typically requires only 1–5 function values, but many function calls may then be needed to estimate missing gradients for the next iteration.
From the Linesearch strategy list, choose Derivative (the default) or Nonderivative. At each major iteration a linesearch is used to improve the merit function. A derivative linesearch uses safeguarded cubic interpolation and requires both function and gradient values to compute estimates of the step. If some analytic derivatives are not provided, or a nonderivative linesearch is specified, SNOPT uses a linesearch based on safeguarded quadratic interpolation, which does not require gradient evaluations.
A nonderivative linesearch can be slightly less robust on difficult problems, and it is recommended that you use the default derivative linesearch if the functions and derivatives can be computed at approximately the same cost. If the gradients are very expensive relative to the functions, a nonderivative linesearch may give a significant decrease in computation time.
IPOPT-Specific Settings
Absolute tolerance on the dual infeasibility. Successful termination requires that the max-norm of the (unscaled) dual infeasibility is less than the Dual infeasibility absolute tolerance. The default value is 1.
Absolute tolerance on the constraint violation. Successful termination requires that the max-norm of the (unscaled) constraint violation is less than the Constraint violation absolute tolerance. The default value is 0.1.
Absolute tolerance on the complementarity. Successful termination requires that the max-norm of the (unscaled) complementarity is less than the Complementary conditions absolute tolerance. The default value is 0.1.
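These three tests can be summarized in a short sketch (an illustration of the stated criteria and their default values, not IPOPT source code):

def converged(dual_infeasibility, constraint_violation, complementarity,
              dual_tol=1.0, constr_tol=0.1, compl_tol=0.1):
    max_norm = lambda v: max(abs(x) for x in v) if v else 0.0
    return (max_norm(dual_infeasibility) < dual_tol
            and max_norm(constraint_violation) < constr_tol
            and max_norm(complementarity) < compl_tol)

print(converged([0.5, -0.2], [0.01], [0.05, 0.02]))   # -> True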
You can choose a Linear Solver for the step computations. The options are MUMPS (the default) and PARDISO.
If MUMPS is selected, you can specify the Percentage increase in the estimated working space. The default value is 1000. A small value can reduce memory requirements at the expense of computational time.
MMA-Specific Settings
The Move limits option makes it possible to bound the maximum absolute change of any (scaled) control variable between two outer iterations. This is particularly relevant when the 1987 version of the algorithm is used, because that version does not have an inner loop to ensure improvement of the objective and satisfaction of constraints.
By default, the MMA solver continues to iterate until the relative change in any control variable is less than the optimality tolerance. If the Maximum outer iterations option is enabled, the solver stops either on the tolerance criterion or when the number of outer iterations exceeds the specified maximum.
The Optimization Module’s globally convergent version of the MMA solver has an inner loop which ensures that each new outer iteration point is feasible and improves on the objective function value. By default, the Maximum number of inner iterations per outer iteration is 10. When the maximum number of inner iterations is reached, the solver continues with the next outer iteration. It is possible to use the classical implementation of the MMA solver without inner iterations by clearing the Globally Convergent MMA check box.
The Internal tolerance factor is multiplied by the optimality tolerance to provide an internal tolerance number that is used in the MMA algorithm to determine if the approximations done in the inner loop are feasible and improve on the objective function value. The default is 0.1. Decrease the factor to get stricter tolerances and a more conservative solver behavior.
The MMA algorithm penalizes violations of the constraints by a number that is calculated as the specified Constraint penalty factor times 1e-4 divided by the optimality tolerance. Increasing this factor for a given optimality tolerance forces the solver to better respect constraints, while relatively decreasing the influence of the objective function.
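The two derived quantities described above follow directly from the stated formulas, as in this sketch (the constraint penalty factor value in the example call is an arbitrary illustration):

def mma_derived_quantities(optimality_tolerance, internal_tolerance_factor,
                           constraint_penalty_factor):
    internal_tolerance = internal_tolerance_factor * optimality_tolerance
    constraint_penalty = constraint_penalty_factor * 1e-4 / optimality_tolerance
    return internal_tolerance, constraint_penalty

# Defaults: optimality tolerance 1e-3, internal tolerance factor 0.1;
# the penalty factor 1.0 here is an arbitrary example value.
print(mma_derived_quantities(1e-3, 0.1, 1.0))   # -> (0.0001, 0.1)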
Levenberg-Marquardt-Specific Settings
The Levenberg-Marquardt method controls the step length and direction through a positive scalar regularization parameter. A value close to zero means that the optimization solver takes a step close to a full Gauss-Newton step. A large value means that it takes a small step close to the steepest-descent direction. See The Levenberg-Marquardt Solver.
The Levenberg-Marquardt method controls this damping parameter internally and tries to keep it as small as possible in order to approach second-order Newton convergence. Therefore, a small value of the Initial damping factor means that the solver starts out aggressively, while a large value means that the solver is initially more cautious.
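The following one-dimensional sketch illustrates this damping behavior for a toy least-squares problem (a generic Levenberg-Marquardt update, not the solver implementation):

import math

def lm_1d(residual, jacobian, x, lam=1e-3, iters=50):
    for _ in range(iters):
        r, J = residual(x), jacobian(x)
        step = -J * r / (J * J + lam)      # (J^T J + lam) dx = -J^T r in 1-D
        if residual(x + step) ** 2 < r ** 2:
            x += step
            lam *= 0.5                     # success: move toward a Gauss-Newton step
        else:
            lam *= 2.0                     # failure: damp more, take a shorter step
    return x

# Fit x so that exp(x) = 5, i.e. residual r(x) = exp(x) - 5.
print(lm_1d(lambda x: math.exp(x) - 5.0, math.exp, x=0.0))   # ~ ln 5 = 1.609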
Results While Solving
Select the Plot check box to plot the results while solving the model. Select a Plot group from the list and any applicable Probes.
Advanced
Use the Compensate for nojac terms list to specify whether to try to assemble the complete Jacobian if an incomplete Jacobian has been detected. Select:
Automatic (the default) to try to assemble the complete Jacobian if an incomplete Jacobian has been detected. If the assembly of the complete Jacobian fails, or in the case of nonconvergence, a warning is written and the incomplete Jacobian is used in the sensitivity analysis for stationary problems. For time-dependent problems, an error is returned.
On to try to assemble the complete Jacobian if an incomplete Jacobian has been detected. If the assembly of the complete Jacobian fails or in the case of nonconvergence, an error is returned.
Off to use the incomplete Jacobian in the sensitivity analysis.
See Theory for Stationary Sensitivity Analysis for details about the algorithm.
Constants
In this section you can define constants that can be used as temporary constants in the solver. You can use the constants in the model or to define values for internal solver parameters. Click the Add button to add a constant and then define its name in the Constant name column and its value (a numerical value or parameter expression) in the Constant value column. By default, any defined parameters are first added as the constant names, but you can change the names to define other constants. Click Delete to remove the selected constant from the list.
Log
The Log section displays information about the progress of the solver. See The Optimization Solver Log.