Welcome to simplenlopt’s documentation!

Features

  • SciPy like minimize(method=’NLopt algorithm’) API for NLopt’s local optimizers

  • Automatic numerical approximation of the gradient if analytical gradient is not available

  • Automatic handling of constraints via the augmented lagrangian method without boilerplate code

  • Scipy like interfaces to NLopt’s global optimizers with hard stopping criteria

  • SciPy like curve fitting using NLopt’s algorithms

Installation

pip install simplenlopt

Example: Minimizing the Rosenbrock function in simplenlopt and scipy

import simplenlopt
from scipy.optimize import rosen, rosen_der
import scipy
import numpy as np

x0 = np.array([0.5, 1.8])

res = simplenlopt.minimize(rosen, x0, jac = rosen_der)
print("Found optimum: ", res.x)

res_scipy = scipy.optimize.minimize(rosen, x0, jac = rosen_der)
print("Found optimum: ", res_scipy.x)

Documentation

Quickstart

In this tutorial, we will show how to solve a famous optimization problem, minimizing the Rosenbrock function, in simplenlopt. First, let’s define the Rosenbrock function and plot it:

\[f(x, y) = (1-x)^2+100(y-x^2)^2\]
[4]:
import numpy as np

def rosenbrock(pos):

    x, y = pos
    return (1-x)**2 + 100 * (y - x**2)**2

xgrid = np.linspace(-2, 2, 500)
ygrid = np.linspace(-1, 3, 500)

X, Y = np.meshgrid(xgrid, ygrid)

Z = (1 - X)**2 + 100 * (Y -X**2)**2

x0=np.array([-1.5, 2.25])
f0 = rosenbrock(x0)

#Plotly not rendering correctly on Readthedocs, but this shows how it is generated! Plot below is a PNG export
import plotly.graph_objects as go
fig = go.Figure(data=[go.Surface(z=Z, x=X, y=Y, cmax = 10, cmin = 0, showscale = False)])
fig.update_layout(
    scene = dict(zaxis = dict(nticks=4, range=[0,10])))
fig.add_scatter3d(x=[1], y=[1], z=[0], mode = 'markers', marker=dict(size=10, color='green'), name='Optimum')
fig.add_scatter3d(x=[-1.5], y=[2.25], z=[f0], mode = 'markers', marker=dict(size=10, color='black'), name='Initial guess')
fig.show()

The Rosenbrock optimization problem

The crux of the Rosenbrock function is that the minimum indicated by the green dot is located in a very narrow, banana shaped valley with a small slope around the minimum. Local optimizers try to find the optimum by searching the parameter space starting from an initial guess. We place the initial guess shown in black on the other side of the banana.

In simplenlopt, local optimizers are called by the minimize function. It requires the objective function and a starting point. The algorithm is chosen by the method argument. Here, we will use the derivative-free Nelder-Mead algorithm. Objective functions must be of the form f(x, ...) where x represents a numpy array holding the parameters which are optimized.

[5]:
import simplenlopt

def rosenbrock(pos):

    x, y = pos
    return (1-x)**2 + 100 * (y - x**2)**2

res = simplenlopt.minimize(rosenbrock, x0, method = 'neldermead')
print("Position of optimum: ", res.x)
print("Function value at Optimum: ", res.fun)
print("Number of function evaluations: ", res.nfev)
Position of optimum:  [0.99999939 0.9999988 ]
Function value at Optimum:  4.0813291938703456e-13
Number of function evaluations:  232

The optimization result is stored in a class whose main attributes are the position of the optimum and the function value at the optimum. The number of function evaluations is a measure of performance: the less function evaluations are required to find the minimum, the faster the optimization will be.

Next, let’s switch to a derivative based solver. For better performance, we also supply the analytical gradient which is passed to the jac argument.

[6]:
def rosenbrock_grad(pos):

    x, y = pos
    dx = 2 * x -2 - 400 * x * (y-x**2)
    dy = 200 * (y-x**2)

    return dx, dy

res_slsqp = simplenlopt.minimize(rosenbrock, x0, method = 'slsqp', jac = rosenbrock_grad)
print("Position of optimum: ", res_slsqp.x)
print("Function value at Optimum: ", res_slsqp.fun)
print("Number of function evaluations: ", res_slsqp.nfev)
Position of optimum:  [1. 1.]
Function value at Optimum:  1.53954903146237e-20
Number of function evaluations:  75

As the SLSQP algorithm can use gradient information, it requires less function evaluations to find the minimum than the derivative-free Nelder-Mead algorithm.

Unlike vanilla NLopt, simplenlopt automatically defaults to finite difference approximations of the gradient if it is not provided:

[7]:
res = simplenlopt.minimize(rosenbrock, x0, method = 'slsqp')
print("Position of optimum: ", res.x)
print("Function value at Optimum: ", res.fun)
print("Number of function evaluations: ", res.nfev)
Position of optimum:  [0.99999999 0.99999999]
Function value at Optimum:  5.553224195710645e-17
Number of function evaluations:  75
C:\Users\danie\.conda\envs\simplenlopt_env\lib\site-packages\simplenlopt\_Core.py:178: RuntimeWarning: Using gradient-based optimization, but no gradient information is available. Gradient will be approximated by central difference. Consider using a derivative-free optimizer or supplying gradient information.
  warn('Using gradient-based optimization'

As the finite differences are not as precise as the analytical gradient, the found optimal function value is higher than with analytical gradient information. In general, it is aways recommended to compute the gradient analytically or by automatic differentiation as the inaccuracies of finite differences can result in wrong results and bad performance.

For demonstration purposes, let’s finally solve the problem with a global optimizer. Like in SciPy, each global optimizer is called by a dedicated function such as crs() for the Controlled Random Search algorithm. Instead of a starting point, the global optimizers require a region in which they seek to find the minimum. This region is provided as a list of (lower_bound, upper_bound) for each coordinate.

[8]:
bounds = [(-2., 2.), (-2., 2.)]
res_crs = simplenlopt.crs(rosenbrock, bounds)
print("Position of optimum: ", res_crs.x)
print("Function value at Optimum: ", res_crs.fun)
print("Number of function evaluations: ", res_crs.nfev)
Position of optimum:  [1.00000198 1.00008434]
Function value at Optimum:  6.45980535135501e-07
Number of function evaluations:  907

Note that using a global optimizer is overkill for a small problem like the Rosenbrock function: it requires many more function evaluations than a local optimizer. Global optimization algorithms shine in case of complex, multimodal functions where local optimizers converge to local minima instead of the global minimum. Check the Global Optimization page for such an example.

Constrained optimization

Bound constraints

Often the parameters of an optimization problems are subject to (often abbreviated as s. t.) constraints. In practice, constraints make solving an optimization problem much more difficult. In the simplest case, these can be simple box constraints for the parameters.

Consider the Rosenbrock function with box constraints:

\[\begin{split}\begin{align} \underset{x, y}{argmin} (1-x)^2+100(y-x^2)^2 \\ \text{s. t. } x\ \in [-1, 0.5],\ y\in [0.5 , 1.5] \end{align}\end{split}\]

Bounds have to be provided as a list of (lower_bound, upper_bound) for each coordinate:

[1]:
import simplenlopt
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

def rosenbrock(pos):

    x, y = pos
    return (1-x)**2 + 100 * (y - x**2)**2

x0 = np.array([-0.5, 0.6])
bounds=[(-1, 0.5), (0.5, 1.5)]
res = simplenlopt.minimize(rosenbrock, x0, bounds = bounds)
print(res.x)

X = np.linspace(-2, 2, 500)
Y = np.linspace(-2, 2, 500)

XX, YY = np.meshgrid(X, Y)
Z = (1-XX)*(1-XX)+100*(YY-XX*XX)**2

fig, ax = plt.subplots()
plt.pcolormesh(XX, YY, Z, vmin = 0, vmax = 10)
box = plt.Rectangle((-1, 0.5),1.5, 1, facecolor='r', edgecolor='r', lw = 2, alpha = 0.3)
plt.colorbar(extend='max', label="$f(x, y)$")
ax.add_patch(box)
ax.scatter([res.x[0]], [res.x[1]], c='r', s = 15, label='Optimum')
handles, labels = ax.get_legend_handles_labels()
legend_patch = Patch(facecolor='red', edgecolor='r', alpha = 0.3,
                         label='Feasible region')
handles.append(legend_patch)
ax.legend(handles = handles)
ax.set_xlabel("x")
ax.set_ylabel("y")
plt.show()
[-0.69845641  0.5       ]
<ipython-input-1-b838b1401c57>:23: MatplotlibDeprecationWarning: shading='flat' when X and Y have the same dimensions as C is deprecated since 3.3.  Either specify the corners of the quadrilaterals with X and Y, or pass shading='auto', 'nearest' or 'gouraud', or set rcParams['pcolor.shading'].  This will become an error two minor releases later.
  plt.pcolormesh(XX, YY, Z, vmin = 0, vmax = 10)
_images/Constrained_Optimization_1_2.png

Inequality and equality constraints

General inequality and equality constraints reduce the parameter space by linear or nonlinear functions. Let’s say we are interested in the minimum of the rosenbrock function which fulfills that \(y\geq 0.5-x\) and \(y\leq 1-x^2\). In typical optimization notation, constraints are formulated by defining a function of the optimization parameters \(f(\mathbf{x})\leq 0\) for inequality constraints and \(f(\mathbf{x})=0\) for equality constraints. In this notation, our constrained Rosenbrock problem would be written as

\[\begin{split}\begin{align} \underset{x, y}{argmin} (1-x)^2 & +100(y-x^2)^2 \\ s.\ t.\ 0.5 -x -y &\leq 0\\ y+x^2-1 &\leq 0 \end{align}\end{split}\]

In SciPy/simplenlopt style, a constraint has to be provided as a dictionary with at least two keys: const={type:'ineq'/'eq', 'fun'}. 'fun' must be of the form fun(x, *args) just like the objective function. All constraints then have to be passed as a list to minimize.

[2]:
def rosenbrock(pos):

    x, y = pos
    return (1-x)**2 + 100 * (y - x**2)**2

def constraint_1(pos):
    x, y = pos
    return 0.5-x-y

def constraint_2(pos):
    x, y = pos
    return y+x*x -1

ineq_constraint_1 = {'type':'ineq', 'fun':constraint_1}
ineq_constraint_2 = {'type':'ineq', 'fun':constraint_2}

x0 = np.array([0.5, 0.5])
res = simplenlopt.minimize(rosenbrock, x0, constraints=[ineq_constraint_1, ineq_constraint_2])
print("Found optimum position: ", res.x)
print("Number of function evaluations: ", res.nfev)

#Plot the feasiable solution space and the found minimum
fig, ax = plt.subplots()
plt.pcolormesh(XX, YY, Z, vmin = 0, vmax = 10, alpha = 0.5)

y_constraint_1 = 0.5-X
y_constraint_2 = 1 - X*X

constraint_1_region_lower_bound = np.maximum(y_constraint_1, y_constraint_2)
constraint_2_region_upper_bound = np.minimum(y_constraint_1, y_constraint_2)

ax.fill_between(X, constraint_1_region_lower_bound, np.full_like(X, 2), color='green', edgecolor = 'green', alpha=0.1)
ax.fill_between(X, constraint_2_region_upper_bound, np.full_like(X, -2), color='blue', edgecolor = 'blue', alpha=0.45)

ax.fill_between(X, y_constraint_1, y_constraint_2, where=(y_constraint_2 > y_constraint_1), color='r', alpha=0.2)#where=(y1 > y2)
plt.colorbar(extend='max', label="$f(x, y)$")
ax.scatter([res.x[0]], [res.x[1]], c='r', s = 15, zorder=2, label='Optimum')
handles, labels = ax.get_legend_handles_labels()
legend_patch_feasible = Patch(facecolor='red', edgecolor='r', alpha = 0.3,
                         label='Feasible region')
legend_patch_constraint1 = Patch(facecolor='green', edgecolor='g', alpha = 0.3,
                         label='$y\geq 0.5-x$')
legend_patch_constraint2 = Patch(facecolor='blue', edgecolor='blue', alpha = 0.3,
                         label='$y\leq 1-x^2$')
handles.extend((legend_patch_feasible, legend_patch_constraint1, legend_patch_constraint2))
ax.set_xlim(-2, 2)
ax.set_ylim(-2, 2)
ax.legend(handles = handles, loc='lower center')
ax.set_xlabel("x")
ax.set_ylabel("y")
plt.show()
Found optimum position:  [0.70747223 0.49948304]
Number of function evaluations:  273
<ipython-input-2-b68453b802b1>:23: MatplotlibDeprecationWarning: shading='flat' when X and Y have the same dimensions as C is deprecated since 3.3.  Either specify the corners of the quadrilaterals with X and Y, or pass shading='auto', 'nearest' or 'gouraud', or set rcParams['pcolor.shading'].  This will become an error two minor releases later.
  plt.pcolormesh(XX, YY, Z, vmin = 0, vmax = 10, alpha = 0.5)
_images/Constrained_Optimization_3_2.png

Augmented Lagrangian

Many optimization algorithms were originally not developed to deal with constraints. NLopt provides a powerful way around this: the augmented Lagrangian. Similarly to regularization in machine learning, the augmented lagrangian adds increasing penalty terms to penalize violation of the constraints until they are met. This makes it possible to also use algorithms which were originally not designed to handle constraints. In simplenlopt, the minimize function automatically calls the augmented lagrangian if an optimizer cannot deal with the supplied constraints. Here, we will use the BOBYQA algorithm for the same constrained Rosenbrock problem.

[3]:
res = simplenlopt.minimize(rosenbrock, x0, method = 'bobyqa', constraints=[ineq_constraint_1, ineq_constraint_2])
print("Found optimum position: ", res.x)
print("Number of function evaluations: ", res.nfev)
Found optimum position:  [0.70747058 0.49948457]
Number of function evaluations:  753
C:\Users\danie\.conda\envs\simplenlopt_env\lib\site-packages\simplenlopt\_Core.py:479: RuntimeWarning: Method bobyqa does not support constraints. Constraints will be handled by augmented lagrangian. In case of problems consider method='COBYLA'.
  warn("Method {} does not support constraints. "

The augmented lagrangian can also be directly called with by the function ‘auglag’ which uses the same arguments as minimize. As auglag does not apply the same checks as minimize, calling it directly reduces some small overhead.

[4]:
res = simplenlopt.auglag(rosenbrock, x0, method = 'bobyqa', constraints=[ineq_constraint_1, ineq_constraint_2])
print(res.x)
[0.70747058 0.49948457]

Note that the augmented lagrangian is only implemented for the local optimizers but not for the global optimizers!

Global Optimization

Unlike local optimization algorithms which start from one point and typically a local minimum, global optimization algorithms seek to find the best minimum in a defined region. This means that they typically have a much higher chance to find the global optimum for functions with several local minima but require many more evaluations of the objective. As an example, consider the Styblinski-Tang function:

\[f(x, y)= 0.5\cdot (x^4-16x^2+5x + y^4-16y^2+5y)\]

This function has four local minima, one of which is a global minimum.

For such low-dimensional problems with a small number of local minima, the deterministic DIviding RECTangles algorithm (DIRECT) is often a good candidate. We will compare the performance of DIRECT and the default derivate free local optimizer in simplenlopt, BOBYQA, initiated at (0, 0). Note that global optimizers do not require an initial guess x0 but bounds in the form [(min, max), (min, max), ...].

[2]:
import simplenlopt
import plotly.graph_objects as go
import numpy as np

def styblinski_tang(pos):

    x, y = pos
    return 0.5 * (x**4 -16*x**2 + 5*x + y**4 - 16 * y**2 + 5 * y)

bounds = [(-4, 4), (-4, 4)]

x0 = np.array([0, 0])
res_bobyqa = simplenlopt.minimize(styblinski_tang, x0, bounds = bounds)
print("BOBYQA Optimum position: ", res_bobyqa.x)
print("BOBYQA Optimum value: ", res_bobyqa.fun)
print("BOBYQA Function evaluations: ", res_bobyqa.nfev)

res_direct = simplenlopt.direct(styblinski_tang, bounds)
print("DIRECT Optimum position: ", res_direct.x)
print("DIRECT Optimum value: ", res_direct.fun)
print("DIRECT Function evaluations: ", res_direct.nfev)

#Plot the function and the found minima
X = np.linspace(-4, 4, num = 100)
Y = np.linspace(-4, 4, num = 100)
BOBYQA Optimum position:  [-2.90353718  2.74681255]
BOBYQA Optimum value:  -64.19561235748431
BOBYQA Function evaluations:  28
DIRECT Optimum position:  [-2.90348693 -2.90362242]
DIRECT Optimum value:  -78.33233123410332
DIRECT Function evaluations:  255

Test

[5]:
#Plotly code to produce the above image. Unfortunately not rendering correctly on readthedocs.

XX, YY = np.meshgrid(X, Y)
F = 0.5 * (XX**4 -16*XX**2 + 5*XX + YY**4 - 16 * YY**2 + 5 * YY)

fig = go.Figure(data=[go.Surface(z=F, x=XX, y=YY, showscale = False)])
#fig.update_layout(
#    scene = dict(zaxis = dict(nticks=4, range=[0,10])))
fig.add_scatter3d(x=[res_bobyqa.x[0]], y=[res_bobyqa.x[1]], z=[res_bobyqa.fun], mode = 'markers', marker=dict(size=10, color='green'), name='BOBYQA')
fig.add_scatter3d(x=[res_direct.x[0]], y=[res_direct.x[1]], z=[res_direct.fun], mode = 'markers', marker=dict(size=10, color='black'), name='DIRECT')
camera = dict(
    eye=dict(x=-1.5, y=-1.5, z=0)
)

fig.update_layout(scene_camera=camera)
fig.show()

In depth: Gradients

This tutorial shows how to supply gradient information about an objective to simplenlopt in SciPy or NLopt style. One example for modern automatic differentiation via the external package autograd is also included. The studied optimization problem is again the Rosenbrock function. Its objective and partial derivatives are given by

\[\begin{split}\begin{align} f(x, y) & =(1-x)^2+100(y-x^2)^2\\ \frac{\partial f}{\partial x} &=-2(1-x)-400x(y-x^2) \\ \frac{\partial f}{\partial y} &=200(y-x^2) \end{align}\end{split}\]

jac=callable

The easiest case which is also shown in the quickstart example. Objective and gradient are supplied as two individual functions:

[1]:
import numpy as np
import simplenlopt
from time import time

def rosenbrock(pos):

    x, y = pos
    return (1-x)**2 + 100 * (y - x**2)**2

def rosenbrock_grad(pos):

    x, y = pos
    dx = 2 * x -2 - 400 * x * (y-x**2)
    dy = 200 * (y-x**2)

    return np.array([dx, dy])

x0=np.array([-1.5, 2.25])
%timeit -n 1000 res = simplenlopt.minimize(rosenbrock, x0, jac = rosenbrock_grad)
2.33 ms ± 51.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

jac = True

Taking another look at the objective and its partial derivaties you can see that the expression in the brackets appear in both the objective and the partial derivatives. If both are calculated in individual functions, these terms are unnecessarily recomputed. This can be avoided by supplying both the objective and its gradient in one function. That the objective also contains the gradient information, is indicated by setting jac=True. Let’s see how this works and how it affects the performance.

[2]:
def rosenbrock_incl_grad(pos):

    x, y = pos
    first_bracket = 1-x
    second_bracket = y-x*x

    obj = first_bracket*first_bracket+100*second_bracket*second_bracket
    dx = -2*first_bracket-400*x*second_bracket
    dy = 200 * second_bracket

    return obj, np.array([dx, dy])

%timeit -n 1000 res = simplenlopt.minimize(rosenbrock_incl_grad, x0, jac = True)
2.25 ms ± 44.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

We see a small performance improvement. For more complicated objective functions, the performance gains can be massive though it repeated computations can be avoided.

jac=’nlopt’

This flag is mostly for former NLopt users. It indicates that the objective and its gradient are supplied in vanilla NLopt style. NLopt requires another signature for the objective: f(x, grad) instead of f(x). The gradient given by grad is given by a NumPy array which must be replaced in-place. For the Rosenbrock example this looks like:

[3]:
def rosenbrock_nlopt_style(pos, grad):

    x, y = pos
    first_bracket = 1-x
    second_bracket = y-x*x

    obj = first_bracket*first_bracket+100*second_bracket*second_bracket

    if grad.size > 0:

        dx = -2*first_bracket-400*x*second_bracket
        dy = 200 * second_bracket
        grad[0] = dx
        grad[1] = dy

    return obj

%timeit -n 1000 res = simplenlopt.minimize(rosenbrock_nlopt_style, x0, jac = 'nlopt')
2.14 ms ± 13.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

The in-place replacement led to another small performance gain. Side note: while the if statement might seem weird and unnecessary, it is required for some of the optimizers, so you are on the safe side if you include it in your objective function .

jac = ‘3-point’/’2-point’

These flags tell simplenlopt which finite difference scheme to use. Finite differencing is borrowed from SciPy. Note that ‘2-point’ requires less function evaluations but is less precise and therefore more prone to cause optimization failures.

[4]:
%timeit -n 100 res = simplenlopt.minimize(rosenbrock, x0, jac = '3-point')
7.39 ms ± 245 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

This example shows that finite differences are not competitive against analytical gradients. For simple cases such as low dimensional curve fitting they are often still useful. If possible, automatic differentiation represents a powerful alternative.

Autodiff

In recent years, automatic differentiation (autodiff) has become one of the building blocks of machine learning. Many frameworks such as pytorch and tensorflow actually are centered around autodiff. Here, we will use the slightly older autograd package to automatigally compute the gradient for us and feed it to simplenlopt.

[5]:
import autograd.numpy as anp
from autograd import value_and_grad

rosen_and_grad = value_and_grad(rosenbrock)
%timeit -n 10 res = simplenlopt.minimize(rosen_and_grad, x0, jac = True)
17.3 ms ± 392 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

While autograd results in the worst performance for this example, autodiff shines when it comes to high dimensional problems where the inaccuracies of finite differences are much more severe. To circumvent autograd’s performance issues, another candidate could be for example autograd’s succesor jax which additionally provides just-in-time compilation.

Curve Fitting

NLopt does not contain dedicated algorithms for curve fitting but any nonlinear optimization algorithm can be used for that purpose. The curve_fit function is a simple convenience function to make curve fitting more straightforward. As for general local and global optimization, the API mimics SciPy to make testing different algorithms as simple as possible. Unlike SciPy, all optimization algorithms available in simplenlopt can be employed to perform the fitting.

Fitting a simple model

In this tutorial we will recreate SciPy’s curve fitting example.

The model to be fitted has to be supplied as f(x, param_1, param_2, ... param_n) where x denotes the independent variable and the other variables the model parameters we want to estimate. The three-parameter model to be fitted is given by an exponential decay plus an offset:

\[\begin{equation} f(x|a, b, c)=c + a\cdot e^{-b\cdot x} \end{equation}\]

Least squares fitting means that the sum of the squared differences between the predictions made by the model and the observed data is minimized. In this case:

\begin{equation} \underset{a, b, c}{\mathrm{argmin}}\sum_i \bigl(f(x_i|a, b, c)-y_i\bigl)^2 \end{equation}

First, we set ground truth parameters, generate a model prediction and add some gaussian noise.

[1]:
import numpy as np
import simplenlopt
import matplotlib.pyplot as plt
[2]:
def model(x, a, b, c):

    return a * np.exp(-b * x) + c

xdata = np.linspace(0, 4,50)
a_true = 2.
b_true = 0.8
c_true = 2.
y = model(xdata, a_true, b_true, c_true)
y_data = y + 0.1 * np.random.normal(size=xdata.size)

plt.scatter(xdata, y_data, c = 'k')
plt.show()
_images/Curve_Fitting_3_0.png

Now we can fit the model using the curve_fit function. Its calling signature is the same as SciPy’s: its inputs are at least a Python function which calculates the model prediction and both the independent variable xdata and the observed data ydata. In case of successsfull optimization it returns the estimated model parameters and the model parameter covariance matrix. The square root of the covariance matrix’s diagonal yields an estimate of the model parameter errors.

[3]:
params, cov_matrix = simplenlopt.curve_fit(model, xdata, y_data)
print("Estimated model parameters: ", params)

std_errors = np.sqrt(np.diag(cov_matrix))
print("Model parameter standard errors: ", std_errors)

plt.scatter(xdata, y_data, c = 'k')
plt.plot(xdata, model(xdata, *params), c='r')
plt.show()
Estimated model parameters:  [2.07999203 0.86567334 2.02589507]
Model parameter standard errors:  [0.0049512  0.00579616 0.00388337]
_images/Curve_Fitting_5_1.png

Unlike scipy, simplenlopt will use a gradient-free algorithm (BOBYQA) if no gradient information is supplied. Gradient information must be supplied as an N x m numpy array for N data points and m parameters. Each column of the gradient matrix must contain the partial derivative with respect to one parameter for all data points. To make this less confusing, the code below shows how to supply the gradient for this example. The partial derivatives are

\[\begin{split}\begin{align} \frac{\partial f}{\partial a} &=e^{-b\cdot x} \\ \frac{\partial f}{\partial b} &=-x\cdot a\cdot e^{-b\cdot x} \\ \frac{\partial f}{\partial c} &=1 \end{align}\end{split}\]
[4]:
def jac_model(x, a, b, c):

    da = np.exp(-b * x)
    db = -x * a * np.exp(-b * x)
    dc = np.full_like(x, 1.)

    return np.stack((da.ravel(), db.ravel(), dc), axis = 1)

params, cov_matrix = simplenlopt.curve_fit(model, xdata, y_data, jac = jac_model)
print("Estimated model parameters: ", params)
Estimated model parameters:  [2.07996431 0.86570146 2.02591902]

Troubleshooting difficult fits

Next, let’s try a more complicated example to show how the choice of optimizers and starting guesses affects the fitting results. We will add a gaussian peak to the model to make it more complex:

\begin{equation} f(x|a, b, c, d, e, f)=c + a\cdot e^{-b\cdot x} +d\cdot \exp\left(\frac{-(e-x)^2}{f^2}\right) \end{equation}

Again, let’s first generate some fake data.

[5]:
def complex_model(x, a, b, c, d, e, f):

    return a * np.exp(-b * x) + c + d * np.exp(-(e-x)**2/f**2)

d_true = 1.5
e_true = 3.
f_true = 0.5

true_params = np.array([a_true, b_true, c_true, d_true, e_true, f_true])
y_complex = complex_model(xdata, *true_params)
y_noisy = y_complex + 0.1 * np.random.normal(size=xdata.size)
plt.scatter(xdata, y_noisy, c = 'k')
[5]:
<matplotlib.collections.PathCollection at 0x29db60c77f0>
_images/Curve_Fitting_9_1.png
Initial guess

By default, the initial guess for all model parameters is 1. This guess is far away from the true parameters in this example. A starting point far away from the true parameters also makes it hard for the optimizer to find the true parameters. To illustrate this, let’s plot the model prediction for the initial guesses and the parameters found by the default optimizer, BOBYQA.

[6]:
initial_guesses = np.array([1., 1., 1., 1., 1., 1.])
optimized_params, cov_matrix = simplenlopt.curve_fit(complex_model, xdata, y_noisy)
plt.scatter(xdata, y_noisy, c = 'k')
plt.plot(xdata, complex_model(xdata, *initial_guesses), c='b', label='Initial guess')
plt.plot(xdata, complex_model(xdata, *optimized_params), c='r', label='Optimized')
plt.legend()
plt.show()
<ipython-input-5-57e43c5c3f19>:3: RuntimeWarning: divide by zero encountered in true_divide
  return a * np.exp(-b * x) + c + d * np.exp(-(e-x)**2/f**2)
_images/Curve_Fitting_11_1.png

Ugh, obviously a bad fit. Let’s try different starting guesses: we guess the peak position to be at 2.5 and leave the other guesses at 1. The initial guess is passed to the p0 parameter.

[7]:
initial_guesses=np.array([1., 1., 1., 1., 2.5, 1.])
optimized_params, cov_matrix = simplenlopt.curve_fit(complex_model, xdata, y_noisy, p0 = initial_guesses)
plt.scatter(xdata, y_noisy, c = 'k')
plt.plot(xdata, complex_model(xdata, *initial_guesses), c='b', label='Initial guess')
plt.plot(xdata, complex_model(xdata, *optimized_params), c='r', label='Optimized')
plt.legend()
plt.show()
<ipython-input-5-57e43c5c3f19>:3: RuntimeWarning: divide by zero encountered in true_divide
  return a * np.exp(-b * x) + c + d * np.exp(-(e-x)**2/f**2)
_images/Curve_Fitting_13_1.png

This looks much better! Keep in mind that it is always better to supply an initial estimate for the model parameters. If your parameters are far away from 1, fitting will often not work with a local optimizer.

Choice of the optimization algorithm

That being said, it is also always a good idea to test different optimizers. We will try to fit the model using the default guesses of 1 for all parameters using the Method of Moving Asymptotes (MMA) and compare it with the BOBYQA result.

[8]:
params_bobyqa, bobyqa_cov = simplenlopt.curve_fit(complex_model, xdata, y_noisy, method = 'BOBYQA')
params_mma, mma_cov = simplenlopt.curve_fit(complex_model, xdata, y_noisy, method = 'MMA')
plt.scatter(xdata, y_noisy, c = 'k')
plt.plot(xdata, complex_model(xdata, *params_bobyqa), c='b', label='BOBYQA')
plt.plot(xdata, complex_model(xdata, *params_mma), c='r', label='MMA')
plt.legend()
plt.show()
<ipython-input-5-57e43c5c3f19>:3: RuntimeWarning: divide by zero encountered in true_divide
  return a * np.exp(-b * x) + c + d * np.exp(-(e-x)**2/f**2)
C:\Users\danie\.conda\envs\simplenlopt_env\lib\site-packages\simplenlopt\_Core.py:211: RuntimeWarning: Using gradient-based optimization, but no gradient information is available. Gradient will be approximated by central difference. Consider using a derivative-free optimizer or supplying gradient information.
  warn('Using gradient-based optimization'
_images/Curve_Fitting_15_1.png

Clearly, MMA outperforms BOBYQA in this case. As always with nonlinear optimization though, this does not have to be true for all models.

In case of very difficult fits, using a global optimizer can be helpful. All global optimizers available in simplenlopt can be called in curve_fit. Global optimizers require bounds on the parameters: bounds have to be given in the form ([lower_bound, lower_bound, ...], [upper_bound, upper_bound, ...]) as a two-tuple which contains a list of lower bounds and a list of upper bounds. Note that this is different from the way bounds are passed to minimize or the global optimizers: this choice was mandated by the requirement to resemble SciPy’s curve fitting API. Here, we will use the Controlled Random Search algorithm (method='crs'):

[9]:
bounds = ([0, 0, 0, 0, 0, 0], [5, 5, 5, 5, 5, 5])
params_crs, crs_cov = simplenlopt.curve_fit(complex_model, xdata, y_noisy, method = 'crs', bounds = bounds)
plt.scatter(xdata, y_noisy, c = 'k')
plt.plot(xdata, complex_model(xdata, *params_crs), c='r', label='CRS')
plt.legend()
<ipython-input-5-57e43c5c3f19>:3: RuntimeWarning: divide by zero encountered in true_divide
  return a * np.exp(-b * x) + c + d * np.exp(-(e-x)**2/f**2)
[9]:
<matplotlib.legend.Legend at 0x29db63103a0>
_images/Curve_Fitting_17_2.png

Robust fits for data with outliers

Least-squares fitting is prone to outliers in the data. The influence of outliers can be reduced by choosing a more robust loss function. simplenlopt contains three loss functions indicated by the loss argument: squared, absolute and cauchy. For a model with \(m\) parameters they are defined as

\begin{align} \text{'squared':} & \sum_i (f(x_i|p_1, ... , p_m)-y_i)^2\\ \text{'absolute':} & \sum_i |f(x_i|p_1, ... , p_m)-y_i|\\ \text{'cauchy':} & \sum_i f_{scale}^2\cdot \ln\left(1 + \frac{f(x_i|p_1, ... , p_m)-y_i)^2}{f_{scale}^2}\right) \end{align}

suared yields the typical least squares fit. absolute error is typically much more robust to outliers but complicates the optimization problem. cauchy enables to finetune the effect of outliers by varying the f_scale parameter: the smaller it is, the more robust the fit becomes.

In the following, we will compare the three different losses for fitting the exponential decay with outliers.

[13]:
plt.rcParams['figure.figsize'] = [12, 12]
plt.rcParams['font.size']= 16
y_data[30] +=3.
y_data[40] +=1.5
y_data[48] +=3.

params_cauchy_tuned, pcov_cauchy_tuned = simplenlopt.curve_fit(model, xdata, y_data, method='mma', jac=jac_model, loss = 'cauchy', f_scale = 0.2)
params_cauchy, pcov_cauchy = simplenlopt.curve_fit(model, xdata, y_data, method='mma', jac=jac_model, loss = 'cauchy')
params_absolute, pcov_absolute = simplenlopt.curve_fit(model, xdata, y_data, method='mma', jac=jac_model, loss = 'absolute')
params_squared, pcov_squared = simplenlopt.curve_fit(model, xdata, y_data, method='mma', jac=jac_model, loss = 'squared')

plt.scatter(xdata, y_data, c='k', s = 8)
plt.plot(xdata, model(xdata, *params_cauchy), c='r', lw = 2, label = 'Cauchy loss')
plt.plot(xdata, model(xdata, *params_cauchy_tuned), c='c', lw = 2, ls='--', label = 'Cauchy loss: f_scale = 0.2')
plt.plot(xdata, model(xdata, *params_squared), c='b', lw = 2, label = 'Squared error loss')
plt.plot(xdata, model(xdata, *params_absolute), c='y', lw = 2, label = 'Absolute error loss')
plt.plot(xdata, model(xdata, 2, 0.8, 2), c='k', lw = 2, ls=':', label = 'True')

plt.legend()
plt.show()
_images/Curve_Fitting_19_0.png

Whereas the squared error estimate (blue curve) is very much off, the other loss functions do much better. In many cases, absolute loss (yellow line) already yields a good fit. If possible, Cauchy loss enables to finetune the fit by testing out different values for f_scale: in this case, setting f_scale=0.2 (dashed cyan line)results in a smaller error compared to the default value of 1 (red line).

API reference

class simplenlopt.OptimizeResult

Bases: dict

Represents the optimization result.

x

The solution of the optimization.

Type

ndarray

success

Whether or not the optimizer exited successfully.

Type

bool

message

Description of the cause of the termination.

Type

str

fun

Value of the objective function.

Type

float

nfev

Number of evaluations of the objective function.

Type

int

simplenlopt.auglag(fun, x0, args=(), method='auto', jac=None, bounds=None, constraints=(), penalize_inequalities=True, ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval='auto', maxtime=None, solver_options={})

Constrained local minimization via the augmented lagrangian method

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • x0 (ndarray) – Starting guess for the decision variable

  • args (tuple, optional, default ()) – Further arguments to describe the objective function

  • method (string, optional, default 'auto') –

    Optimization algorithm to use. Should be one of

    • ’lbfgs’: Limited-memory Broyden-Fletcher Goldfarb Shanno algorithm

    • ’slsqp’: Sequential least squares programming

    • ’mma’: Method of moving asymptotes

    • ’ccsaq’: conservative convex separable approximation

    • ’tnewton’: truncated Newton

    • ’tnewton_restart’: truncated Newton with restarting

    • ’tnewton_precond’: truncated Newton with preconditioning

    • ’tnewton_precond_restart’: truncated Newton with preconditioning and restarting

    • ’var1’: Shifted limited-memory variable metric with rank 1-method

    • ’var2’: Shifted limited-memory variable metric with rank 2-method

    • ’bobyqa’: Bounded optimization by quadratic approximation

    • ’cobyla’: Constrained optimization by linear approximation

    • ’neldermead’: Nelder-Mead optimization

    • ’sbplx’: Subplex algorithm

    • ’praxis’: Principal Axis algorithm

    • ’auto’

    See NLopt documentation for a detailed description of these methods.

    If ‘auto’, a suitable solver is chosen based on the availability of gradient information and if also inequalities should be penalized:

    jac != None and penalize_inequalities=True -> ‘lbfgs’

    jac != None and penalize_inequalities=False -> ‘mma’

    jac = None and penalize_inequalities=True -> ‘bobyqa’

    jac = None and penalize_inequalities=False -> ‘cobyla’

  • jac ({callable, '2-point', '3-point', 'NLOpt', bool}, optional, default None) –

    If callable, must be in the form jac(x, *args), where x is the argument. in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function.

    If ‘2-point’ will use forward difference to approximate the gradient.

    If ‘3-point’ will use central difference to approximate the gradient.

    If ‘NLOpt’, must be in the form jac(x, grad, *args), where x is the argument in the form of a 1-D array, grad a 1-D array containing the gradient and args is a tuple of any additional fixed parameters needed to completely specify the function.

  • bounds (tuple of array-like, optional, default None) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x).

  • constraints (list, optional, default ()) – List of constraint functions. Constraints must be of the form f(x) for a constraint of the form f(x) <= 0.

  • penalize_inequalities (bool, optional, default True) –

    If True, also penalizes violation of inequality constraints (NLopt code: AUGLAG).

    If False, only penalizes violation of equality constraints (NLopt code: AUGLAG_EQ).

    In this case the chosen method must be able to handle inequality constraints.

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (float, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval (int, optional, default None) – Number of maximal function evaluations.

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated.

  • solver_options (dict, optional, default None) – Dictionary of additional options supplied to the solver.

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Notes

References:

Andrew R. Conn, Nicholas I. M. Gould, and Philippe L. Toint, “A globally convergent augmented Lagrangian algorithm for optimization with general constraints and simple bounds,” SIAM J. Numer. Anal. vol. 28, no. 2, p. 545-572 (1991)

E. G. Birgin and J. M. Martínez, “Improving ultimate convergence of an augmented Lagrangian method,” Optimization Methods and Software vol. 23, no. 2, p. 177-195 (2008)

The augmented lagrangian transforms a constrained optimization problem into an unconstrained one by constructing penalty terms for violating the constraints.

simplenlopt.crs(fun, bounds, args=(), x0='random', population=None, ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval='auto', maxtime=None, solver_options={})

Global optimization via Controlled Random Search with local mutation

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • bounds (tuple of array-like) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x). len(bounds) is used to determine the number of parameters in x.

  • args (list, optional, default ()) – Further arguments to describe the objective function

  • x0 ({ndarray, 'random'}, optional, default 'random') – Initial parameter vector guess. If ndarray, must be a 1-D array if ‘random’, picks a random initial guess in the feasible region.

  • population (int, optional, default None) – Population size. If None, will use NLopt’s default population size.

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (float, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval ({int, 'auto'}, optional, default 'auto') – Number of maximal function evaluations. If ‘auto’, set to 1.000 * dimensions

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated.

  • solver_options (dict, optional, default None) – Dictionary of additional options supplied to the solver.

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Notes

References:

P. Kaelo and M. M. Ali, “Some variants of the controlled random search algorithm for global optimization,” J. Optim. Theory Appl. 130 (2), 253-264 (2006)

W. L. Price, “Global optimization by controlled random search,” J. Optim. Theory Appl. 40 (3), p. 333-348 (1983)

W. L. Price, “A controlled random search procedure for global optimization,” in Towards Global Optimization 2, p. 71-84 edited by L. C. W. Dixon and G. P. Szego (North-Holland Press, Amsterdam, 1978)

simplenlopt.curve_fit(f, xdata, ydata, p0=None, sigma=None, bounds=None, loss='squared', method='auto', jac=None, f_scale=1, **minimize_kwargs)

Curve fitting using NLopt’s local optimizers in SciPy style

Parameters
  • f (callable) – Must be of the form f(xdata, *params)

  • xdata (ndarray (n, )) – Predictor variable values

  • ydata (ndarray (n, )) – Response variable values

  • p0 (ndarray (n, ), optional) – If None, defaults to 1 for all parameters

  • sigma (ndarray (n, ), optional) – Typically uncertainties for each data point. The objective will be multiplied by 1/sigma

  • bounds (two-tuple of array-like, optional) – Determines the bounds on the fitting parameters ([lower_bounds], [upper_bounds])

  • loss (string, optional, default 'linear') –

    Should be one of

    • ’squared’: minimizes sum(residual**2)

    • ’absolute’: minimizes sum(abs(residual))

    • ’cauchy’: minimizes sum(f_scale**2 * ln(1 + residual**2/f_scale**2))

  • method (string or 'auto', optional, default 'auto') –

    Optimization algorithm to use.

    Local optimizers:

    • ’lbfgs’

    • ’slsqp’

    • ’mma’

    • ’ccsaq’

    • ’tnewton’

    • ’tnewton_restart’

    • ’tnewton_precond’

    • ’tnewton_precond_restart’

    • ’var1’

    • ’var2’

    • ’bobyqa’

    • ’cobyla’

    • ’neldermead’

    • ’sbplx’

    • ’praxis’

    Global optimizers require bounds!=None. Possible algorithms:

    • ’crs’

    • ’direct’

    • ’esch’

    • ’isres’

    • ’stogo’

    • ’mlsl’

    If ‘auto’, defaults to ‘slsqp’ if jac != None and ‘bobyqa’ if jac == None

  • jac (callable, optional) – Must be of the form jac(xdata) and return a N x m numpy array for N data points and m fitting parameters

Returns

  • popt (ndarray (m, )) – Array of best fit parameters.

  • pcov (ndarray (m, m)) – Approximated covariance matrix of the fit parameters. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)). In general, these are not very accurate estimates of the parameter uncertainty as the method relies on approximations.

Raises

ValueError – if either ydata or xdata contain NaNs, or if incompatible options are used.

simplenlopt.direct(fun, bounds, args=(), locally_biased=True, scale=True, randomize=False, ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval=None, maxtime=None, solver_options={})

Global optimization via variants of the DIviding RECTangles (DIRECT) algorithm

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • bounds (tuple of array-like) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x). len(bounds) is used to determine the number of parameters in x.

  • args (list, optional, default ()) – Further arguments to describe the objective function

  • locally_biased (boolean, optional, default True) – If True, uses the locally biased variant of DIRECT known as DIRECT_L

  • scale (boolean, optional, default True) – If True, scales the parameter space to a hypercube of length 1 in all dimensions

  • randomize (boolean, optional, default False) – If True, randomize the algorithm by partly randomizing which side of the hyperrectangle is halved

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (float, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval ({int, 'auto'}, optional, default 'auto') – Number of maximal function evaluations. If ‘auto’, set to 1.000 * dimensions

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated.

  • solver_options (dict, optional, default None) – Dictionary of additional options supplied to the solver.

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Notes

References:

D. R. Jones, C. D. Perttunen, and B. E. Stuckmann, “Lipschitzian optimization without the lipschitz constant,” J. Optimization Theory and Applications, vol. 79, p. 157 (1993)

J. M. Gablonsky and C. T. Kelley, “A locally-biased form of the DIRECT algorithm,” J. Global Optimization, vol. 21 (1), p. 27-37 (2001)

By default will use the locally biased variant with NLopt code “DIRECT_L”. For objective functions with many local minima, setting locally_biased=False which calls the DIRECT algorithm without local bias is advisable.

simplenlopt.esch(fun, bounds, args=(), x0='random', population=None, ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval='auto', maxtime=None, solver_options={})

Global optimization via Differential Evolution variant

Note

ESCH does not seem to respect the relative and absolute convergence criteria. By default, it will always run for the maximal number of iterations.

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • bounds (tuple of array-like) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x). len(bounds) is used to determine the number of parameters in x.

  • args (list, optional, default ()) – Further arguments to describe the objective function

  • x0 ({ndarray, 'random'}, optional, default 'random') –

    Initial parameter vector guess.

    If ndarray, must be a 1-D array.

    if ‘random’, picks a random initial guess in the feasible region.

  • population (int, optional, default None) – Population size. If None, will use NLopt’s default population size.

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (float, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval ({int, 'auto'}, optional, default 'auto') – Number of maximal function evaluations. If ‘auto’, set to 1.000 * dimensions

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated.

  • solver_options (dict, optional, default None) – Dictionary of additional options supplied to the solver.

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Notes

Reference: C. H. da Silva Santos, “Parallel and Bio-Inspired Computing Applied to Analyze Microwave and Photonic Metamaterial Strucutures,” Ph.D. thesis, University of Campinas, (2010)

simplenlopt.global_optimizers_info()

Prints a short summary of NLopt’s global optimizers

simplenlopt.isres(fun, bounds, args=(), constraints=[], x0='random', population=None, ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval='auto', maxtime=None, solver_options={})

Global optimization via the Improved Stochastic Ranking Evolution Strategy

Note

ISRES does not seem to respect the relative and absolute convergence criteria. By default, it will always run for the maximal number of iterations.

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • bounds (tuple of array-like) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x). len(bounds) is used to determine the number of parameters in x.

  • args (list, optional, default ()) – Further arguments to describe the objective function

  • constraints (list, optional, default ()) – List of constraint functions. Constraints must be of the form f(x) for a constraint of the form f(x) <= 0.

  • x0 ({ndarray, 'random'}, optional, default 'random') – Initial parameter vector guess. If ndarray, must be a 1-D array if ‘random’, picks a random initial guess in the feasible region.

  • population (int, optional, default None) – Population size. If None, will use NLopt’s default population size.

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (float, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval ({int, 'auto'}, optional, default 'auto') – Number of maximal function evaluations. If ‘auto’, set to 1.000 * dimensions

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated.

  • solver_options (dict, optional, default None) – Dictionary of additional options supplied to the solver.

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Notes

References:

Thomas Philip Runarsson and Xin Yao, “Search biases in constrained evolutionary optimization,” IEEE Trans. on Systems, Man, and Cybernetics Part C: Applications and Reviews, vol. 35 (no. 2), pp. 233-243 (2005)

Thomas P. Runarsson and Xin Yao, “Stochastic ranking for constrained evolutionary optimization,” IEEE Trans. Evolutionary Computation, vol. 4 (no. 3), pp. 284-294 (2000)

simplenlopt.local_optimizers_info()

Prints a short summary of NLopt’s local optimizers

simplenlopt.minimize(fun, x0, args=(), method='auto', jac=None, bounds=None, constraints=[], ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval='auto', maxtime=None, solver_options={})

Local minimization function for NLopt’s algorithm in SciPy style

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • x0 (ndarray) – Starting guess for the decision variable

  • args (tuple, optional, default ()) – Further arguments to describe the objective function

  • method (string, optional, default 'auto') –

    Optimization algorithm to use. Should be one of

    • ’lbfgs’: Limited-memory Broyden-Fletcher Goldfarb Shanno algorithm

    • ’slsqp’: Sequential least squares programming

    • ’mma’: Method of moving asymptotes

    • ’ccsaq’: conservative convex separable approximation

    • ’tnewton’: truncated Newton

    • ’tnewton_restart’: truncated Newton with restarting

    • ’tnewton_precond’: truncated Newton with preconditioning

    • ’tnewton_precond_restart’: truncated Newton with preconditioning and restarting

    • ’var1’: Shifted limited-memory variable metric with rank 1-method

    • ’var2’: Shifted limited-memory variable metric with rank 2-method

    • ’bobyqa’: Bounded optimization by quadratic approximation

    • ’cobyla’: Constrained optimization by linear approximation

    • ’neldermead’: Nelder-Mead optimization

    • ’sbplx’: Subplex algorithm

    • ’praxis’: Principal Axis algorithm

    • ’auto’

    See NLopt documentation for a detailed description of these methods.

    If ‘auto’, will be set to ‘bobyqa’/’cobyla’ if jac=None (‘cobyla’ if constraints are set) or ‘lbfgs’/’slsqp’ if jac != None (‘slsqp’ if constraints set).

    If the chosen method does not support the required constraints, the augmented lagrangian is called to handle them.

  • jac ({callable, '2-point', '3-point', 'NLOpt', bool}, optional, default None) –

    If callable, must be in the form jac(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function.

    If ‘2-point’ will use forward difference to approximate the gradient.

    If ‘3-point’ will use central difference to approximate the gradient.

    If ‘NLOpt’, must be in the form jac(x, grad, *args), where x is the argument in the form of a 1-D array, grad a 1-D array containing the gradient and args is a tuple of any additional fixed parameters needed to completely specify the function.

  • bounds (tuple of array-like) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x).

  • constraints (list, optional, default ()) – List of constraint functions. Constraints must be of the form f(x) for a constraint of the form f(x) <= 0.

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (float, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval (int, optional, default None) – Number of maximal function evaluations.

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated.

  • solver_options (dict, optional, default None) – Dictionary of additional options supplied to the solver.

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Examples

>>> import numpy as np
>>> from scipy.optimize import rosen, rosen_der
>>> x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
>>> res = minimize(rosen, x0, method='lbfgs', jac=rosen_der)
>>> res.success
True
>>> res.message
'Success'
>>> np.isclose(res.fun, 0)
True
>>> res.x
array([ 1.,  1.,  1.,  1.,  1.])
>>> res = minimize(rosen, x0, method='lbfgs', jac=rosen_der,
...                ftol_abs=1e-5)
>>> res.success
True
>>> res.message
'Optimization stopped because ftol_rel or ftol_abs (above) was reached.'
>>> res = minimize(rosen, x0, method='ld_lbfgs', jac=rosen_der, foo=3)
Traceback (most recent call last):
    ...
ValueError: Parameter foo could not be recognized.
>>> x0 = np.array([-1., 1.])
>>> fun = lambda x: - 2*x[0]*x[1] - 2*x[0] + x[0]**2 + 2*x[1]**2
>>> dfun = lambda x: np.array([2*x[0] - 2*x[1] - 2, - 2*x[0] + 4*x[1]])
>>> cons = [{'type': 'eq',
...           'fun': lambda x: x[0]**3 - x[1],
...           'jac': lambda x: np.array([3.*(x[0]**2.), -1.])},
...         {'type': 'ineq',
...           'fun': lambda x: x[1] - 1,
...           'jac': lambda x: np.array([0., 1.])}]
>>> res = minimize(fun, x0, jac=dfun, method='slsqp', constraints=cons)
>>> res.success
False
>>> res.message
'Halted because roundoff errors limited progress. (In this case, the optimization still typically returns a useful result.)'
>>> res.x.round(2)
array([ 0.84,  0.6 ])
simplenlopt.mlsl(fun, bounds, args=(), jac=None, x0='random', sobol_sampling=True, population=4, local_minimizer='auto', ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval='auto', maxtime=None, local_minimizer_options={}, solver_options={})

Global optimization via MultiLevel Single Linkage (MLSL)

Note

MLSL does not seem to respect the relative and absolute convergence criteria. By default, it will always run for the maximal number of iterations.

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • bounds (tuple of array-like) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x). len(bounds) is used to determine the number of parameters in x.

  • args (list, optional, default ()) – Further arguments to describe the objective function

  • jac ({callable, '2-point', '3-point', 'NLOpt', bool}, optional, default None) –

    If callable, must be in the form jac(x, *args), where x is the argument. in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function.

    If ‘2-point’ will use forward difference to approximate the gradient.

    If ‘3-point’ will use central difference to approximate the gradient.

    If ‘NLOpt’, must be in the form jac(x, grad, *args), where x is the argument in the form of a 1-D array, grad a 1-D array containing the gradient and args is a tuple of any additional fixed parameters needed to completely specify the function.

  • x0 ({ndarray, 'random'}, optional, default 'random') – Initial parameter vector guess. If ndarray, must be a 1-D array if ‘random’, picks a random initial guess in the feasible region.

  • sobol_sampling (bool, optional, default True) – If True, starting points for local minimizations are sampled from a Sobol sequence.

  • population (int, optional, default 4) – Number of local searches per iteration.

  • local_minimizer (string, optional, default 'auto') –

    Local Optimization algorithm to use. If string, Should be one of

    • ’lbfgs’: Limited-memory Broyden-Fletcher Goldfarb Shanno algorithm

    • ’slsqp’: Sequential least squares programming

    • ’mma’: Method of moving asymptotes

    • ’ccsaq’: conservative convex separable approximation

    • ’tnewton’: truncated Newton

    • ’tnewton_restart’: truncated Newton with restarting

    • ’tnewton_precond’: truncated Newton with preconditioning

    • ’tnewton_precond_restart’: truncated Newton with preconditioning and restarting

    • ’var1’: Shifted limited-memory variable metric with rank 1-method

    • ’var2’: Shifted limited-memory variable metric with rank 2-method

    • ’bobyqa’: Bounded optimization by quadratic approximation

    • ’cobyla’: Constrained optimization by linear approximation

    • ’neldermead’: Nelder-Mead optimization

    • ’sbplx’: Subplex algorithm

    • ’praxis’: Principal Axis algorithm

    • ’auto’

    See NLopt documentation for a detailed description of these methods.

    If ‘auto’, will default to “lbfgs” if jac!= None and “bobyqa” if jac=None

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (1e-8, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval ({int, 'auto'}, optional, default 'auto') – Number of maximal function evaluations. If ‘auto’, set to 1.000 * dimensions for gradient based local optimizer, and 10.000 * problem dimensional for gradient free local optimizer

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated

  • local_mimimizer_options (dict) – Further options supplied to the local minimizer

  • solver_options (dict) – Further options supplied to the global MLSL minimizer

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Notes

References:

A. H. G. Rinnooy Kan and G. T. Timmer, “Stochastic global optimization methods,” Mathematical Programming, vol. 39, p. 27-78 (1987)

Sergei Kucherenko and Yury Sytsko, “Application of deterministic low-discrepancy sequences in global optimization,” Computational Optimization and Applications, vol. 30, p. 297-318 (2005)

simplenlopt.stogo(fun, bounds, args=(), jac=None, x0='random', randomize=False, ftol_rel=1e-08, xtol_rel=1e-06, ftol_abs=1e-14, xtol_abs=1e-08, maxeval='auto', maxtime=None, solver_options={})

Global optimization via STOchastic Global Optimization (STOGO)

Note

STOGO does not seem to respect the relative and absolute convergence criteria. By default, it will always run for the maximal number of iterations.

Parameters
  • fun (callable) – The objective function to be minimized. Must be in the form fun(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function

  • bounds (tuple of array-like) – Bounds for variables. (min, max) pairs for each element in x, defining the finite lower and upper bounds for the optimizing argument of fun. It is required to have len(bounds) == len(x). len(bounds) is used to determine the number of parameters in x.

  • args (list, optional, default ()) – Further arguments to describe the objective function

  • jac ({callable, '2-point', '3-point', 'NLOpt', bool}, optional, default None) –

    If callable, must be in the form jac(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function.

    If ‘2-point’ will use forward difference to approximate the gradient.

    If ‘3-point’ will use central difference to approximate the gradient.

    If ‘NLOpt’, must be in the form jac(x, grad, *args), where x is the argument in the form of a 1-D array, grad a 1-D array containing the gradient and args is a tuple of any additional fixed parameters needed to completely specify the function.

  • x0 ({ndarray, 'random'}, optional, default 'random') –

    Initial parameter vector guess.

    If ndarray, must be a 1-D array.

    If ‘random’, picks a random initial guess in the feasible region.

  • randomize (bool, optional, default False) – If True, randomizes the branching process

  • ftol_rel (float, optional, default 1e-8) – Relative function tolerance to signal convergence

  • xtol_rel (float, optional, default 1e-6) – Relative parameter vector tolerance to signal convergence

  • ftol_abs (float, optional, default 1e-14) – Absolute function tolerance to signal convergence

  • xtol_abs (float, optional, default 1e-8) – Absolute parameter vector tolerance to signal convergence

  • maxeval ({int, 'auto'}, optional, default 'auto') – Number of maximal function evaluations. If ‘auto’, set to 1.000 * dimensions

  • maxtime (float, optional, default None) – maximum absolute time until the optimization is terminated.

  • solver_options (dict, optional, default None) – Dictionary of additional options supplied to the solver.

Returns

result – The optimization result represented as a OptimizeResult object. Important attributes are: x the solution array, fun the value of the function at the solution, and message which describes the cause of the termination. See OptimizeResult for a description of other attributes.

Return type

OptimizeResult

Notes

References:

S. Zertchaninov and K. Madsen, “A C++ Programme for Global Optimization,” IMM-REP-1998-04, Department of Mathematical Modelling, Technical University of Denmark, DK-2800 Lyngby, Denmark, 1998

S. Gudmundsson, “Parallel Global Optimization,” M.Sc. Thesis, IMM, Technical University of Denmark, 1998

Indices and tables