Line search and step size settings

The step size dictates how far one traverses along a local descent direction. More specifically, the step size $\gamma_t$ is used at each iteration to determine how much the next iterate moves towards the new vertex:

\[x_{t+1} = x_t - \gamma_t (x_t - v_t).\]

A step size $\gamma_t = 1$ means that the next iterate is exactly the vertex $v_t$, while $\gamma_t = 0$ means that the iterate does not move; equivalently, the update is the convex combination $x_{t+1} = (1 - \gamma_t) x_t + \gamma_t v_t$.

The following are the step size selection rules for Frank-Wolfe algorithms. Some strategies (e.g. FixedStep and Agnostic) depend only on the iteration number and yield step size sequences $\gamma_t$ that are independent of the problem data, while others (e.g. GoldenSearch and Adaptive) adapt to local information about the function; the adaptive methods often require additional function and/or gradient evaluations. The typical choices for convex optimization are Agnostic or Adaptive.
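For illustration, here is a hedged usage sketch showing how a step size strategy is selected through the line_search keyword of frank_wolfe; the quadratic objective, oracle, and starting point are chosen only for this example.

```julia
using FrankWolfe, LinearAlgebra

# Example problem (illustrative only): minimize ||x - p||^2 over the
# probability simplex.
n = 100
p = rand(n)
f(x) = norm(x - p)^2
grad!(storage, x) = (storage .= 2 .* (x .- p))

lmo = FrankWolfe.ProbabilitySimplexOracle(1.0)
x0 = FrankWolfe.compute_extreme_point(lmo, zeros(n))

# The step size strategy is passed via the line_search keyword.
result = FrankWolfe.frank_wolfe(
    f, grad!, lmo, x0;
    line_search=FrankWolfe.Adaptive(),  # or FrankWolfe.Agnostic(), etc.
    max_iteration=1_000,
)
```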

All step size computation strategies are subtypes of FrankWolfe.LineSearchMethod. The key method they have to implement is FrankWolfe.perform_line_search, which is called at every iteration to compute the step size gamma.
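As a hedged sketch of this interface (the HarmonicStep name is hypothetical, not part of the library), a custom strategy only needs to subtype LineSearchMethod and implement perform_line_search with the signature documented below:

```julia
using FrankWolfe

# Hypothetical custom strategy: the classic 2/(t+2) open-loop rule,
# clipped to the maximum feasible step gamma_max.
struct HarmonicStep <: FrankWolfe.LineSearchMethod end

function FrankWolfe.perform_line_search(
    ::HarmonicStep, t, f, grad!, gradient, x, d, gamma_max, workspace,
)
    return min(2 / (t + 2), gamma_max)
end
```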

FrankWolfe.LineSearchMethod - Type

Line search method to apply once the direction is computed. A LineSearchMethod must implement

perform_line_search(ls::LineSearchMethod, t, f, grad!, gradient, x, d, gamma_max, workspace)

with d = x - v. It may also implement build_linesearch_workspace(x, gradient), which creates a workspace structure passed as the last argument to perform_line_search.

FrankWolfe.perform_line_search - Function
perform_line_search(ls::LineSearchMethod, t, f, grad!, gradient, x, d, gamma_max, workspace)

Returns the step size gamma for step size strategy ls.

FrankWolfe.Adaptive - Type

Modified adaptive line search test from:

S. Pokutta "The Frank-Wolfe algorith: a short introduction" (2023), preprint, https://arxiv.org/abs/2311.05313

It replaces the original test implemented in the AdaptiveZerothOrder line search based on:

Pedregosa, F., Negiar, G., Askari, A., and Jaggi, M. (2020). "Linearly convergent Frank–Wolfe with backtracking line-search", Proceedings of AISTATS.

FrankWolfe.AdaptiveZerothOrder - Type

Slight modification of the adaptive step size strategy from Pedregosa, Negiar, Askari, Jaggi (2018):

\[ f(x_t - \gamma_t (x_t - v_t)) - f(x_t) \leq - \alpha \gamma_t \langle \nabla f(x_t), x_t - v_t \rangle + \alpha^2 \frac{\gamma_t^2 \|x_t - v_t\|^2}{2} M ~,\]

consistent with the update rule $x_{t+1} = x_t - \gamma_t (x_t - v_t)$ above.

The parameter alpha ∈ (0,1] relaxes the original smoothness condition to mitigate issues with numerical errors. Its default value is 0.5. The Adaptive struct keeps track of the Lipschitz constant estimate L_est. The keyword argument relaxed_smoothness allows testing with an alternative smoothness condition,

\[ \langle \nabla f(x_t) - \nabla f(x_t - \gamma_t (x_t - v_t)), x_t - v_t \rangle \leq \gamma_t M \|x_t - v_t\|^2 ~.\]

This condition yields potentially smaller and more stable estimates of the Lipschitz constant, at a higher computational cost due to the additional gradient evaluation.

It is also the fallback when the Lipschitz constant estimation fails due to numerical errors. perform_line_search also has a should_upgrade keyword argument controlling whether the computation is temporarily upgraded to BigFloat for extended precision.
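As a minimal sketch of the backtracking mechanism (assuming the zeroth-order test above; the function and variable names are illustrative, not the library's internals), one doubles the estimate M until the sufficient-decrease test holds, then takes the corresponding short step:

```julia
using LinearAlgebra

# Illustrative sketch, not the library implementation: double the
# Lipschitz estimate M until the sufficient-decrease test above holds,
# then return the clipped short step.
function adaptive_gamma(f, grad_x, x, d, M, gamma_max; alpha=0.5)
    dual_gap = dot(grad_x, d)   # ⟨∇f(x_t), x_t - v_t⟩
    ndsq = dot(d, d)            # ‖x_t - v_t‖²
    gamma = min(dual_gap / (M * ndsq), gamma_max)
    while f(x - gamma * d) - f(x) >
          -alpha * gamma * dual_gap + alpha^2 * gamma^2 * ndsq * M / 2
        M *= 2                  # estimate too small: double it
        gamma = min(dual_gap / (M * ndsq), gamma_max)
    end
    return gamma, M
end
```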

FrankWolfe.Agnostic - Type

Computes the step size gamma_t = l / (t + l) at iteration t, for a given l > 0.

Using l > 2 leads to faster convergence rates than l = 2 over strongly convex and some uniformly convex sets.

Accelerated Affine-Invariant Convergence Rates of the Frank-Wolfe Algorithm with Open-Loop Step-Sizes, Wirth, Peña, Pokutta (2023), https://arxiv.org/abs/2310.04096

See also the paper that introduced the study of open-loop step-sizes with l > 2:

Acceleration of Frank-Wolfe Algorithms with Open-Loop Step-Sizes, Wirth, Kerdreux, Pokutta (2023), https://arxiv.org/abs/2205.12838

Fixing l = -1 results in the step size gamma_t = (2 + log(t+1)) / (t + 2 + log(t+1)) studied in:

S. Pokutta "The Frank-Wolfe algorith: a short introduction" (2023), https://arxiv.org/abs/2311.05313

FrankWolfe.GeneralizedAgnostic - Type

Computes the step size gamma_t = g(t) / (t + g(t)) at iteration t, for a given function g: R_{>=0} -> R_{>=0}.

Defaults to the best open-loop step-size gamma_t = (2 + log(t+1)) / (t + 2 + log(t+1))

S. Pokutta "The Frank-Wolfe algorith: a short introduction" (2023), https://arxiv.org/abs/2311.05313

This step-size is as fast as the step-size gamma_t = 2 / (t + 2) up to polylogarithmic factors. Furthermore, over strongly convex and some uniformly convex sets, it is faster than any traditional step-size gamma_t = l / (t + l) for fixed l in N.
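A hedged illustration of the generalized rule with the default g stated above (plain Julia, not the library constructor):

```julia
# Illustrative only: generalized open-loop rule with the default g.
g(t) = 2 + log(t + 1)
gamma_generalized(t) = g(t) / (t + g(t))

gamma_generalized(0)  # == 1.0, matching the log step size above at t = 0
```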

FrankWolfe.MonotonicNonConvexStepSize - Type
MonotonicNonConvexStepSize{F}

Represents a monotonic open-loop step size for non-convex objectives. Contains a halving factor N that is increased at each iteration until there is primal progress, resulting in gamma = 2^(-N) / sqrt(t + 1).

FrankWolfe.MonotonicStepSize - Type
MonotonicStepSize{F}

Represents a monotonic open-loop step size. Contains a halving factor N that is increased at each iteration until there is primal progress, resulting in gamma = 2^(-N) * 2 / (t + 2).
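A minimal sketch of the halving mechanism shared by both monotonic rules above (illustrative names, assuming the 2/(t+2) base sequence; not the library's internals):

```julia
# Illustrative only: halve the base step until the objective decreases.
# N persists across iterations, which makes the step size monotonic.
function monotonic_gamma(f, x, d, t, N; max_halvings=64)
    gamma = 2.0^(-N) * 2 / (t + 2)
    while f(x - gamma * d) > f(x) && N < max_halvings
        N += 1
        gamma /= 2
    end
    return gamma, N
end
```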

FrankWolfe.Shortstep - Type

Computes the 'Short step' step size: dual_gap / (L * norm(x - v)^2), where L is the Lipschitz constant of the gradient, x is the current iterate, and v is the current Frank-Wolfe vertex.
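A one-line hedged sketch of this rule (illustrative name; the clipping to [0, gamma_max] is an assumption consistent with the feasible step range):

```julia
using LinearAlgebra

# Illustrative only: short step, clipped to the feasible range.
shortstep_gamma(dual_gap, d, L, gamma_max) =
    clamp(dual_gap / (L * dot(d, d)), 0.0, gamma_max)
```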


See Pedregosa, Negiar, Askari, Jaggi (2020) for the adaptive step size and Carderera, Besançon, Pokutta (2021) for the monotonic step size.
