Algorithms

This section contains all main algorithms of the package. These are the ones typical users will call.

The typical signature for these algorithms is:

my_algorithm(f, grad!, lmo, x0)

Standard algorithms

FrankWolfe.frank_wolfeMethod
frank_wolfe(f, grad!, lmo, x0; ...)

Simplest form of the Frank-Wolfe algorithm. Returns a tuple (x, v, primal, dual_gap, traj_data) with:

  • x final iterate
  • v last vertex from the LMO
  • primal primal value f(x)
  • dual_gap final Frank-Wolfe gap
  • traj_data vector of trajectory information.
source
FrankWolfe.stochastic_frank_wolfeMethod
stochastic_frank_wolfe(f::StochasticObjective, lmo, x0; ...)

Stochastic version of Frank-Wolfe, evaluates the objective and gradient stochastically, implemented through the FrankWolfe.StochasticObjective interface.

Keyword arguments include batch_size to pass a fixed batch_size or a batch_iterator implementing batch_size = FrankWolfe.batchsize_iterate(batch_iterator) for algorithms like Variance-reduced and projection-free stochastic optimization, E Hazan, H Luo, 2016.

Similarly, a constant momentum can be passed or replaced by a momentum_iterator implementing momentum = FrankWolfe.momentum_iterate(momentum_iterator).

source
FrankWolfe.block_coordinate_frank_wolfeFunction
block_coordinate_frank_wolfe(f, grad!, lmo::ProductLMO{N}, x0; ...) where {N}

Block-coordinate version of the Frank-Wolfe algorithm. Minimizes objective f over the product of feasible domains specified by the lmo. The optional argument the update_order is of type FrankWolfe.BlockCoordinateUpdateOrder and controls the order in which the blocks are updated. The argument update_step is a single instance or tuple of FrankWolfe.UpdateStep and defines which FW-algorithms to use to update the iterates in the different blocks.

The method returns a tuple (x, v, primal, dual_gap, traj_data) with:

  • x cartesian product of final iterates
  • v cartesian product of last vertices of the LMOs
  • primal primal value f(x)
  • dual_gap final Frank-Wolfe gap
  • traj_data vector of trajectory information.

See S. Lacoste-Julien, M. Jaggi, M. Schmidt, and P. Pletscher 2013 and A. Beck, E. Pauwels and S. Sabach 2015 for more details about Block-Coordinate Frank-Wolfe.

source

Active-set based methods

The following algorithms maintain the representation of the iterates as a convex combination of vertices.

Away-step

Pairwise Frank-Wolfe

Blended Conditional Gradient

FrankWolfe.blended_conditional_gradientMethod
blended_conditional_gradient(f, grad!, lmo, x0)

Entry point for the Blended Conditional Gradient algorithm. See Braun, Gábor, et al. "Blended conditonal gradients" ICML 2019. The method works on an active set like FrankWolfe.away_frank_wolfe, performing gradient descent over the convex hull of active vertices, removing vertices when their weight drops to 0 and adding new vertices by calling the linear oracle in a lazy fashion.

source
FrankWolfe.build_reduced_problemMethod
build_reduced_problem(atoms::AbstractVector{<:AbstractVector}, hessian, weights, gradient, tolerance)

Given an active set formed by vectors , a (constant) Hessian and a gradient constructs a quadratic problem over the unit probability simplex that is equivalent to minimizing the original function over the convex hull of the active set. If λ are the barycentric coordinates of dimension equal to the cardinality of the active set, the objective function is:

f(λ) = reduced_linear^T λ + 0.5 * λ^T reduced_hessian λ

In the case where we find that the current iterate has a strong-Wolfe gap over the convex hull of the active set that is below the tolerance we return nothing (as there is nothing to do).

source
FrankWolfe.lp_separation_oracleMethod

Returns either a tuple (y, val) with y an atom from the active set satisfying the progress criterion and val the corresponding gap dot(y, direction) or the same tuple with y from the LMO.

inplace_loop controls whether the iterate type allows in-place writes. kwargs are passed on to the LMO oracle.

source
FrankWolfe.minimize_over_convex_hull!Method
minimize_over_convex_hull!

Given a function f with gradient grad! and an active set active_set this function will minimize the function over the convex hull of the active set until the strong-wolfe gap over the active set is below tolerance.

It will either directly minimize over the convex hull using simplex gradient descent, or it will transform the problem to barycentric coordinates and minimize over the unit probability simplex using gradient descent or Nesterov's accelerated gradient descent.

source
FrankWolfe.simplex_gradient_descent_over_convex_hullMethod
simplex_gradient_descent_over_convex_hull(f, grad!, gradient, active_set, tolerance, t, time_start, non_simplex_iter)

Minimizes an objective function over the convex hull of the active set until the Strong-Wolfe gap is below tolerance using simplex gradient descent.

source

Blended Pairwise Conditional Gradient

Corrective Frank-Wolfe

FrankWolfe.corrective_frank_wolfeMethod
corrective_frank_wolfe(f, grad!, lmo, corrective_step, active_set::AS; kwargs...)

A corrective Frank-Wolfe variant with corrective step defined by corrective_step.

A corrective FW algorithm alternates between a standard FW step at which a vertex is added to the active set and a corrective step at which an update is performed in the convex hull of current vertices. Examples of corrective FW algorithms include blended (pairwise) conditional gradients, away-step Frank-Wolfe, and fully-corrective Frank-Wolfe.

source
FrankWolfe.HybridPairAwayStepType

Compares a pairwise and away step and chooses the one with most progress. The line search is computed for both steps. If one step incurs a drop, it is favored, otherwise the one decreasing the primal value the most is favored.

source
FrankWolfe.prepare_corrective_stepFunction
prepare_corrective_step(corrective_step::CS, f, grad!, gradient, active_set, t, lmo, primal, phi, tot_time) -> (should_compute_vertex, use_corrective)

should_compute_vertex is a boolean flag deciding whether a new vertex should be computed. use_corrective is a function that takes the vertex (the vertex is a valid new vertex only if shouldcomputevertex was true)

source
FrankWolfe.run_corrective_stepFunction
run_corrective_step(corrective_step, f, grad!, gradient, x, active_set, t, lmo, line_search, linesearch_workspace, primal, phi, tot_time, callback, renorm_interval) -> (x, phi, primal)

Corrective step method specific to the CS correctivestep type. The corrective step can perform whatever update over the current active set, the function should return the new iterate a FW step should be run next with the boolean `shouldfw_stepand compute a new dual gap estimatephi`.

source

Alternating Methods

Problems over intersections of convex sets, i.e.

\[\min_{x \in \bigcap_{i=1}^n P_i} f(x),\]

pose a challenge as one has to combine the information of two or more LMOs.

FrankWolfe.alternating_linear_minimization converts the problem into a series of subproblems over single sets. To find a point within the intersection, one minimizes both the distance to the iterates of the other subproblems and the original objective function.

FrankWolfe.alternating_projections solves feasibility problems over intersections of feasible regions.

FrankWolfe.alternating_linear_minimizationMethod
alternating_linear_minimization(bc_algo::BlockCoordinateMethod, f, grad!, lmos::NTuple{N,LinearMinimizationOracle}, x0; ...) where {N}

Alternating Linear Minimization minimizes the objective f over the intersections of the feasible domains specified by lmos. The tuple x0 defines the initial points for each domain. Returns a tuple (x, v, primal, dual_gap, dist2, traj_data) with:

  • x cartesian product of final iterates
  • v cartesian product of last vertices of the LMOs
  • primal primal value f(x)
  • dual_gap final Frank-Wolfe gap
  • dist2 is 1/2 of the sum of squared, pairwise distances between iterates
  • traj_data vector of trajectory information.
source
FrankWolfe.alternating_projectionsMethod
alternating_projections(lmos::NTuple{N,LinearMinimizationOracle}, x0; ...) where {N}

Computes a point in the intersection of feasible domains specified by lmos. Returns a tuple (x, v, dual_gap, dist2, traj_data) with:

  • x cartesian product of final iterates
  • v cartesian product of last vertices of the LMOs
  • dual_gap final Frank-Wolfe gap
  • dist2 is 1/2 * sum of squared, pairwise distances between iterates
  • traj_data vector of trajectory information.
source

Index