Algorithms

This section contains all main algorithms of the package. These are the ones typical users will call.

The typical signature for these algorithms is:

my_algorithm(f, grad!, lmo, x0)

Standard algorithms

FrankWolfe.frank_wolfe — Method

frank_wolfe(f, grad!, lmo, x0; kwargs...)

Simplest form of the Frank-Wolfe algorithm.

Common arguments

These positional arguments are common to most Frank-Wolfe variants:

f: a function f(x) computing the value of the objective to minimize at point x
grad!: a function grad!(g, x) overwriting g with the gradient of f at point x
lmo: a linear minimization oracle, subtyping LinearMinimizationOracle
x0: a starting point for the optimization (will be modified in-place for frank_wolfe with InplaceEmphasis)

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

The current variant may have additional keyword arguments, documented elsewhere, or it may only use a subset of the ones listed below. The default values of these arguments may also vary between variants, and thus are not part of the public API.

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

Returns a named tuple (; x, v, primal, dual_gap, status, traj_data) with:

x: the final iterate
v: the last vertex from the linear minimization oracle
primal: the final primal value f(x)
dual_gap: the final Frank-Wolfe gap
status: the ExecutionStatus with which the algorithm terminated
traj_data: a vector of trajectory information, each element being the output of callback_state.

source

FrankWolfe.lazified_conditional_gradient — Method

lazified_conditional_gradient(f, grad!, lmo_base, x0; kwargs...)

Similar to FrankWolfe.frank_wolfe but lazyfying the LMO: each call is stored in a cache, which is looked up first for a good-enough direction. The cache used is a FrankWolfe.MultiCacheLMO or a FrankWolfe.VectorCacheLMO depending on whether the provided cache_size option is finite.

Common arguments

These positional arguments are common to most Frank-Wolfe variants:

f: a function f(x) computing the value of the objective to minimize at point x
grad!: a function grad!(g, x) overwriting g with the gradient of f at point x
lmo: a linear minimization oracle, subtyping LinearMinimizationOracle
x0: a starting point for the optimization (will be modified in-place for frank_wolfe with InplaceEmphasis)

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

Returns a named tuple (; x, v, primal, dual_gap, status, traj_data) with:

x: the final iterate
v: the last vertex from the linear minimization oracle
primal: the final primal value f(x)
dual_gap: the final Frank-Wolfe gap
status: the ExecutionStatus with which the algorithm terminated
traj_data: a vector of trajectory information, each element being the output of callback_state.

source

FrankWolfe.block_coordinate_frank_wolfe — Function

block_coordinate_frank_wolfe(f, grad!, lmo::ProductLMO{N}, x0; kwargs...) where {N}

Block-coordinate version of the Frank-Wolfe algorithm. Minimizes objective f over the product of feasible domains specified by the lmo. The optional argument the update_order is of type FrankWolfe.BlockCoordinateUpdateOrder and controls the order in which the blocks are updated. The argument update_step is a single instance or tuple of FrankWolfe.UpdateStep and defines which FW-algorithms to use to update the iterates in the different blocks.

See S. Lacoste-Julien, M. Jaggi, M. Schmidt, and P. Pletscher 2013 and A. Beck, E. Pauwels and S. Sabach 2015 for more details about Block-Coordinate Frank-Wolfe.

Common arguments

These positional arguments are common to most Frank-Wolfe variants:

f: a function f(x) computing the value of the objective to minimize at point x
grad!: a function grad!(g, x) overwriting g with the gradient of f at point x
lmo: a linear minimization oracle, subtyping LinearMinimizationOracle
x0: a starting point for the optimization (will be modified in-place for frank_wolfe with InplaceEmphasis)

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

The method returns a tuple (x, v, primal, dual_gap, traj_data) with:

x cartesian product of final iterates
v cartesian product of last vertices of the LMOs
primal primal value f(x)
dual_gap final Frank-Wolfe gap
status the termination status
traj_data vector of trajectory information.

source

Active-set based methods

The following algorithms maintain the representation of the iterates as a convex combination of vertices.

Away-step

FrankWolfe.away_frank_wolfe — Method

away_frank_wolfe(f, grad!, lmo, x0; kwargs...)

Frank-Wolfe with away steps. The algorithm maintains the current iterate as a convex combination of vertices in the FrankWolfe.ActiveSet data structure. See M. Besançon, A. Carderera and S. Pokutta 2021 for illustrations of away steps.

Common arguments

These positional arguments are common to most Frank-Wolfe variants:

f: a function f(x) computing the value of the objective to minimize at point x
grad!: a function grad!(g, x) overwriting g with the gradient of f at point x
lmo: a linear minimization oracle, subtyping LinearMinimizationOracle
x0: a starting point for the optimization (will be modified in-place for frank_wolfe with InplaceEmphasis)

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

Returns a named tuple (; x, v, primal, dual_gap, status, traj_data, active_set) with:

x: the final iterate
v: the last vertex from the linear minimization oracle
primal: the final primal value f(x)
dual_gap: the final Frank-Wolfe gap
status: the ExecutionStatus with which the algorithm terminated
traj_data: a vector of trajectory information, each element being the output of callback_state.
active_set: the computed active set of vertices, of which the solution is a convex combination

source

Pairwise Frank-Wolfe

FrankWolfe.blended_pairwise_conditional_gradient — Method

blended_pairwise_conditional_gradient(f, grad!, lmo, x0; kwargs...)

Implements the BPCG algorithm from Tsuji, Tanaka, Pokutta (2021). The method uses an active set of current vertices. Unlike away-step, it transfers weight from an away vertex to another vertex of the active set.

Common arguments

These positional arguments are common to most Frank-Wolfe variants:

f: a function f(x) computing the value of the objective to minimize at point x
grad!: a function grad!(g, x) overwriting g with the gradient of f at point x
lmo: a linear minimization oracle, subtyping LinearMinimizationOracle
x0: a starting point for the optimization (will be modified in-place for frank_wolfe with InplaceEmphasis)

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

Returns a named tuple (; x, v, primal, dual_gap, status, traj_data, active_set) with:

x: the final iterate
v: the last vertex from the linear minimization oracle
primal: the final primal value f(x)
dual_gap: the final Frank-Wolfe gap
status: the ExecutionStatus with which the algorithm terminated
traj_data: a vector of trajectory information, each element being the output of callback_state.
active_set: the computed active set of vertices, of which the solution is a convex combination

source

FrankWolfe.blended_pairwise_conditional_gradient — Method

blended_pairwise_conditional_gradient(f, grad!, lmo, active_set::AbstractActiveSet; kwargs...)

Warm-starts BPCG with a pre-defined active_set.

source

FrankWolfe.pairwise_frank_wolfe — Method

pairwise_frank_wolfe(f, grad!, lmo, x0; kwargs...)

Frank-Wolfe with pairwise steps. The algorithm maintains the current iterate as a convex combination of vertices in the FrankWolfe.ActiveSet data structure. See M. Besançon, A. Carderera and S. Pokutta 2021 for illustrations of away steps. Unlike away-step, it transfers weight from an away vertex to another vertex.

Common arguments

These positional arguments are common to most Frank-Wolfe variants:

f: a function f(x) computing the value of the objective to minimize at point x
grad!: a function grad!(g, x) overwriting g with the gradient of f at point x
lmo: a linear minimization oracle, subtyping LinearMinimizationOracle
x0: a starting point for the optimization (will be modified in-place for frank_wolfe with InplaceEmphasis)

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

Returns a named tuple (; x, v, primal, dual_gap, status, traj_data, active_set) with:

x: the final iterate
v: the last vertex from the linear minimization oracle
primal: the final primal value f(x)
dual_gap: the final Frank-Wolfe gap
status: the ExecutionStatus with which the algorithm terminated
traj_data: a vector of trajectory information, each element being the output of callback_state.
active_set: the computed active set of vertices, of which the solution is a convex combination

source

Blended Conditional Gradient

FrankWolfe.accelerated_simplex_gradient_descent_over_probability_simplex — Method

accelerated_simplex_gradient_descent_over_probability_simplex

Minimizes an objective function over the unit probability simplex until the Strong-Wolfe gap is below tolerance using Nesterov's accelerated gradient descent.

source

FrankWolfe.blended_conditional_gradient — Method

blended_conditional_gradient(f, grad!, lmo, x0; kwargs...)

Entry point for the Blended Conditional Gradient algorithm. See Braun, Gábor, et al. "Blended conditonal gradients" ICML 2019. The method works on an active set like FrankWolfe.away_frank_wolfe, performing gradient descent over the convex hull of active vertices, removing vertices when their weight drops to 0 and adding new vertices by calling the linear oracle in a lazy fashion.

Common arguments

These positional arguments are common to most Frank-Wolfe variants:

f: a function f(x) computing the value of the objective to minimize at point x
grad!: a function grad!(g, x) overwriting g with the gradient of f at point x
lmo: a linear minimization oracle, subtyping LinearMinimizationOracle
x0: a starting point for the optimization (will be modified in-place for frank_wolfe with InplaceEmphasis)

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

Returns a named tuple (; x, v, primal, dual_gap, status, traj_data, active_set) with:

x: the final iterate
v: the last vertex from the linear minimization oracle
primal: the final primal value f(x)
dual_gap: the final Frank-Wolfe gap
status: the ExecutionStatus with which the algorithm terminated
traj_data: a vector of trajectory information, each element being the output of callback_state.
active_set: the computed active set of vertices, of which the solution is a convex combination

source

FrankWolfe.build_reduced_problem — Method

build_reduced_problem(atoms::AbstractVector{<:AbstractVector}, hessian, weights, gradient, tolerance)

Given an active set formed by vectors , a (constant) Hessian and a gradient constructs a quadratic problem over the unit probability simplex that is equivalent to minimizing the original function over the convex hull of the active set. If λ are the barycentric coordinates of dimension equal to the cardinality of the active set, the objective function is:

f(λ) = reduced_linear^T λ + 0.5 * λ^T reduced_hessian λ

In the case where we find that the current iterate has a strong-Wolfe gap over the convex hull of the active set that is below the tolerance we return nothing (as there is nothing to do).

source

FrankWolfe.lp_separation_oracle — Method

Returns either a tuple (y, val) with y an atom from the active set satisfying the progress criterion and val the corresponding gap dot(y, direction) or the same tuple with y from the LMO.

inplace_loop controls whether the iterate type allows in-place writes. kwargs are passed on to the LMO oracle.

source

FrankWolfe.minimize_over_convex_hull! — Method

minimize_over_convex_hull!

Given a function f with gradient grad! and an active set active_set this function will minimize the function over the convex hull of the active set until the strong-wolfe gap over the active set is below tolerance.

It will either directly minimize over the convex hull using simplex gradient descent, or it will transform the problem to barycentric coordinates and minimize over the unit probability simplex using gradient descent or Nesterov's accelerated gradient descent.

source

FrankWolfe.projection_simplex_sort — Method

projection_simplex_sort(x; s=1.0)

Perform a projection onto the probability simplex of radius s using a sorting algorithm.

source

FrankWolfe.simplex_gradient_descent_over_convex_hull — Method

simplex_gradient_descent_over_convex_hull(f, grad!, gradient, active_set, tolerance, t, time_start, non_simplex_iter)

Minimizes an objective function over the convex hull of the active set until the Strong-Wolfe gap is below tolerance using simplex gradient descent.

source

FrankWolfe.simplex_gradient_descent_over_probability_simplex — Method

simplex_gradient_descent_over_probability_simplex

Minimizes an objective function over the unit probability simplex until the Strong-Wolfe gap is below tolerance using gradient descent.

source

FrankWolfe.strong_frankwolfe_gap — Method

Checks the strong Frank-Wolfe gap for the reduced problem.

source

FrankWolfe.strong_frankwolfe_gap_probability_simplex — Method

strong_frankwolfe_gap_probability_simplex

Compute the Strong-Wolfe gap over the unit probability simplex given a gradient.

source

Blended Pairwise Conditional Gradient

Corrective Frank-Wolfe

FrankWolfe.corrective_frank_wolfe — Method

corrective_frank_wolfe(f, grad!, lmo, corrective_step, active_set::AS; kwargs...)

A corrective Frank-Wolfe variant with corrective step defined by corrective_step.

A corrective FW algorithm alternates between a standard FW step at which a vertex is added to the active set and a corrective step at which an update is performed in the convex hull of current vertices. Examples of corrective FW algorithms include blended (pairwise) conditional gradients, away-step Frank-Wolfe, and fully-corrective Frank-Wolfe.

Common keyword arguments

These keyword arguments are common to most Frank-Wolfe variants.

Warning

line_search::LineSearchMethod: an object specifying the line search and its parameters (see LineSearchMethod)
momentum::Union{Real,Nothing}=nothing: constant momentum to apply to the gradient
epsilon::Real: absolute dual gap threshold at which the algorithm is interrupted
max_iteration::Integer: maximum number of iterations after which the algorithm is interrupted
print_iter::Integer: interval between two consecutive log prints, expressed in number of iterations
trajectory::Bool=false: whether to record the trajectory of algorithm states (through callbacks)
verbose::Bool: whether to print periodic logs (through callbacks)
memory_mode::MemoryEmphasis: an object dictating whether the algorithm operates in-place or out-of-place (see MemoryEmphasis)
gradient=nothing: pre-allocated container for the gradient
callback=nothing: function called on a CallbackState at each iteration
traj_data=[]: pre-allocated storage for the trajectory of algorithm states
timeout::Real=Inf: maximum time after which the algorithm is interrupted (in nanoseconds)
linesearch_workspace=nothing: pre-allocated workspace for the line search
dual_gap_compute_frequency::Integer=1: frequency of dual gap computation
x_container=nothing: pre-allocated container for the iterates of the algorithm
d_container=nothing: pre-allocated container for the update directions computed by the algorithm

Return

Returns a named tuple (; x, v, primal, dual_gap, status, traj_data, active_set) with:

x: the final iterate
v: the last vertex from the linear minimization oracle
primal: the final primal value f(x)
dual_gap: the final Frank-Wolfe gap
status: the ExecutionStatus with which the algorithm terminated
traj_data: a vector of trajectory information, each element being the output of callback_state.
active_set: the computed active set of vertices, of which the solution is a convex combination

source

FrankWolfe.AwayStep — Type

(Lazified) away-step for corrective Frank-Wolfe

source

FrankWolfe.HybridPairAwayStep — Type

Compares a pairwise and away step and chooses the one with most progress. The line search is computed for both steps. If one step incurs a drop, it is favored, otherwise the one decreasing the primal value the most is favored.

source

FrankWolfe.PairwiseStep — Type

Computes a classic pairwise step, i.e., d = v^FW - v^away.

source

FrankWolfe.prepare_corrective_step — Function

prepare_corrective_step(corrective_step::CS, f, grad!, gradient, active_set, t, lmo, primal, phi_value, tot_time) -> (should_compute_vertex, use_corrective)

should_compute_vertex is a boolean flag deciding whether a new vertex should be computed. use_corrective is a function that takes the vertex (the vertex is a valid new vertex only if shouldcomputevertex was true)

source

FrankWolfe.run_corrective_step — Function

run_corrective_step(corrective_step, f, grad!, gradient, x, active_set, t, lmo, line_search, linesearch_workspace, primal, phi_value, tot_time, callback, renorm_interval) -> (x, phi_value, primal)

Corrective step method specific to the CS correctivestep type. The corrective step can perform whatever update over the current active set, the function should return the new iterate a FW step should be run next with the boolean `shouldfwstepand compute a new dual gap estimatephivalue`.

source

Alternating Methods

Problems over intersections of convex sets, i.e.

\[\min_{x \in \bigcap_{i=1}^n P_i} f(x),\]

pose a challenge as one has to combine the information of two or more LMOs.

FrankWolfe.alternating_linear_minimization converts the problem over the intersection of sets into a series of subproblems over single sets. To find a point within the intersection, one minimizes both the distance to the iterates of the other subproblems and the original objective function.

FrankWolfe.alternating_projections solves feasibility problems over intersections of feasible regions.

FrankWolfe.alternating_linear_minimization — Method

alternating_linear_minimization(bc_algo::BlockCoordinateMethod, f, grad!, lmos::NTuple{N,LinearMinimizationOracle}, x0; ...) where {N}

Alternating Linear Minimization minimizes the objective f over the intersections of the feasible domains specified by lmos. The tuple x0 defines the initial points for each domain. Returns a tuple (x, v, primal, dual_gap, dist2, traj_data) with:

x cartesian product of final iterates
v cartesian product of last vertices of the LMOs
primal primal value f(x)
dual_gap final Frank-Wolfe gap
status is the termination status
dist2 is 1/2 of the sum of squared, pairwise distances between iterates
traj_data vector of trajectory information.

source

FrankWolfe.alternating_projections — Method

alternating_projections(lmos::NTuple{N,LinearMinimizationOracle}, x0; ...) where {N}

Computes a point in the intersection of feasible domains specified by lmos. Returns a named tuple (; x, v, dual_gap, dist2, status, traj_data) with:

x cartesian product of final iterates
v cartesian product of last vertices of the LMOs
dual_gap final Frank-Wolfe gap
dist2 is 1/2 * sum of squared, pairwise distances between iterates
status the ExecutionStatus of the algorithm
traj_data vector of trajectory information.

source

Difference-of-Convex Algorithm with Frank-Wolfe

FrankWolfe._boost_line_search_basic — Method

_boost_line_search_basic(objective_function, x_old, x_new; kwargs...)

Simple line search for the boosted DCA variant to find optimal convex combination.

Finds α* ∈ [0,1] that minimizes φ(α·xnew + (1-α)·xold) where φ is the original objective function. Uses coarse grid search followed by local refinement.

Arguments

objective_function: Original objective φ(x) = f(x) - g(x)
x_old: Previous iterate x_t
x_new: New iterate from DCA subproblem
max_iter: Maximum refinement iterations (default: 20)
shrink_factor: Step size reduction factor (default: 0.5)
min_step: Minimum step size for refinement (default: 1e-6)

Returns

α*: Optimal mixing parameter in [0,1]

Note

This is a simple implementation suitable for non-convex objectives. More sophisticated line search methods could be substituted.

source

FrankWolfe._dca_linearized_gradient! — Method

_dca_linearized_gradient!(storage, x, grad_f!, grad_g_at_xt)

Compute the gradient of the DCA linearized objective: ∇m(x) = ∇f(x) - ∇g(x_t).

Arguments

storage: Pre-allocated array to store the gradient result
x: Point where to evaluate the gradient
grad_f!: Gradient function for f (modifies storage in-place)
grad_g_at_xt: Precomputed gradient ∇g(x_t) (constant for the subproblem)

Side Effects

Modifies storage to contain ∇m(x) = ∇f(x) - ∇g(x_t)

source

FrankWolfe._dca_linearized_objective — Method

_dca_linearized_objective(x, f, g_value_at_xt, grad_g_at_xt, grad_g_dot_xt)

Compute the DCA linearized objective function m(x) = f(x) - g(xt) - ⟨∇g(xt), x - x_t⟩.

This is mathematically equivalent to: m(x) = f(x) - g(xt) - ⟨∇g(xt), x⟩ + ⟨∇g(xt), xt⟩

where the terms g(xt) and ⟨∇g(xt), x_t⟩ are constants for the optimization.

Arguments

x: Current point where to evaluate the function
f: Original convex function f
g_value_at_xt: Precomputed value g(x_t)
grad_g_at_xt: Precomputed gradient ∇g(x_t)
grad_g_dot_xt: Precomputed dot product ⟨∇g(xt), xt⟩

Returns

Value of the linearized objective m(x)

source

FrankWolfe.dca_fw — Method

dca_fw(f, grad_f!, g, grad_g!, lmo, x0; kwargs...)

Difference of Convex Algorithm with Frank-Wolfe (DCA-FW) for minimizing φ(x) = f(x) - g(x).

This algorithm solves non-convex optimization problems where the objective can be written as the difference of two convex functions. At each iteration, it:

Linearizes the concave part g around the current point x_t
Solves the convex subproblem minx f(x) - ⟨∇g(xt), x⟩ using Frank-Wolfe
Updates x_{t+1} to the subproblem solution

Algorithm Parameters

f: Convex function (differentiable)
grad_f!: In-place gradient of f
g: Convex function (differentiable) - will be subtracted from f
grad_g!: In-place gradient of g
lmo: Linear Minimization Oracle for the constraint set
x0: Initial point (must be feasible)

Optimization Parameters

epsilon: Convergence tolerance for DCA gap (default: 1e-7)
max_iteration: Maximum outer DCA iterations (default: 10000)
max_inner_iteration: Maximum inner Frank-Wolfe iterations (default: 1000)
line_search: Line search method for inner Frank-Wolfe (default: Agnostic())

Algorithm Variants

use_corrective_fw: Use corrective FW based on an active set instead of standard Frank-Wolfe (default: true)
use_dca_early_stopping: Enable early stopping based on DCA optimality (default: true)
boosted: Use boosted variant with convex combinations (default: false)

Output Control

verbose: Print algorithm progress (default: false)
verbose_inner: Print inner Frank-Wolfe progress (default: false)
print_iter: Print frequency for outer iterations (default: 10)
trajectory: Store trajectory data (default: false)
timeout: Maximum wall-clock time in seconds (default: Inf)

Memory Management

memory_mode: InplaceEmphasis() or OutplaceEmphasis() (default: InplaceEmphasis())
grad_f_workspace: Pre-allocated gradient workspace (optional)
grad_g_workspace: Pre-allocated gradient workspace (optional)
effective_grad_workspace: Pre-allocated effective gradient workspace (optional)

Advanced Options

callback: Custom callback function (optional)
traj_data: External trajectory storage (default: [])
linesearch_workspace: Workspace for line search (optional)
boost_line_search: Custom line search for boosted variant (optional)

Returns

Named tuple with:

x: Final iterate x_T
primal: Final objective value φ(xT) = f(xT) - g(x_T)
dca_gap: Final DCA gap estimate
iterations: Number of outer iterations performed
status: ExecutionStatus with which the algorithm terminated
traj_data: Trajectory data (if trajectory=true)

DCA Gap Definition

The DCA gap is defined as: DCAgap = φ(xt) - m(x{t+1}) + FWgap(x_{t+1})

where:

φ(xt) = f(xt) - g(x_t) is the original objective at current point
m(x_{t+1}) is the linearized objective at the new point
FWgap(x{t+1}) is the Frank-Wolfe gap at the new point

This gap measures progress toward DCA stationarity and converges to 0.

References

Pokutta, S. (2025). Scalable DC Optimization via Adaptive Frank-Wolfe Algorithms. https://arxiv.org/abs/2507.17545
Maskan, H., Hou, Y., Sra, S., and Yurtsever, A. (2025). Revisiting Frank-Wolfe for Structured Nonconvex Optimization. https://arxiv.org/abs/2503.08921
Pham Dinh, T., & Le Thi, H. A. (1997). Convex analysis approach to DC programming
Beck, A., & Guttmann-Beck, N. (2019). FW-DCA for non-convex regularized problems

source

FrankWolfe.make_dca_early_stopping_callback — Method

make_dca_early_stopping_callback(wrapped_callback, linearized_obj, grad_linearized_obj!, phi_value_at_xt)

Create a callback function that implements early stopping for DCA subproblems.

The early stopping criterion is based on the DCA optimality condition: m(x) ≤ φ(x_t) - ⟨∇m(x), x - v⟩

where:

m(x) is the linearized objective function
φ(xt) = f(xt) - g(xt) is the original objective at xt
x is the current iterate in the Frank-Wolfe subproblem
v is the Frank-Wolfe vertex (extreme point)

Arguments

wrapped_callback: Optional callback to wrap (can be nothing)
linearized_obj: Linearized objective function m(x)
grad_linearized_obj!: Gradient of linearized objective ∇m(x)
phi_value_at_xt: Value φ(xt) = f(xt) - g(x_t), constant for the subproblem

Returns

Callback function that returns false (stop) if criterion is satisfied, true otherwise

Mathematical Note

The criterion can be rearranged as: m(x) + ⟨∇m(x), x - v⟩ ≤ φ(x_t) This checks if the current point satisfies the DCA stationarity condition.

source