Main.check_gradients

Extra-lazification

Sometimes the Frank-Wolfe algorithm will be run multiple times with slightly different settings under which vertices collected in a previous run are still valid.

The extra-lazification feature can be used for this purpose. It consists of a storage that can collect dropped vertices during a run, and the ability to use these vertices in another run, when they are not part of the current active set. The vertices that are part of the active set do not need to be duplicated in the extra-lazification storage. The extra-vertices can be used instead of calling the LMO when it is a relatively expensive operation.

using FrankWolfe
using Test
using LinearAlgebra

We will use a parameterized objective function $1/2 \|x - c\|^2$ over the unit simplex.

const n = 100
const center0 = 5.0 .+ 3 * rand(n)
f(x) = 0.5 * norm(x .- center0)^2
function grad!(storage, x)
    return storage .= x .- center0
end

grad! (generic function with 1 method)

The TrackingLMO will let us count how many real calls to the LMO are performed by a single run of the algorithm.

lmo = FrankWolfe.UnitSimplexOracle(4.3)
tlmo = FrankWolfe.TrackingLMO(lmo)
x0 = FrankWolfe.compute_extreme_point(lmo, randn(n));

Adding a vertex storage

FrankWolfe offers a simple FrankWolfe.DeletedVertexStorage storage type which has as parameter return_kth, the number of good directions to find before returning the best. return_kth larger than the number of vertices means that the best-aligned vertex will be found. return_kth = 1 means the first acceptable vertex (with the specified threhsold) is returned.

See FrankWolfe.DeletedVertexStorage

vertex_storage = FrankWolfe.DeletedVertexStorage(typeof(x0)[], 5)
tlmo.counter = 0

results = FrankWolfe.blended_pairwise_conditional_gradient(
    f,
    grad!,
    tlmo,
    x0,
    max_iteration=4000,
    verbose=true,
    lazy=true,
    epsilon=1e-5,
    add_dropped_vertices=true,
    extra_vertex_storage=vertex_storage,
)

(x = [0.40888360533486434, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.24668091286378452, 0.0, 0.0, 0.0, 0.3856705242607134, 0.0, 0.0, 0.0], v = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], primal = 2114.012843766959, dual_gap = 5.738928329890314e-6, traj_data = Any[], active_set = Tuple{Float64, FrankWolfe.ScaledHotVector{Float64}}[(0.003474953505605373, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.11153306199151813, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.10971865056036759, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.10675045717875149, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.10077225252962349, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.09508921054299171, [4.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.08969081959551475, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.3, 0.0, 0.0, 0.0]), (0.08567862184803596, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.07574179602096497, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.060901136381411786, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.057367654154368494, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 4.3, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.054959370032634396, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]), (0.04832201565821181, [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])])

The counter indicates the number of initial calls to the LMO. We will now construct different objective functions based on new centers, call the BPCG algorithm while accumulating vertices in the storage, in addition to warm-starting with the active set of the previous iteration. This allows for a "double-warmstarted" algorithm, reducing the number of LMO calls from one problem to the next.

active_set = results[end]
tlmo.counter

for iter in 1:10
    center = 5.0 .+ 3 * rand(n)
    f_i(x) = 0.5 * norm(x .- center)^2
    function grad_i!(storage, x)
        return storage .= x .- center
    end
    tlmo.counter = 0
    FrankWolfe.blended_pairwise_conditional_gradient(
        f_i,
        grad_i!,
        tlmo,
        active_set,
        max_iteration=4000,
        lazy=true,
        epsilon=1e-5,
        add_dropped_vertices=true,
        use_extra_vertex_storage=true,
        extra_vertex_storage=vertex_storage,
        verbose=false,
    )
    @info "Number of LMO calls in iter $iter: $(tlmo.counter)"
    @info "Vertex storage size: $(length(vertex_storage.storage))"
end

[ Info: Number of LMO calls in iter 1: 25
[ Info: Vertex storage size: 8
[ Info: Number of LMO calls in iter 2: 28
[ Info: Vertex storage size: 22
[ Info: Number of LMO calls in iter 3: 25
[ Info: Vertex storage size: 38
[ Info: Number of LMO calls in iter 4: 18
[ Info: Vertex storage size: 52
[ Info: Number of LMO calls in iter 5: 17
[ Info: Vertex storage size: 56
[ Info: Number of LMO calls in iter 6: 21
[ Info: Vertex storage size: 62
[ Info: Number of LMO calls in iter 7: 14
[ Info: Vertex storage size: 69
[ Info: Number of LMO calls in iter 8: 22
[ Info: Vertex storage size: 71
[ Info: Number of LMO calls in iter 9: 17
[ Info: Vertex storage size: 77
[ Info: Number of LMO calls in iter 10: 16
[ Info: Vertex storage size: 82

This page was generated using Literate.jl.