Main.check_gradients

Exact Optimization with Rational Arithmetic

This example can be found in section 4.3 in the paper. The package allows for exact optimization with rational arithmetic. For this, it suffices to set up the LMO to be rational and choose an appropriate step-size rule as detailed below. For the LMOs included in the package, this simply means initializing the radius with a rational-compatible element type, e.g., 1, rather than a floating-point number, e.g., 1.0. Given that numerators and denominators can become quite large in rational arithmetic, it is strongly advised to base the used rationals on extended-precision integer types such as BigInt, i.e., we use Rational{BigInt}.

The second requirement ensuring that the computation runs in rational arithmetic is a rational-compatible step-size rule. The most basic step-size rule compatible with rational optimization is the agnostic step-size rule with $\gamma_t = 2/(2 + t)$. With this step-size rule, the gradient does not even need to be rational as long as the atom computed by the LMO is of a rational type. Assuming these requirements are met, all iterates and the computed solution will then be rational.

using FrankWolfe
using LinearAlgebra

n = 100
k = n

x = fill(big(1) // 100, n)

f(x) = dot(x, x)
function grad!(storage, x)
    @. storage = 2 * x
end

grad! (generic function with 1 method)

pick feasible region radius needs to be integer or rational

lmo = FrankWolfe.ProbabilitySimplexOracle{Rational{BigInt}}(1)

FrankWolfe.ProbabilitySimplexOracle{Rational{BigInt}}(1//1)

compute some initial vertex

x0 = FrankWolfe.compute_extreme_point(lmo, zeros(n));

x, v, primal, dual_gap, trajectory = FrankWolfe.frank_wolfe(
    f,
    grad!,
    lmo,
    x0,
    max_iteration=k,
    line_search=FrankWolfe.Agnostic(),
    print_iter=k / 10,
    verbose=true,
    memory_mode=FrankWolfe.OutplaceEmphasis(),
);

println("\nOutput type of solution: ", eltype(x))


Vanilla Frank-Wolfe Algorithm.
MEMORY_MODE: FrankWolfe.OutplaceEmphasis() STEPSIZE: Agnostic(2) EPSILON: 1.0e-7 MAXITERATION: 100 TYPE: Rational{BigInt}
MOMENTUM: nothing GRADIENTTYPE: Nothing
LMO: FrankWolfe.ProbabilitySimplexOracle{Rational{BigInt}}

-------------------------------------------------------------------------------------------------
  Type     Iteration         Primal           Dual       Dual Gap           Time         It/sec
-------------------------------------------------------------------------------------------------
     I             1   1.000000e+00  -1.000000e+00   2.000000e+00   0.000000e+00            Inf
    FW            10   1.407407e-01  -1.407407e-01   2.814815e-01   4.857449e-01   2.058694e+01
    FW            20   6.842105e-02  -6.842105e-02   1.368421e-01   4.874273e-01   4.103176e+01
    FW            30   4.521073e-02  -4.521073e-02   9.042146e-02   4.891649e-01   6.132902e+01
    FW            40   3.376068e-02  -3.376068e-02   6.752137e-02   4.911179e-01   8.144685e+01
    FW            50   2.693878e-02  -2.693878e-02   5.387755e-02   4.931580e-01   1.013874e+02
    FW            60   2.241055e-02  -2.241055e-02   4.482109e-02   4.954011e-01   1.211140e+02
    FW            70   1.918565e-02  -1.918565e-02   3.837129e-02   4.981932e-01   1.405077e+02
    FW            80   1.677215e-02  -1.677215e-02   3.354430e-02   5.006451e-01   1.597938e+02
    FW            90   1.489804e-02  -1.489804e-02   2.979609e-02   5.030870e-01   1.788955e+02
    FW           100   1.340067e-02  -1.340067e-02   2.680135e-02   5.057572e-01   1.977233e+02
  Last           101   1.314422e-02  -1.236767e-02   2.551189e-02   5.066668e-01   1.993421e+02
-------------------------------------------------------------------------------------------------

Output type of solution: BigFloat

Another possible step-size rule is rationalshortstep which computes the step size by minimizing the smoothness inequality as $\gamma_t=\frac{\langle \nabla f(x_t),x_t-v_t\rangle}{2L||x_t-v_t||^2}$. However, as this step size depends on an upper bound on the Lipschitz constant $L$ as well as the inner product with the gradient $\nabla f(x_t)$, both have to be of a rational type.

@time x, v, primal, dual_gap, trajectory = FrankWolfe.frank_wolfe(
    f,
    grad!,
    lmo,
    x0,
    max_iteration=k,
    line_search=FrankWolfe.Shortstep(2 // 1),
    print_iter=k / 10,
    verbose=true,
    memory_mode=FrankWolfe.OutplaceEmphasis(),
);


Vanilla Frank-Wolfe Algorithm.
MEMORY_MODE: FrankWolfe.OutplaceEmphasis() STEPSIZE: Shortstep EPSILON: 1.0e-7 MAXITERATION: 100 TYPE: Rational{BigInt}
MOMENTUM: nothing GRADIENTTYPE: Nothing
LMO: FrankWolfe.ProbabilitySimplexOracle{Rational{BigInt}}

-------------------------------------------------------------------------------------------------
  Type     Iteration         Primal           Dual       Dual Gap           Time         It/sec
-------------------------------------------------------------------------------------------------
     I             1   1.000000e+00  -1.000000e+00   2.000000e+00   0.000000e+00            Inf
    FW            10   1.000000e-01  -1.000000e-01   2.000000e-01   4.077867e-01   2.452262e+01
    FW            20   5.000000e-02  -5.000000e-02   1.000000e-01   4.090682e-01   4.889161e+01
    FW            30   3.333333e-02  -3.333333e-02   6.666667e-02   4.104553e-01   7.308957e+01
    FW            40   2.500000e-02  -2.500000e-02   5.000000e-02   4.121292e-01   9.705693e+01
    FW            50   2.000000e-02  -2.000000e-02   4.000000e-02   4.141079e-01   1.207415e+02
    FW            60   1.666667e-02  -1.666667e-02   3.333333e-02   4.163936e-01   1.440944e+02
    FW            70   1.428571e-02  -1.428571e-02   2.857143e-02   4.189053e-01   1.671022e+02
    FW            80   1.250000e-02  -1.250000e-02   2.500000e-02   4.218492e-01   1.896412e+02
    FW            90   1.111111e-02  -1.111111e-02   2.222222e-02   4.249751e-01   2.117771e+02
    FW           100   1.000000e-02   1.000000e-02   1.889162e-78   4.284337e-01   2.334083e+02
  Last           100   1.000000e-02   1.000000e-02   2.159042e-78   4.291745e-01   2.330055e+02
-------------------------------------------------------------------------------------------------
  0.686799 seconds (1.65 M allocations: 92.688 MiB, 1.57% compilation time)

Note: at the last step, we exactly close the gap, finding the solution 1//n * ones(n)

This page was generated using Literate.jl.