The ToySGD Benchmark

Task: control the learning rate and momentum of SGD in simple function approximation
Cost: log regret
Number of hyperparameters to control: two floats
State Information: remaining budget, gradient, current learning rate, current momentum
Noise Level: fairly small
Instance space: target function specification

This artificial benchmark uses functions like polynomials to test DAC controllers’ ability to control both the learning rate and the momentum of SGD. At each step until the cutoff, both hyperparameters are updated and one optimization step is taken. Since the global optimum of each function is known, the cost is measured as the current regret.
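To make the dynamics concrete, here is a minimal sketch in plain NumPy (illustrative only, not DACBench internals; the polynomial and the fixed hyperparameter values are made up):

    import numpy as np

    # Minimal sketch of the benchmark dynamics: SGD with momentum on a
    # 1D quadratic whose global optimum is known in closed form.
    f = np.polynomial.Polynomial([1.0, -0.5, 1.5])  # f(x) = 1.0 - 0.5x + 1.5x^2
    df = f.deriv()
    x_opt = 0.5 / (2 * 1.5)   # minimum of the quadratic, from f'(x) = 0
    f_opt = f(x_opt)

    x, velocity = -1.5, 0.0
    for step in range(10):
        # Hyperparameters a DAC controller would set anew at every step,
        # here fixed for simplicity (log10 values of -1.5 and -0.25).
        lr, momentum = 10 ** -1.5, 10 ** -0.25
        velocity = momentum * velocity - lr * df(x)
        x = x + velocity
        print(f"step {step}: log regret = {np.log10(f(x) - f_opt):.3f}")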

By using function approximation, this benchmark is computationally cheap, so it is likely a good entry point before tackling the full-size SGD or CMA-ES step-size benchmarks. It can also serve as a first test of whether a DAC method can handle multiple hyperparameters at the same time, as in the sketch below.
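A quick smoke test might look like this (assuming the standard gymnasium-style space attributes on the environment):

    from dacbench.benchmarks.toysgd_benchmark import ToySGDBenchmark

    # Build the environment from the default config and inspect the spaces
    # a controller has to handle: two continuous actions, four state entries.
    env = ToySGDBenchmark().get_environment()
    print(env.action_space)       # [log_learning_rate, log_momentum]
    print(env.observation_space)  # remaining_budget, gradient, learning_rate, momentum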

Benchmark for ToySGD.

class dacbench.benchmarks.toysgd_benchmark.ToySGDBenchmark(config_path=None, config=None)[source]

Bases: AbstractBenchmark

SGD Benchmark with toy functions.

get_environment()[source]

Return ToySGDEnv with the current configuration.

Returns:

ToySGDEnv: ToySGD environment

read_instance_set(test=False)[source]

Read the instance set from the path given in the config into a list.

Environment for SGD with toy functions.

class dacbench.envs.toysgd.ToySGDEnv(config)[source]

Bases: AbstractMADACEnv

Optimize toy functions with SGD + Momentum.

Action: [log_learning_rate, log_momentum] (log base 10)
State: Dict with entries remaining_budget, gradient, learning_rate, momentum
Reward: negative log regret of the current function value with respect to the true optimum

An instance can look as follows:

ID: 0
family: polynomial
order: 2
low: -2
high: 2
coefficients: [ 1.40501053 -0.59899755 1.43337392]
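Spelled out as plain Python (field names follow the listing above; the lowest-degree-first coefficient ordering is an assumption to verify against the instance files):

    import numpy as np

    # The polynomial instance from the listing above as a plain dict.
    instance = {
        "ID": 0,
        "family": "polynomial",
        "order": 2,
        "low": -2,    # lower bound of the instance's domain
        "high": 2,    # upper bound of the instance's domain
        "coefficients": [1.40501053, -0.59899755, 1.43337392],
    }

    # Assuming coefficients are ordered constant term first:
    f = np.polynomial.Polynomial(instance["coefficients"])
    print(f(0.0))  # 1.40501053, the constant term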

close()[source]

Close env.

get_default_reward(_)[source]

Default reward: negative log regret.
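As a sketch, with f_cur the current function value and f_opt the known optimum (base 10 is an assumption; the docstring does not name the log base):

    import numpy as np

    def negative_log_regret(f_cur: float, f_opt: float) -> float:
        # Regret is non-negative because f_opt is the global optimum;
        # smaller regret yields a larger (less negative) reward.
        return -np.log10(f_cur - f_opt)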

get_default_state(_)[source]

Default state: remaining_budget, gradient, learning_rate, momentum.

render(**kwargs)[source]

Render progress.

reset(seed=None, options=None)[source]

Reset environment.

Parameters:
  • seed (int) – seed

  • options (dict) – options dict (not used)

Returns:
  • np.array – Environment state

  • dict – Meta-info

step(action: float | tuple[float, float]) → tuple[dict[str, float], float, bool, dict][source]

Take one step with SGD.

Parameters:
  • action (float | Tuple[float, float]) – If scalar, action = log_learning_rate. If tuple, action = (log_learning_rate, log_momentum)

Returns:
  • state (Dict[str, float]) – State with entries: “remaining_budget”, “gradient”, “learning_rate”, “momentum”

  • reward (float)

  • terminated (bool)

  • truncated (bool)

  • info (Dict)
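Putting reset() and step() together, a full episode with a fixed action could look as follows (a sketch assuming the five-value gymnasium-style return listed above; the action values are arbitrary):

    import numpy as np
    from dacbench.benchmarks.toysgd_benchmark import ToySGDBenchmark

    env = ToySGDBenchmark().get_environment()
    state, info = env.reset(seed=0)

    terminated = truncated = False
    while not (terminated or truncated):
        # Both entries are log base 10: lr = 10**-1.5, momentum = 10**-0.25.
        action = np.array([-1.5, -0.25])
        state, reward, terminated, truncated, info = env.step(action)

    print(f"final reward (negative log regret): {reward:.3f}")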

class dacbench.envs.toysgd.ToySGDInstance(function: AbstractFunction)[source]

Bases: object

Toy SGD Instance.