Algorithms#

Algorithms are the search strategies determining what configurations to evaluate next. In NePS, we provide a variety of pre-implemented algorithms and offer the possibility to implement custom algorithms. This chapter gives an overview of the different algorithms available in NePS and practical tips for their usage.

We distinguish between algorithms that use different types of information and strategies to guide the search process:

✅ = supported/necessary, ❌ = not supported, ✔️* = optional, click for details, ✖️* ignorable, click for details

Algorithm	Multi-Fidelity	Priors	Model-based
`Grid Search`	️️✖️*	❌	❌
`Random Search`	️️✖️*	✔️*	❌
`Bayesian Optimization`	️️✖️*	❌	✅
`Successive Halving`	✅	✔️*	❌
`ASHA`	✅	✔️*	❌
`Hyperband`	✅	✔️*	❌
`Asynch HB`	✅	✔️*	❌
`IfBO`	✅	✔️*	✅
`PiBO`	️️✖️*	✅	✅
`PriorBand`	✅	✅	✅

What is Multi-Fidelity Optimization?#

Multi-Fidelity (MF) optimization leverages the idea of running an AutoML problem on a small scale, which is cheaper and faster, and then using this information to train full-scale models. The low-fidelity runs could be on a smaller dataset, a smaller model, or for shorter training times. MF-algorithms then infer which configurations are likely to perform well on the full problem, before investing larger compute amounts.

Advantages of Multi-Fidelity

Parallelization: MF-algorithms can use the information from many parallel low-fidelity runs to guide the search in the few high-fidelity runs.
Exploration: By using low-fidelity runs, the optimizer can explore more of the search space.

Disadvantages of Multi-Fidelity

Variance: The performance of a configuration on a low-fidelity run might not correlate well with its performance on a high-fidelity run. This can result in misguided decisions.

We present a collection of MF-algorithms here and algorithms that combine MF with priors here.

What are Priors?#

Priors are used when there exists some information about the search space, that can be used to guide the optimization process. This information could come from expert domain knowledge or previous experiments. A Prior is provided in the form of a distribution over one dimension of the search space, with a mean (the suspected optimum) and a confidence level, or variance. We discuss how Priors can be included in your NePS-search space here.

Advantages of using Priors

Less compute: By providing a Prior, the optimizer can focus on the most promising regions of the search space, potentially saving a lot of compute.
More exploitation: By focusing on these regions, the optimizer might find a better final solution.

Disadvantages of using Priors

Less exploration: By focusing on these regions, the optimizer might miss out on other regions that could potentially be better.
Bad priors: If the Prior is not a good representation of the search space, the optimizer might deliver suboptimal results, compared to a search without Priors. The optimizers we provide in NePS are specifically designed to handle bad priors, but they still slow down the search process.

We present a collection of algorithms that use Priors here and algorithms that combine priors with Multi-Fidelity here.