smac.intensification.successive_halving module

class smac.intensification.successive_halving.SuccessiveHalving(stats: smac.stats.stats.Stats, traj_logger: smac.utils.io.traj_logging.TrajLogger, rng: numpy.random.mtrand.RandomState, instances: List[str], instance_specifics: Mapping[str, numpy.ndarray] = None, cutoff: Optional[float] = None, deterministic: bool = False, initial_budget: Optional[float] = None, max_budget: Optional[float] = None, eta: float = 3, num_initial_challengers: Optional[int] = None, run_obj_time: bool = True, n_seeds: Optional[int] = None, instance_order: Optional[str] = 'shuffle_once', adaptive_capping_slackfactor: float = 1.2, inst_seed_pairs: Optional[List[Tuple[str, int]]] = None, min_chall: int = 1, incumbent_selection: str = 'highest_executed_budget')[source]

Bases: smac.intensification.parallel_scheduling.ParallelScheduler

Races multiple challengers against an incumbent using the Successive Halving method.

The implementation follows the description in “BOHB: Robust and Efficient Hyperparameter Optimization at Scale” (Falkner et al. 2018). Supplementary reference: http://proceedings.mlr.press/v80/falkner18a/falkner18a-supp.pdf

The Successive Halving intensifier (and Hyperband) can operate on two kinds of budgets:

  1. ‘Instances’ as budget:

    When multiple instances are provided or when the run objective is “runtime”, this is the criterion used as the budget for successive halving iterations, i.e., the budget determines how many instances the challengers are evaluated on at a time. Top challengers for the next iteration are selected based on the combined performance across all instances used.

    If initial_budget and max_budget are not provided, they default to 1 and the total number of available instances, respectively.

  2. ‘Real-valued’ budget:

    This is used when only one instance is provided and the run objective is “quality”, i.e., the budget is a positive, real-valued number that can be passed to the target algorithm as an argument. It can control any aspect of the target algorithm, e.g., the number of epochs for training a neural network.

    initial_budget and max_budget are required parameters for this type of budget.
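The rules for the two budget kinds can be written down as a small helper. This is a minimal sketch of the behaviour described above, not the library's code; the function name `resolve_budgets` and its return format are hypothetical.

```python
from typing import Optional, Tuple


def resolve_budgets(
    n_instances: int,
    run_obj_time: bool,
    initial_budget: Optional[float] = None,
    max_budget: Optional[float] = None,
) -> Tuple[str, float, float]:
    """Return (budget_kind, initial_budget, max_budget) for the given setup."""
    if n_instances > 1 or run_obj_time:
        # 'Instances' as budget: missing bounds default to 1 and the instance count.
        low = 1 if initial_budget is None else initial_budget
        high = n_instances if max_budget is None else max_budget
        return "instances", float(low), float(high)
    # 'Real-valued' budget: both bounds must be given explicitly.
    if initial_budget is None or max_budget is None:
        raise ValueError(
            "initial_budget and max_budget are required for a real-valued budget"
        )
    return "real-valued", float(initial_budget), float(max_budget)


# Ten instances, quality objective: the budget counts instances.
instance_case = resolve_budgets(n_instances=10, run_obj_time=False)
# One instance, quality objective: the budget is e.g. a number of epochs.
real_valued_case = resolve_budgets(
    n_instances=1, run_obj_time=False, initial_budget=5, max_budget=50
)
```

Calling the helper with a single instance, a quality objective, and no bounds raises a `ValueError`, which mirrors the requirement stated for real-valued budgets.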

Examples for successive halving (and hyperband) can be found here:

  • Runtime objective and multiple instances (instances as budget): examples/spear_qcp/SMAC4AC_SH_spear_qcp.py

  • Quality objective and multiple instances (instances as budget): examples/BOHB4HPO_sgd_instances.py

  • Quality objective and single instance (real-valued budget): examples/BOHB4HPO_mlp.py

This class instantiates _SuccessiveHalving objects as needed, that is, whenever doing so prevents workers from being idle. The actual logic that implements the Successive Halving method lies in the _SuccessiveHalving class.

Parameters
  • stats (smac.stats.stats.Stats) – stats object

  • traj_logger (smac.utils.io.traj_logging.TrajLogger) – TrajLogger object to log all new incumbents

  • rng (np.random.RandomState) –

  • instances (typing.List[str]) – list of all instance ids

  • instance_specifics (typing.Mapping[str,np.ndarray]) – mapping from instance name to instance specific string

  • cutoff (typing.Optional[float]) – cutoff of TA runs

  • deterministic (bool) – whether the TA is deterministic or not

  • initial_budget (typing.Optional[float]) – minimum budget allowed for 1 run of successive halving

  • max_budget (typing.Optional[float]) – maximum budget allowed for 1 run of successive halving

  • eta (float) – ‘halving’ factor after each iteration in a successive halving run. Defaults to 3

  • num_initial_challengers (typing.Optional[int]) – number of challengers to consider for the initial budget. If None, calculated internally

  • run_obj_time (bool) – whether the run objective is runtime or not (if true, apply adaptive capping)

  • n_seeds (typing.Optional[int]) – Number of seeds to use, if TA is not deterministic. Defaults to None, i.e., seed is set as 0

  • instance_order (typing.Optional[str]) – how to order instances. Can be set to: [None, shuffle_once, shuffle]

      • None - use as is given by the user

      • shuffle_once - shuffle once and use across all SH runs (default)

      • shuffle - shuffle before every SH run

  • adaptive_capping_slackfactor (float) – slack factor of adaptive capping (factor * adaptive cutoff)

  • inst_seed_pairs (typing.List[typing.Tuple[str, int]], optional) – Do not set this argument, it will only be used by hyperband!

  • min_chall (int) – minimal number of challengers to be considered (even if time_bound is exhausted earlier). This class will raise an exception if a value larger than 1 is passed.

  • incumbent_selection (str) – How to select the incumbent in successive halving. Only active for real-valued budgets. Can be set to: [highest_executed_budget, highest_budget, any_budget]

      • highest_executed_budget - incumbent is the best in the highest budget run so far (default)

      • highest_budget - incumbent is selected only based on the highest budget

      • any_budget - incumbent is the best on any budget, i.e., best performance regardless of budget

_add_new_instance(num_workers: int) → bool[source]

Decides if it is possible to add a new intensifier instance, and adds it. If a new intensifier instance is added, True is returned, else False.

Parameters

num_workers (int) – the maximum number of workers available at a given time.

Returns

Whether or not a successive halving instance was added

Return type

bool
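A minimal sketch of this decision follows. It assumes that at most one intensifier instance is kept per available worker; the condition, the class `ToyScheduler`, and its attribute names are illustrative assumptions and are not taken from the library.

```python
from typing import List


class ToyScheduler:
    """Hypothetical stand-in for the scheduler; not the library class."""

    def __init__(self) -> None:
        self.intensifier_instances: List[str] = []

    def add_new_instance(self, num_workers: int) -> bool:
        # Assumption: never hold more intensifier instances than workers.
        if len(self.intensifier_instances) >= num_workers:
            return False
        self.intensifier_instances.append(f"sh-{len(self.intensifier_instances)}")
        return True


scheduler = ToyScheduler()
# With two workers, the third request is refused.
added = [scheduler.add_new_instance(num_workers=2) for _ in range(3)]
```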

_get_intensifier_ranking(intensifier: smac.intensification.abstract_racer.AbstractRacer) → Tuple[int, int][source]

Given an intensifier, returns how advanced it is. This metric is used to determine what priority to assign to the intensifier.

Parameters

intensifier (AbstractRacer) – Intensifier to rank based on run progress

Returns

  • ranking (int) – the higher this number, the faster the intensifier will get the running resources. For hyperband we can use the sh_intensifier stage, for example

  • tie_breaker (int) – The configurations that have been launched to break ties. For example, in the case of Successive Halving it can be the number of configurations launched
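One way to turn such (ranking, tie_breaker) pairs into a scheduling order is to sort them as tuples. The sketch below uses made-up values and assumes that a larger tie_breaker is preferred when rankings are equal; that preference is an assumption, not something stated above.

```python
from typing import Dict, List, Tuple

# Hypothetical (ranking, tie_breaker) pairs keyed by an intensifier id.
rankings: Dict[int, Tuple[int, int]] = {
    0: (1, 4),  # stage 1, 4 configurations launched
    1: (2, 1),  # stage 2, 1 configuration launched
    2: (1, 7),  # stage 1, 7 configurations launched
}


def priority_order(scores: Dict[int, Tuple[int, int]]) -> List[int]:
    """Order intensifier ids from highest to lowest priority."""
    # Tuples compare element-wise, so tie_breaker only matters
    # when two intensifiers share the same ranking.
    return sorted(scores, key=lambda key: scores[key], reverse=True)


order = priority_order(rankings)
```

Here the intensifier in the most advanced stage comes first, and the two stage-1 intensifiers are separated by their tie-breaker.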

class smac.intensification.successive_halving._SuccessiveHalving(stats: smac.stats.stats.Stats, traj_logger: smac.utils.io.traj_logging.TrajLogger, rng: numpy.random.mtrand.RandomState, instances: List[str], instance_specifics: Mapping[str, numpy.ndarray] = None, cutoff: Optional[float] = None, deterministic: bool = False, initial_budget: Optional[float] = None, max_budget: Optional[float] = None, eta: float = 3, _all_budgets: Optional[List[float]] = None, _n_configs_in_stage: Optional[List[int]] = None, num_initial_challengers: Optional[int] = None, run_obj_time: bool = True, n_seeds: Optional[int] = None, instance_order: Optional[str] = 'shuffle_once', adaptive_capping_slackfactor: float = 1.2, inst_seed_pairs: Optional[List[Tuple[str, int]]] = None, min_chall: int = 1, incumbent_selection: str = 'highest_executed_budget', identifier: int = 0)[source]

Bases: smac.intensification.abstract_racer.AbstractRacer

Races multiple challengers against an incumbent using the Successive Halving method.

This class contains the logic to implement “BOHB: Robust and Efficient Hyperparameter Optimization at Scale” (Falkner et al. 2018). Supplementary reference: http://proceedings.mlr.press/v80/falkner18a/falkner18a-supp.pdf

The SuccessiveHalving class can create multiple _SuccessiveHalving objects, to allow parallelism in the method (up to the number of workers available). The user interface is expected to be SuccessiveHalving, yet this class (_SuccessiveHalving) contains the actual single worker implementation of the BOHB method.

The Successive Halving intensifier (and Hyperband) can operate on two kinds of budgets:

  1. ‘Instances’ as budget:

    When multiple instances are provided or when the run objective is “runtime”, this is the criterion used as the budget for successive halving iterations, i.e., the budget determines how many instances the challengers are evaluated on at a time. Top challengers for the next iteration are selected based on the combined performance across all instances used.

    If initial_budget and max_budget are not provided, they default to 1 and the total number of available instances, respectively.

  2. ‘Real-valued’ budget:

    This is used when only one instance is provided and the run objective is “quality”, i.e., the budget is a positive, real-valued number that can be passed to the target algorithm as an argument. It can control any aspect of the target algorithm, e.g., the number of epochs for training a neural network.

    initial_budget and max_budget are required parameters for this type of budget.

Parameters
  • stats (smac.stats.stats.Stats) – stats object

  • traj_logger (smac.utils.io.traj_logging.TrajLogger) – TrajLogger object to log all new incumbents

  • rng (np.random.RandomState) –

  • instances (typing.List[str]) – list of all instance ids

  • instance_specifics (typing.Mapping[str,np.ndarray]) – mapping from instance name to instance specific string

  • cutoff (typing.Optional[float]) – cutoff of TA runs

  • deterministic (bool) – whether the TA is deterministic or not

  • initial_budget (typing.Optional[float]) – minimum budget allowed for 1 run of successive halving

  • max_budget (typing.Optional[float]) – maximum budget allowed for 1 run of successive halving

  • eta (float) – ‘halving’ factor after each iteration in a successive halving run. Defaults to 3

  • _all_budgets (typing.Optional[typing.List[float]] = None) – Used internally when HB uses SH as a subroutine

  • _n_configs_in_stage (typing.Optional[typing.List[int]] = None) – Used internally when HB uses SH as a subroutine

  • num_initial_challengers (typing.Optional[int]) – number of challengers to consider for the initial budget. If None, calculated internally

  • run_obj_time (bool) – whether the run objective is runtime or not (if true, apply adaptive capping)

  • n_seeds (typing.Optional[int]) – Number of seeds to use, if TA is not deterministic. Defaults to None, i.e., seed is set as 0

  • instance_order (typing.Optional[str]) – how to order instances. Can be set to: [None, shuffle_once, shuffle]

      • None - use as is given by the user

      • shuffle_once - shuffle once and use across all SH runs (default)

      • shuffle - shuffle before every SH run

  • adaptive_capping_slackfactor (float) – slack factor of adaptive capping (factor * adaptive cutoff)

  • inst_seed_pairs (typing.List[typing.Tuple[str, int]], optional) – Do not set this argument, it will only be used by hyperband!

  • min_chall (int) – minimal number of challengers to be considered (even if time_bound is exhausted earlier). This class will raise an exception if a value larger than 1 is passed.

  • incumbent_selection (str) – How to select the incumbent in successive halving. Only active for real-valued budgets. Can be set to: [highest_executed_budget, highest_budget, any_budget]

      • highest_executed_budget - incumbent is the best in the highest budget run so far (default)

      • highest_budget - incumbent is selected only based on the highest budget

      • any_budget - incumbent is the best on any budget, i.e., best performance regardless of budget

  • identifier (int) – Adds a numerical identifier on this SH instance. Used for debug and tagging logger messages properly

_compare_configs(incumbent: ConfigSpace.configuration_space.Configuration, challenger: ConfigSpace.configuration_space.Configuration, run_history: smac.runhistory.runhistory.RunHistory, log_traj: bool = True) → Optional[ConfigSpace.configuration_space.Configuration][source]

Compares the challenger with current incumbent and returns the best configuration, based on the given incumbent selection design.

Parameters
Returns

incumbent configuration

Return type

typing.Optional[Configuration]
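The three incumbent selection designs can be illustrated on toy data. The sketch below is one literal reading of the incumbent_selection options, applied to a flat list of run records; the record format and the function `select_incumbent` are hypothetical, and the library's own comparison works on an incumbent/challenger pair and a RunHistory rather than on such a list.

```python
from typing import List, Optional, Tuple

# (config name, budget, cost); lower cost is better.
Record = Tuple[str, float, float]


def select_incumbent(
    records: List[Record], selection: str, max_budget: float
) -> Optional[str]:
    """Pick an incumbent name from toy run records under one selection design."""
    if selection == "any_budget":
        # Best cost regardless of the budget it was measured on.
        candidates = records
    elif selection == "highest_budget":
        # Only runs on the maximum budget are eligible.
        candidates = [r for r in records if r[1] == max_budget]
    elif selection == "highest_executed_budget":
        # Only runs on the highest budget executed so far are eligible.
        top = max((r[1] for r in records), default=None)
        candidates = [r for r in records if r[1] == top]
    else:
        raise ValueError(f"unknown incumbent_selection: {selection}")
    if not candidates:
        return None
    return min(candidates, key=lambda r: r[2])[0]


records = [("a", 1.0, 0.20), ("b", 1.0, 0.50), ("b", 3.0, 0.35), ("c", 3.0, 0.40)]
picks = {
    design: select_incumbent(records, design, max_budget=9.0)
    for design in ("highest_executed_budget", "highest_budget", "any_budget")
}
```

In this toy data no run has reached the maximum budget of 9.0 yet, so the highest_budget design has no eligible incumbent, while the other two designs disagree on which configuration is best.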

_compare_configs_across_budgets(challenger: ConfigSpace.configuration_space.Configuration, incumbent: ConfigSpace.configuration_space.Configuration, run_history: smac.runhistory.runhistory.RunHistory, log_traj: bool = True) → Optional[ConfigSpace.configuration_space.Configuration][source]

Compares the challenger with the current incumbent on any budget.

Parameters
Returns

incumbent configuration

Return type

typing.Optional[Configuration]

_count_running_instances_for_challenger(run_history: smac.runhistory.runhistory.RunHistory) → int[source]

The intensifiers are called in a sequential manner. In each iteration, only one configuration can be returned at a time; for that reason, self.running_challenger records that more instance/seed pairs still need to be launched for a given config.

This procedure counts the number of running instances/seed pairs for the current running challenger

_get_pending_instances_for_stage(run_history: smac.runhistory.runhistory.RunHistory) → int[source]

When running SH, M configs might require N instances. Before moving to the next stage, we need to make sure that all MxN jobs are completed

We use the run tracker to make sure we processed all configurations.

Parameters

run_history (RunHistory) – stores all runs we ran so far

Returns

All the instances that have not yet been processed

Return type

int
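The M x N bookkeeping can be pictured with a simple tracker. This is a sketch only: the dictionary layout (a (config, instance) key mapped to a "processed" flag) is a hypothetical simplification and not the library's run tracker.

```python
from typing import Dict, Tuple

# Hypothetical tracker: (config, instance) -> True once the run was processed.
RunTracker = Dict[Tuple[str, str], bool]


def count_pending(run_tracker: RunTracker) -> int:
    """Count the jobs of the current stage that are not yet processed."""
    return sum(1 for processed in run_tracker.values() if not processed)


configs = ["c1", "c2", "c3"]      # M = 3 configurations
instances = ["i1", "i2"]          # N = 2 instances
tracker: RunTracker = {(c, i): False for c in configs for i in instances}

# Mark four of the six M x N jobs as processed.
for key in [("c1", "i1"), ("c1", "i2"), ("c2", "i1"), ("c3", "i1")]:
    tracker[key] = True

pending = count_pending(tracker)
stage_complete = pending == 0
```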

_init_sh_params(initial_budget: Optional[float], max_budget: Optional[float], eta: float, num_initial_challengers: Optional[int] = None, _all_budgets: Optional[List[float]] = None, _n_configs_in_stage: Optional[List[int]] = None) → None[source]

Initializes the Successive Halving parameters.

Parameters
  • initial_budget (typing.Optional[float]) – minimum budget allowed for 1 run of successive halving

  • max_budget (typing.Optional[float]) – maximum budget allowed for 1 run of successive halving

  • eta (float) – ‘halving’ factor after each iteration in a successive halving run

  • num_initial_challengers (typing.Optional[int]) – number of challengers to consider for the initial budget

  • _all_budgets (typing.Optional[typing.List[float]] = None) – Used internally when HB uses SH as a subroutine

  • _n_configs_in_stage (typing.Optional[typing.List[int]] = None) – Used internally when HB uses SH as a subroutine
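To make the roles of initial_budget, max_budget and eta concrete, the following sketch derives a stage schedule in the usual successive halving way: budgets grow by a factor of eta per stage, and the number of configurations shrinks by the same factor. It is an illustration under that assumption; the function name `sh_schedule`, the rounding choices, and the small epsilon are not taken from the library.

```python
import math
from typing import List, Tuple


def sh_schedule(
    initial_budget: float, max_budget: float, eta: float
) -> Tuple[List[float], List[int]]:
    """Return (budget per stage, number of configurations per stage)."""
    if not 0 < initial_budget <= max_budget:
        raise ValueError("need 0 < initial_budget <= max_budget")
    if eta <= 1:
        raise ValueError("eta must be greater than 1")
    # Number of halving steps that fit between the two budgets; the small
    # epsilon guards against floating-point error in the log ratio.
    ratio = math.log(max_budget / initial_budget) / math.log(eta)
    n_steps = int(math.floor(ratio + 1e-9))
    # Budgets grow by a factor of eta per stage and end exactly at max_budget.
    budgets = [max_budget / eta ** (n_steps - i) for i in range(n_steps + 1)]
    # Start with eta**n_steps configurations and keep about 1/eta per stage.
    n_initial = int(eta ** n_steps)
    n_configs = [max(1, int(round(n_initial / eta ** i))) for i in range(n_steps + 1)]
    return budgets, n_configs


budgets, n_configs = sh_schedule(initial_budget=1, max_budget=9, eta=3)
```

With these inputs the schedule has three stages: nine configurations on budget 1, three on budget 3, and one on budget 9.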

_launched_all_configs_for_current_stage(run_history: smac.runhistory.runhistory.RunHistory) → bool[source]

This procedure queries whether the sum of currently finished configs and running configs is sufficient for the current stage. If more configs are needed, it will return False.

Parameters

run_history (RunHistory) – stores all runs we ran so far

Returns

Whether or not to launch more configurations/instances/seed pairs

Return type

bool

_top_k(configs: List[ConfigSpace.configuration_space.Configuration], run_history: smac.runhistory.runhistory.RunHistory, k: int) → List[ConfigSpace.configuration_space.Configuration][source]

Selects the top ‘k’ configurations from the given list based on their performance.

This retrieves the performance for each configuration from the runhistory and checks that the highest budget they’ve been evaluated on is the same for each of the configurations.

Parameters
Returns

top challenger configurations, sorted in increasing costs

Return type

typing.List[Configuration]
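The selection step, including the check that all configurations were evaluated on the same highest budget, can be sketched on toy data. The `ToyHistory` mapping below is a hypothetical stand-in for the run history, not the RunHistory class.

```python
from typing import Dict, List, Tuple

# Hypothetical history: config name -> (highest budget evaluated, cost there).
ToyHistory = Dict[str, Tuple[float, float]]


def top_k(configs: List[str], history: ToyHistory, k: int) -> List[str]:
    """Return the k lowest-cost configurations, sorted by increasing cost."""
    budgets = {history[c][0] for c in configs}
    if len(budgets) > 1:
        # Costs measured on different budgets are not comparable.
        raise ValueError(
            f"configurations were evaluated on different budgets: {sorted(budgets)}"
        )
    return sorted(configs, key=lambda c: history[c][1])[:k]


history: ToyHistory = {"a": (3.0, 0.42), "b": (3.0, 0.17), "c": (3.0, 0.30)}
best_two = top_k(["a", "b", "c"], history, k=2)
```

If one of the configurations had only been evaluated on a lower budget, the helper would raise instead of silently ranking costs that are not comparable.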

_update_stage(run_history: smac.runhistory.runhistory.RunHistory) → None[source]

Update tracking information for a new stage/iteration and update statistics. This method is called to initialize stage variables and after all configurations of a successive halving stage are completed.

Parameters

run_history (smac.runhistory.runhistory.RunHistory) – stores all runs we ran so far

get_next_run(challengers: Optional[List[ConfigSpace.configuration_space.Configuration]], incumbent: ConfigSpace.configuration_space.Configuration, chooser: Optional[smac.optimizer.epm_configuration_chooser.EPMChooser], run_history: smac.runhistory.runhistory.RunHistory, repeat_configs: bool = True, num_workers: int = 1) → Tuple[smac.intensification.abstract_racer.RunInfoIntent, smac.runhistory.runhistory.RunInfo][source]

Selects which challenger to use based on the iteration stage and sets the iteration parameters. The first iteration chooses configurations from the chooser or the input challengers, while later iterations pick the top configurations from the previously selected challengers in that iteration.

Parameters
Returns

  • intent (RunInfoIntent) – Indicator of how to consume the RunInfo object

  • run_info (RunInfo) – An object that encapsulates the minimum information to evaluate a configuration
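The stage-wise flow (fresh challengers in the first stage, the top survivors afterwards) is the core of successive halving. The sketch below runs one such bracket sequentially on a toy objective; it ignores parallel workers, RunInfo objects and the run history, and the function names and the objective are illustrative assumptions.

```python
from typing import Callable, List


def successive_halving(
    challengers: List[float],
    evaluate: Callable[[float, float], float],
    budgets: List[float],
    eta: float,
) -> float:
    """Run one successive halving bracket and return the surviving config."""
    survivors = list(challengers)
    for stage, budget in enumerate(budgets):
        # Evaluate every survivor on the current budget (lower cost is better).
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        if stage == len(budgets) - 1:
            return ranked[0]
        # Later stages only see the top 1/eta of the previous stage.
        survivors = ranked[: max(1, int(len(ranked) / eta))]
    return survivors[0]


def toy_cost(cfg: float, budget: float) -> float:
    # Toy objective: distance to 0.3 plus a term that shrinks with the budget.
    return abs(cfg - 0.3) + 1.0 / budget


challengers = [i / 8 for i in range(9)]  # 0.0, 0.125, ..., 1.0
winner = successive_halving(challengers, toy_cost, budgets=[1.0, 3.0, 9.0], eta=3)
```

Nine challengers start on the smallest budget, three survive to the second stage, and a single configuration is evaluated on the largest budget.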

process_results(run_info: smac.runhistory.runhistory.RunInfo, incumbent: Optional[ConfigSpace.configuration_space.Configuration], run_history: smac.runhistory.runhistory.RunHistory, time_bound: float, result: smac.runhistory.runhistory.RunValue, log_traj: bool = True) → Tuple[ConfigSpace.configuration_space.Configuration, float][source]

The intensifier stage is updated based on the results/status of a configuration execution. An incumbent is also determined.

Parameters
  • run_info (RunInfo) – A RunInfo containing the configuration that was evaluated

  • incumbent (typing.Optional[Configuration]) – Best configuration seen so far

  • run_history (RunHistory) – stores all runs we ran so far

  • time_bound (float) – time in [sec] available to perform intensify

  • result (RunValue) – Contains the result (status and other metadata) of exercising a challenger/incumbent.

  • log_traj (bool) – Whether to log changes of incumbents in trajectory

Returns

  • incumbent (Configuration) – current (maybe new) incumbent configuration

  • inc_perf (float) – empirical performance of incumbent configuration