smac.runner

Interfaces

class smac.runner.AbstractRunner(scenario, required_arguments=None)[source]

Bases: ABC

Interface class to handle the execution of SMAC configurations. This interface defines how to interact with the SMBO loop. The complexity of running a configuration, as well as handling the results, is abstracted away from SMBO via an AbstractRunner.

From the SMBO perspective, launching a configuration follows a submit/collect scheme:

  1. A trial is launched via submit_trial().

    • submit_trial() internally calls run_wrapper(), a method that contains processing steps common to all runners.

    • A class implementing AbstractRunner defines run(), which is the actual algorithm that translates a TrialInfo into a TrialValue, i.e. a configuration into an actual result.

  2. Completed runs are collected via iter_results(), which iterates over and consumes any finished trials.

  3. This interface also offers the method wait() as a mechanism to make sure we have enough data in the next iteration to make a decision. For example, the intensifier might not be able to select the next challenger until more results are available.
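
The submit/collect scheme above can be sketched in miniature. The TrialInfo/TrialValue dataclasses and SerialRunner below are simplified stand-ins for illustration only, not SMAC's real classes:

```python
from dataclasses import dataclass

# Simplified stand-ins for SMAC's TrialInfo/TrialValue -- illustration only.
@dataclass(frozen=True)
class TrialInfo:
    config: dict
    seed: int = 0

@dataclass
class TrialValue:
    cost: float

class SerialRunner:
    """Minimal serial runner following the submit/collect scheme."""

    def __init__(self):
        self._results_queue = []

    def run(self, info: TrialInfo) -> TrialValue:
        # Translate a TrialInfo into a TrialValue (the actual "algorithm").
        return TrialValue(cost=sum(info.config.values()))

    def submit_trial(self, info: TrialInfo) -> None:
        # Serial case: launching a trial immediately yields its result.
        self._results_queue.append((info, self.run(info)))

    def iter_results(self):
        # Consume (empty) the queue, yielding finished trials.
        while self._results_queue:
            yield self._results_queue.pop(0)

runner = SerialRunner()
runner.submit_trial(TrialInfo(config={"x": 1.0, "y": 2.0}))
results = list(runner.iter_results())
```

In the serial case wait() would be a no-op, since every submitted trial is already finished when submit_trial() returns.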

Parameters:
  • scenario (Scenario)

  • required_arguments (list[str]) – A list of required arguments, which are passed to the target function.

abstract count_available_workers()[source]

Returns the number of available workers.

Return type:

int

abstract is_running()[source]

Whether there are trials still running.

Generally, if the runner is serial, launching a trial instantly returns its result. On parallel runners, there might be pending configurations to complete.

Return type:

bool

abstract iter_results()[source]

Iterates over any finished trials and yields the results of the executed configurations. The runner keeps appending results to self._results_queue until this method is called; the queue is then emptied and all trial values produced by run() are yielded.

Returns:

An iterator over TrialInfo/TrialValue pairs, all of which have finished.

Return type:

Iterator[tuple[TrialInfo, TrialValue]]

property meta: dict[str, Any]

Returns the meta-data of the created object.

abstract run(config, instance=None, budget=None, seed=None)[source]

Runs the target function with a configuration on a single instance-budget-seed combination (aka trial).

Parameters:
  • config (Configuration) – Configuration to be passed to the target function.

  • instance (str | None, defaults to None) – The problem instance.

  • budget (float | None, defaults to None) – A positive, real-valued number representing an arbitrary limit to the target function, enforced by the target function internally.

  • seed (int | None, defaults to None)

Return type:

tuple[StatusType, float | list[float], float, dict]

Returns:

  • status (StatusType) – Status of the trial.

  • cost (float | list[float]) – Resulting cost(s) of the trial.

  • runtime (float) – The time the target function took to run.

  • additional_info (dict) – All further additional trial information.
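
As a rough sketch of this four-tuple contract (the SUCCESS string stands in for the real StatusType enum, and the quadratic cost is purely illustrative):

```python
import time

SUCCESS = "SUCCESS"  # stand-in for the real StatusType.SUCCESS enum member

def run(config: dict, instance=None, budget=None, seed=None):
    """Toy run(): evaluate a quadratic and report the four-tuple contract."""
    start = time.time()
    cost = (config.get("x", 0.0) - 1.0) ** 2      # single-objective cost
    runtime = time.time() - start                  # time the evaluation took
    additional_info = {"instance": instance, "budget": budget, "seed": seed}
    return SUCCESS, cost, runtime, additional_info

status, cost, runtime, info = run({"x": 3.0}, seed=0)
```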

run_wrapper(trial_info, **dask_data_to_scatter)[source]

Wrapper around run() that executes and checks the execution of a given config. This function encapsulates common handling and processing so that implementations of run() stay simple.

Parameters:
  • trial_info (TrialInfo) – Object that contains enough information to execute a configuration run in isolation.

  • dask_data_to_scatter (dict[str, Any]) – When a user scatters data from their local process to the distributed network, the data is distributed in a round-robin fashion, grouped by number of cores. Roughly speaking, the data is kept in memory on the workers so that it does not have to be (de-)serialized every time the target function is executed. This is very useful when, for example, the target function shares a big dataset across all of its invocations.

Return type:

tuple[TrialInfo, TrialValue]

Returns:

  • info (TrialInfo) – An object containing the configuration launched.

  • value (TrialValue) – Contains information about the status/performance of config.

abstract submit_trial(trial_info)[source]

This function submits a configuration embedded in a TrialInfo object and uses one of the workers to produce a result (the result will eventually be available in the self._results_queue FIFO).

This interface method is called by SMBO with the expectation that a worker will execute the function. What is executed is dictated by trial_info; how it is executed is decided by the child class that implements a run method.

Because submitting a configuration can be a serial or parallel endeavor, this method is expected to be implemented by a child class.

Parameters:

trial_info (TrialInfo) – An object containing the configuration launched.

Return type:

None

abstract wait()[source]

The SMBO/intensifier might need to wait for trials to finish before making a decision.

Return type:

None

class smac.runner.DaskParallelRunner(single_worker, patience=5, dask_client=None)[source]

Bases: AbstractRunner

Interface to submit and collect jobs in a distributed fashion. DaskParallelRunner is intended to comply with the bridge design pattern. Nevertheless, to reduce the amount of code duplicated between single and parallel implementations, DaskParallelRunner wraps an AbstractRunner object, which is then executed in parallel on n_workers.

This class is constructed by passing an AbstractRunner that implements a run method and is capable of doing so in a serial fashion. This wrapper class then uses dask to initialize n_workers instances of that AbstractRunner, which actively wait for a TrialInfo in order to produce a TrialValue object.

To be more precise, the work model is then:

  1. The intensifier dictates “what” to run (a configuration/instance/seed) via a TrialInfo object.

  2. An abstract runner takes this TrialInfo object and launches the task via submit_trial(). In the case of DaskParallelRunner, each of the n_workers receives a pickled DaskParallelRunner.single_worker, whose run method (DaskParallelRunner.single_worker.run()) performs the actual work.

  3. TrialInfo objects are run in a distributed fashion, and their results are available locally on each worker. The results are collected by iter_results() and then passed to SMBO.

  4. Exceptions are also locally available to each worker and need to be collected.

Dask works with Future objects, which are managed via the DaskParallelRunner.client.
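
This work model can be illustrated without dask. The sketch below uses a standard-library thread pool as a stand-in for the dask client and its Future objects; TinyParallelRunner and its run function are hypothetical, not SMAC API:

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def run(info: dict) -> float:
    # Stand-in for single_worker.run(): maps a config dict to a cost.
    return sum(info.values())

class TinyParallelRunner:
    """Illustrates the parallel work model: a thread pool stands in for
    the dask client, and its futures for dask's Future objects."""

    def __init__(self, n_workers: int = 2):
        self._pool = ThreadPoolExecutor(max_workers=n_workers)
        self._pending = []

    def submit_trial(self, info: dict) -> None:
        # Launch the trial; its result lives on the worker until collected.
        self._pending.append(self._pool.submit(run, info))

    def iter_results(self):
        # Collect finished futures and hand their results back.
        done = [f for f in self._pending if f.done()]
        self._pending = [f for f in self._pending if not f.done()]
        for f in done:
            yield f.result()

    def wait(self) -> None:
        # Block until at least one pending trial finishes.
        if self._pending:
            wait(self._pending, return_when=FIRST_COMPLETED)

runner = TinyParallelRunner()
runner.submit_trial({"x": 1.0})
runner.submit_trial({"x": 2.0, "y": 3.0})

collected = []
while len(collected) < 2:
    runner.wait()
    collected.extend(runner.iter_results())
```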

Parameters:
  • single_worker (AbstractRunner) – A runner to run in a distributed fashion. Will be distributed using n_workers.

  • patience (int, defaults to 5) – How long (in seconds) to wait for workers to become available if one fails.

  • dask_client (Client | None, defaults to None) – User-created dask client, which can be used to start a dask cluster and then attach SMAC to it. This will not be closed automatically and will have to be closed manually if provided explicitly. If none is provided (default), a local one will be created for you and closed upon completion.

__del__()[source]

Makes sure that when this object gets deleted, the client is terminated. This is only done if the client was created by the dask runner.

Return type:

None

close(force=False)[source]

Closes the client.

Return type:

None

count_available_workers()[source]

Total number of workers available. This number is dynamic as more resources can be allocated.

Return type:

int

is_running()[source]

Whether there are trials still running.

Generally, if the runner is serial, launching a trial instantly returns its result. On parallel runners, there might be pending configurations to complete.

Return type:

bool

iter_results()[source]

Iterates over any finished trials and yields the results of the executed configurations. The runner keeps appending results to self._results_queue until this method is called; the queue is then emptied and all trial values produced by run() are yielded.

Returns:

An iterator over TrialInfo/TrialValue pairs, all of which have finished.

Return type:

Iterator[tuple[TrialInfo, TrialValue]]

run(config, instance=None, budget=None, seed=None, **dask_data_to_scatter)[source]

Runs the target function with a configuration on a single instance-budget-seed combination (aka trial).

Parameters:
  • config (Configuration) – Configuration to be passed to the target function.

  • instance (str | None, defaults to None) – The problem instance.

  • budget (float | None, defaults to None) – A positive, real-valued number representing an arbitrary limit to the target function, enforced by the target function internally.

  • seed (int | None, defaults to None)

Return type:

tuple[StatusType, float | list[float], float, dict]

Returns:

  • status (StatusType) – Status of the trial.

  • cost (float | list[float]) – Resulting cost(s) of the trial.

  • runtime (float) – The time the target function took to run.

  • additional_info (dict) – All further additional trial information.

submit_trial(trial_info, **dask_data_to_scatter)[source]

This function submits a configuration embedded in a trial_info object and uses one of the workers to produce a result; the result is available locally on that worker.

The execution of a configuration follows this procedure:

  1. The SMBO/intensifier generates a TrialInfo.

  2. SMBO calls submit_trial so that a worker launches the trial_info.

  3. submit_trial internally calls self.run(). It does so via a call to run_wrapper(), which contains common code that any run method would otherwise have to implement.

All results are only available locally on each worker, so the main node needs to collect them.

Parameters:
  • trial_info (TrialInfo) – An object containing the configuration launched.

  • dask_data_to_scatter (dict[str, Any]) – When a user scatters data from their local process to the distributed network, the data is distributed in a round-robin fashion, grouped by number of cores. Roughly speaking, the data is kept in memory on the workers so that it does not have to be (de-)serialized every time the target function is executed. This is very useful when, for example, the target function shares a big dataset across all of its invocations.

Return type:

None

wait()[source]

The SMBO/intensifier might need to wait for trials to finish before making a decision.

Return type:

None

exception smac.runner.FirstRunCrashedException[source]

Bases: TargetAlgorithmAbortException

Exception indicating that the first run crashed (depending on options this could trigger an ABORT of SMAC).

exception smac.runner.TargetAlgorithmAbortException[source]

Bases: Exception

Exception indicating that the target function suggests an ABORT of SMAC, usually because it assumes that all further runs will surely fail.

class smac.runner.TargetFunctionRunner(scenario, target_function, required_arguments=None)[source]

Bases: AbstractSerialRunner

Class to execute target functions that are Python functions. Evaluates the function for a given configuration and resource limit.

The target function can either return a float (the loss), or a tuple with the first element being a float and the second being additional run information. In a multi-objective setting, the float value is replaced by a list of floats.
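
A hedged sketch of the accepted return forms (the function names and configurations below are illustrative only, not SMAC API):

```python
def tf_cost_only(config: dict, seed: int = 0) -> float:
    # Plain float: the loss.
    return (config["x"] - 1.0) ** 2

def tf_with_info(config: dict, seed: int = 0):
    # Tuple: (loss, additional run information dict).
    cost = (config["x"] - 1.0) ** 2
    return cost, {"seed": seed}

def tf_multi_objective(config: dict, seed: int = 0) -> list:
    # Multi-objective: a list of floats replaces the single loss.
    return [config["x"] ** 2, (config["x"] - 2.0) ** 2]
```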

Parameters:
  • target_function (Callable) – The target function.

  • scenario (Scenario)

  • required_arguments (list[str], defaults to None) – A list of required arguments, which are passed to the target function.

__call__(config, algorithm, algorithm_kwargs)[source]

Calls the algorithm, which is processed in the run method.

Return type:

float | list[float] | dict[str, float] | tuple[float, dict] | tuple[list[float], dict] | tuple[dict[str, float], dict]

property meta: dict[str, Any]

Returns the meta-data of the created object.

run(config, instance=None, budget=None, seed=None, **dask_data_to_scatter)[source]

Calls the target function with pynisher if an algorithm wall-time limit or memory limit is set. Otherwise, the function is called directly.

Parameters:
  • config (Configuration) – Configuration to be passed to the target function.

  • instance (str | None, defaults to None) – The problem instance.

  • budget (float | None, defaults to None) – A positive, real-valued number representing an arbitrary limit to the target function, enforced by the target function internally.

  • seed (int | None, defaults to None)

  • dask_data_to_scatter (dict[str, Any]) – These kwargs must be empty when dask is not used. When a user scatters data from their local process to the distributed network, the data is distributed in a round-robin fashion, grouped by number of cores. Roughly speaking, the data is kept in memory on the workers so that it does not have to be (de-)serialized every time the target function is executed. This is very useful when, for example, the target function shares a big dataset across all of its invocations.

Return type:

tuple[StatusType, float | list[float], float, dict]

Returns:

  • status (StatusType) – Status of the trial.

  • cost (float | list[float]) – Resulting cost(s) of the trial.

  • runtime (float) – The time the target function took to run.

  • additional_info (dict) – All further additional trial information.

Modules

abstract_runner

abstract_serial_runner

dask_runner

exceptions

target_function_runner

target_function_script_runner