smac.tae.dask_runner module

class smac.tae.dask_runner.DaskParallelRunner(single_worker: smac.tae.base.BaseRunner, n_workers: int, patience: int = 5, output_directory: Optional[str] = None, dask_client: Optional[distributed.client.Client] = None)[source]

Bases: smac.tae.base.BaseRunner

Interface to submit and collect jobs in a distributed fashion.

DaskParallelRunner is intended to comply with the bridge design pattern.

Nevertheless, to reduce the amount of code duplicated between the single and parallel implementations, DaskParallelRunner wraps a BaseRunner object, which is then executed in parallel on n_workers.

This class is therefore constructed by passing a BaseRunner that implements a run() method and is capable of doing so in a serial fashion. The DaskParallelRunner wrapper then uses dask to initialize n_workers instances of that BaseRunner, each of which actively waits for a RunInfo in order to produce a RunValue object.

To be more precise, the work model is as follows (a sketch of the full cycle appears after the Future note below):

1- The smbo.intensifier dictates “what” to run (a configuration/instance/seed) via a RunInfo object.

2- A tae_runner takes this RunInfo object and launches the task via tae_runner.submit_run(). In the case of DaskParallelRunner, n_workers receive a pickled copy of DaskParallelRunner.single_worker, each with a run() method coming from DaskParallelRunner.single_worker.run().

3- RunInfo objects are run in a distributed fashion, and their results are available locally to each worker. Each result is collected by DaskParallelRunner.get_finished_runs() and then passed to the SMBO.

4- Exceptions are also locally available to each worker and need to be collected.

Dask works with Future objects, which are managed via the DaskParallelRunner.client.
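A minimal sketch of this work model, assuming a serial BaseRunner instance serial_runner and a run_info produced elsewhere by the intensifier (both names are illustrative):

    from smac.tae.dask_runner import DaskParallelRunner

    # serial_runner is any BaseRunner implementation with a run() method
    parallel_runner = DaskParallelRunner(single_worker=serial_runner, n_workers=4)

    # Step 2: launch the task; this call returns immediately
    parallel_runner.submit_run(run_info)

    # Steps 3-4: collect results (and exceptions) from the workers
    while parallel_runner.pending_runs():
        parallel_runner.wait()  # block until at least one run completes
        for finished_info, run_value in parallel_runner.get_finished_runs():
            ...  # hand each (RunInfo, RunValue) pair back to the SMBO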

results
ta
stats
run_obj
par_factor
cost_for_crash
abort_on_first_run_crash
n_workers
futures
client
Parameters
  • single_worker (BaseRunner) – A runner to run in a distributed fashion

  • n_workers (int) – Number of workers to use for distributed run. Will be ignored if dask_client is not None.

  • patience (int) – How long to wait for workers to be available if one fails

  • output_directory (str, optional) – If given, this will be used for the dask worker directory and for storing server information. If a dask client is passed, it will only be used for storing server information as the worker directory must be set by the program/user starting the workers.

  • dask_client (dask.distributed.Client) – User-created dask client, can be used to start a dask cluster and then attach SMAC to it.
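For example, to attach SMAC to an existing dask cluster via dask_client (the scheduler address below is a placeholder and serial_runner is any BaseRunner):

    from dask.distributed import Client
    from smac.tae.dask_runner import DaskParallelRunner

    client = Client("tcp://scheduler-address:8786")  # placeholder address
    runner = DaskParallelRunner(
        single_worker=serial_runner,
        n_workers=4,          # ignored because dask_client is given
        dask_client=client,
    )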

_extract_completed_runs_from_futures() → None[source]

A run is over when a future's done() returns True. This function collects the completed futures and moves them from self.futures to self.results.

We make sure futures never exceed the capacity of the scheduler.
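A sketch of the collection step, assuming self.futures holds dask Future objects (an illustration, not necessarily the exact implementation):

    # Move every completed future's result from self.futures to self.results
    done = [f for f in self.futures if f.done()]
    for future in done:
        self.results.append(future.result())  # a (RunInfo, RunValue) pair
        self.futures.remove(future)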

_workers_available() → bool[source]

Query whether there are workers available, which means that there are resources to launch a dask job.
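One way such a query can be expressed with the dask.distributed client API, shown here as an assumption rather than the exact implementation (Client.nthreads() maps each worker address to its thread count):

    # There is room for another job if fewer futures are in flight
    # than there are worker threads in the cluster
    total_compute_power = sum(self.client.nthreads().values())
    return len(self.futures) < total_compute_power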

get_finished_runs() → List[Tuple[smac.runhistory.runhistory.RunInfo, smac.runhistory.runhistory.RunValue]][source]

This method returns every finished configuration, as a list of the results of exercising those configurations. Results keep accumulating in self.results until get_finished_runs() is called; at that point, self.results is emptied and all RunValue objects produced by running run() are returned.

Returns

List[Tuple[RunInfo, RunValue]] – A list of RunValue objects (and their respective RunInfo), that is, the results of executing each submitted configuration.
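Typical usage, draining all accumulated results (the RunValue attributes shown, status and cost, follow smac.runhistory.runhistory.RunValue):

    # Empties self.results as a side effect
    for run_info, run_value in runner.get_finished_runs():
        print(run_info.config, run_value.status, run_value.cost)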

num_workers() → int[source]

Total number of workers available. This number is dynamic, as more resources can be allocated.

pending_runs() → bool[source]

Whether or not there are configs still running. Generally, if the runner is serial, launching a run instantly returns its result. On parallel runners, there might be pending configurations left to complete.

run(config: ConfigSpace.configuration_space.Configuration, instance: str, cutoff: Optional[float] = None, seed: int = 12345, budget: Optional[float] = None, instance_specific: str = '0') → Tuple[smac.tae.StatusType, float, float, Dict][source]

This method only complies with the abstract parent class. In the parallel case, we call the single worker's run() method.

Parameters
  • config (Configuration) – dictionary param -> value

  • instance (string) – problem instance

  • cutoff (float, optional) – Wallclock time limit of the target algorithm. If no value is provided no limit will be enforced.

  • seed (int) – random seed

  • budget (float, optional) – A positive, real-valued number representing an arbitrary limit to the target algorithm. Handled by the target algorithm internally

  • instance_specific (str) – instance specific information (e.g., domain file or solution)

Returns

  • status (enum of StatusType (int)) – {SUCCESS, TIMEOUT, CRASHED, ABORT}

  • cost (float) – cost/regret/quality (None if not returned by TA)

  • runtime (float) – runtime (None if not returned by TA)

  • additional_info (dict) – all further additional run information
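A minimal direct call following the signature above; since run() delegates to the wrapped single worker, this behaves exactly like a serial run (config and the instance name are illustrative):

    status, cost, runtime, additional_info = runner.run(
        config=config,       # a ConfigSpace Configuration
        instance="inst1",    # illustrative instance name
        cutoff=10.0,
        seed=12345,
    )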

submit_run(run_info: smac.runhistory.runhistory.RunInfo) → None[source]

This function submits a configuration embedded in a run_info object and uses one of the workers to produce a result, which is available locally to that worker.

The execution of a configuration follows this procedure:

1- SMBO/intensifier generates a run_info.

2- SMBO calls submit_run so that a worker launches the run_info.

3- submit_run internally calls self.run(). It does so via a call to self.run_wrapper(), which contains common code that any run() method would otherwise have to implement, such as the capping check.

Child classes must implement a run() method. All results will only be available locally to each worker, so the main node needs to collect them.

Parameters

run_info (RunInfo) – An object containing the configuration and the necessary data to run it
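A sketch of submitting a run. The RunInfo fields shown follow smac.runhistory.runhistory.RunInfo, but they may vary between SMAC versions, so treat the construction as illustrative:

    from smac.runhistory.runhistory import RunInfo

    run_info = RunInfo(
        config=config,             # a ConfigSpace Configuration
        instance="inst1",
        instance_specific="0",
        seed=12345,
        cutoff=10.0,
        capped=False,
        budget=0.0,
    )
    runner.submit_run(run_info)    # returns immediately; collect results later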

wait() → None[source]

SMBO/intensifier might need to wait for runs to finish before making a decision. This method waits until at least one run completes.
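A sketch of how such a wait can be expressed with dask.distributed.wait, shown as an assumption about the mechanism rather than the exact implementation:

    from dask.distributed import wait as dask_wait

    if self.futures:
        # Block until the first pending future finishes
        dask_wait(self.futures, return_when="FIRST_COMPLETED")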