Options and file formats

SMAC's optimization process is configured through several kinds of options and files:

Mandatory:
  • Commandline-options, with which SMAC is called directly (not needed if SMAC is used within Python).

  • Scenario-options, that are specified via a Scenario-object, either directly in the Python-code or by using a scenario-file.

  • A Parameter Configuration Space (PCS), that provides the legal ranges of parameters to optimize, their types (e.g. int or float) and their default values.

Optional:
  • Instance- and feature-files, that list the instances and features to optimize upon.

SMAC Options

The basic command line options are described in Basic Usage. The options are separated into three groups, Main Options, SMAC Options and Scenario Options. See the Main and SMAC Options below. Find the Scenario Options in the next section.

Main Options:

hydra_iterations

Number of hydra iterations. Only active if mode is set to Hydra. Default: 3.

hydra_validation

Set on which incumbents are validated. valX denotes a validation set of size training_set * 0.X. Default: train.

incumbents_per_round

Number of configurations to keep per psmac/hydra iteration. Default: 1.

mode

Configuration mode. Default: SMAC4AC.

n_optimizers

Number of optimizers to run in parallel per psmac/hydra iteration. Default: 1.

psmac_validate

Validate all psmac configurations.

random_configuration_chooser

Path to a Python module containing a class RandomConfigurationChooserImpl implementing the interface of RandomConfigurationChooser.

restore_state

Path to directory with SMAC-files.

scenario_file

Scenario file in AClib format.

seed

Random Seed. Default: 1.

verbose_level

Verbosity level. Default: 20.

SMAC Options:

abort_on_first_run_crash

If true, SMAC will abort if the first run of the target algorithm crashes. Default: True.

acq_opt_challengers

Number of challengers returned by acquisition function optimization. Also influences the number of randomly sampled configurations used to optimize the acquisition function. Default: 5000.

always_race_default

Race new incumbents always against default configuration.

hydra_iterations

Number of hydra iterations. Only active if mode is set to Hydra. Default: 3.

input_psmac_dirs

For parallel SMAC, multiple output-directories are used.

intens_adaptive_capping_slackfactor

Slack factor of adaptive capping (factor * adaptive cutoff). Only active if obj is runtime. If set to a very large number, it practically deactivates adaptive capping. Default: 1.2.

intens_min_chall

Minimal number of challengers to be considered in each intensification run (> 1). Setting it to 1 in combination with a very small intensification-percentage deactivates randomly sampled configurations (and hence, extrapolation of the random forest will be an issue). Default: 2.

intensification_percentage

The fraction of time to be used on intensification (versus choice of next Configurations). Default: 0.5.

limit_resources

If true, SMAC will use pynisher to limit time and memory for the target algorithm; if false, no such limits are enforced and all available resources can be used (use with caution!). Applicable only to func TAEs. Default: True.

maxR

Maximum number of calls per configuration. Default: 2000.

minR

Minimum number of calls per configuration. Default: 1.

output_dir

Specifies the output-directory for all emerging files, such as logging and results. Default: smac3-output_ followed by a timestamp, e.g. smac3-output_2020-08-05_14:41:04_348576.

rand_prob

Probability of running a random configuration instead of a configuration optimized on the acquisition function. Default: 0.5.

random_configuration_chooser

Path to a Python module containing a class RandomConfigurationChooserImpl implementing the interface of RandomConfigurationChooser.

rf_do_bootstrapping

Use bootstrapping in random forest. Default: True.

rf_max_depth

Maximum depth of each tree in the random forest. Default: 20.

rf_min_samples_leaf

Minimum required number of samples in each leaf of a tree in the random forest. Default: 3.

rf_min_samples_split

Minimum number of samples to split for building a tree in the random forest. Default: 3.

rf_num_trees

Number of trees in the random forest (> 1). Default: 10.

rf_ratio_features

Ratio of sampled features in each split ([0.,1.]). Default: 0.8333333333333334.

shared_model

Whether to run SMAC in parallel mode.

sls_max_steps

Maximum number of local search steps in one iteration during the optimization of the acquisition function.

sls_n_steps_plateau_walk

Maximum number of steps on plateaus during the optimization of the acquisition function. Default: 10.

transform_y

Transform all observed cost values via log-transformations or inverse scaling. The suffix “s” indicates that SMAC scales the y-values accordingly to apply the transformation. Default: NONE.

use_ta_time

Instead of measuring SMAC’s wallclock time, only consider time reported by the target algorithm (ta).

Scenario

The scenario-object (smac.scenario.scenario.Scenario) is used to configure SMAC and can be constructed either by providing an actual scenario-object (see SVM-example), or by specifying the options in a scenario file (see SPEAR example).

The format of the scenario file is one option per line:

OPTION1 = VALUE1
OPTION2 = VALUE2
...

For boolean options “1” or “true” both evaluate to True. The following assumes that the scenario is created via a scenario-file. If it is generated within custom code, you might not need algo or paramfile.
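
As an illustration of the file format above, the following minimal Python sketch reads OPTION = VALUE lines into a dictionary. It is not SMAC's own parser: the helper names parse_scenario_text and parse_bool, the sample values, the case-insensitive handling of “true”, and the choice to skip blank lines and lines starting with “#” are all assumptions made for this example.

```python
def parse_bool(value):
    """Interpret "1" or "true" as True (case-insensitive here, an assumption)."""
    return value.strip().lower() in ("1", "true")


def parse_scenario_text(text):
    """Turn 'OPTION = VALUE' lines into a dict of strings (hypothetical helper)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        name, _, value = line.partition("=")
        options[name.strip()] = value.strip()
    return options


# Hypothetical scenario content; the values are made up for illustration.
example = """
run_obj = runtime
cutoff = 30
deterministic = 1
"""

scenario = parse_scenario_text(example)
print(scenario["run_obj"])                    # runtime
print(parse_bool(scenario["deterministic"]))  # True
```

All values stay strings in this sketch; converting them to numbers or booleans is left to the caller.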

Scenario Options:

algo_runs_timelimit

Maximum amount of CPU-time used for optimization. Default: inf.

always_race_default

Race new incumbents always against default configuration.

cost_for_crash

Defines the cost-value for crashed runs on scenarios with quality as run-obj. Default: 2147483647.0.

cutoff

Maximum runtime, after which the target algorithm is cancelled. Required if run_obj is runtime.

deterministic

If true, SMAC assumes that the target function or algorithm is deterministic (the same static seed of 0 is always passed to the function/algorithm). If false, different random seeds are passed to the target function/algorithm.

execdir

Specifies the path to the execution-directory. Default: “.” (the current directory).

feature_fn

Specifies the file with the instance-features.

initial_incumbent

DEFAULT is the default from the PCS. Default: DEFAULT.

memory_limit

Maximum memory (in MB) the target algorithm can occupy before being cancelled.

overall_obj

PARX, where X is an integer defining the penalty imposed on timeouts (i.e. runtimes that exceed the cutoff-time). Default: par10.

pcs_fn

Specifies the path to the PCS-file.

run_obj

Defines what metric to optimize. When optimizing runtime, cutoff_time is required as well.

ta

Specifies the target algorithm call that SMAC will optimize. Interpreted as a bash-command.

ta_run_limit

Maximum number of algorithm-calls during optimization. Default: inf.

test_inst_fn

Specifies the file with the test-instances.

train_inst_fn

Specifies the file with the training-instances.

wallclock_limit

Maximum amount of wallclock-time used for optimization. Default: inf.
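
The overall_obj option listed above refers to a penalized average runtime (PARX): a run that exceeds the cutoff is counted as X times the cutoff. The sketch below only shows that arithmetic. The function name par_score and the sample runtimes are made up for illustration, and treating a runtime equal to the cutoff as a timeout is an assumption; this is not SMAC's implementation.

```python
def par_score(runtimes, cutoff, factor=10):
    """Penalized average runtime: timeouts count as factor * cutoff."""
    penalized = [
        # Assumption: a run counts as a timeout once it reaches the cutoff.
        factor * cutoff if t >= cutoff else t
        for t in runtimes
    ]
    return sum(penalized) / len(penalized)


# Three hypothetical runs with a cutoff of 5 seconds; the last one timed out.
print(par_score([1.0, 3.0, 5.0], cutoff=5.0))  # (1 + 3 + 50) / 3 = 18.0
```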

These options are also available as command line switches: prepend two “-” and replace each “_” by “-”, e.g. “wallclock_limit” becomes “--wallclock-limit”. Options given on the command line override the values given in the scenario file.
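
The renaming rule above can be written as a one-line helper. The function name to_cli_switch is hypothetical and exists only for this illustration:

```python
def to_cli_switch(option_name):
    """Convert a scenario option name into its command line switch."""
    return "--" + option_name.replace("_", "-")


print(to_cli_switch("wallclock_limit"))  # --wallclock-limit
print(to_cli_switch("ta_run_limit"))     # --ta-run-limit
```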

Parameter Configuration Space (PCS)

The Parameter Configuration Space (PCS) defines the legal ranges of the parameters to be optimized and their default values. In the examples-folder you can find several examples for PCS-files.

To define parameters and their ranges, the following format is supported:

parameter_name categorical {value_1, ..., value_N} [default value]
parameter_name ordinal {value_1, ..., value_N} [default value]
parameter_name integer [min_value, max_value] [default value]
parameter_name integer [min_value, max_value] [default value] log
parameter_name real [min_value, max_value] [default value]
parameter_name real [min_value, max_value] [default value] log

The trailing “log” indicates that SMAC should sample from the defined ranges on a log scale.
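
To make the numeric formats and the effect of “log” concrete, here is a small Python sketch that parses integer/real lines and draws values from them. It is not SMAC's PCS reader or sampler. The helpers parse_numeric_line and sample, the regular expression, and the parameter names learning_rate and max_depth are assumptions for this example, and rounding sampled integers is a simplification.

```python
import math
import random
import re

NUMERIC = re.compile(
    r"^(\w+)\s+(integer|real)\s+"
    r"\[\s*([^,\]]+?)\s*,\s*([^\]]+?)\s*\]\s+"  # [min_value, max_value]
    r"\[\s*([^\]]+?)\s*\]\s*(log)?\s*$"         # [default value], optional log
)


def parse_numeric_line(line):
    """Parse one integer/real parameter line into a dict (hypothetical helper)."""
    match = NUMERIC.match(line.strip())
    if match is None:
        raise ValueError(f"not a numeric parameter line: {line!r}")
    name, kind, low, high, default, log = match.groups()
    cast = int if kind == "integer" else float
    return {
        "name": name,
        "type": kind,
        "min": cast(low),
        "max": cast(high),
        "default": cast(default),
        "log": log is not None,
    }


def sample(param, rng):
    """Draw one value, uniformly in log space if the parameter is marked log."""
    if param["log"]:
        # Uniform in [log(min), log(max)], then mapped back with exp.
        low, high = math.log(param["min"]), math.log(param["max"])
        value = math.exp(rng.uniform(low, high))
    else:
        value = rng.uniform(param["min"], param["max"])
    if param["type"] == "integer":
        # Round and clip so the result stays inside the legal range.
        value = min(max(int(round(value)), param["min"]), param["max"])
    return value


# Hypothetical parameters, written in the format shown above.
lr = parse_numeric_line("learning_rate real [0.0001, 1.0] [0.01] log")
depth = parse_numeric_line("max_depth integer [1, 20] [5]")

rng = random.Random(0)
print(lr["default"], lr["log"])  # 0.01 True
print(sample(lr, rng), sample(depth, rng))
```

With the log flag set, roughly as many samples fall between 0.0001 and 0.001 as between 0.1 and 1.0, which is usually the desired behaviour for parameters spanning several orders of magnitude.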

Furthermore, conditional dependencies can be expressed. That is useful if a parameter activates sub-parameters. For example, a heuristic's parameters are active only if that heuristic is used; otherwise SMAC can ignore them.

# Conditionals:
child_name | condition [&&,||] condition ...

# Condition Operators:
# parent_x [<, >] parent_x_value (if parameter type is ordinal, integer or real)
# parent_x [==,!=] parent_x_value (if parameter type is categorical, ordinal or integer)
# parent_x in {parent_x_value1, parent_x_value2,...}
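
The following sketch shows how such a conditional could be evaluated against a concrete configuration to decide whether a child parameter is active. It is illustrative only and not SMAC's implementation. The helpers holds and is_active, the parameter names gamma, kernel and C, and the assumption that “&&” binds tighter than “||” are all choices made for this example.

```python
import re

CONDITION = re.compile(r"^\s*(\w+)\s*(==|!=|<|>|\bin\b)\s*(.+?)\s*$")


def holds(condition, config):
    """Evaluate one 'parent OP value' condition against a configuration dict."""
    match = CONDITION.match(condition)
    if match is None:
        raise ValueError(f"cannot parse condition: {condition!r}")
    parent, op, raw = match.groups()
    actual = config[parent]
    if op == "in":
        allowed = {v.strip() for v in raw.strip("{}").split(",")}
        return str(actual) in allowed
    if op == "==":
        return str(actual) == raw
    if op == "!=":
        return str(actual) != raw
    # '<' and '>' are only applied to numeric values in this sketch.
    return actual < float(raw) if op == "<" else actual > float(raw)


def is_active(rule, config):
    """Return True if the child named in 'child | conditions' is active."""
    _child, _, conditions = rule.partition("|")
    # Assumption: '&&' binds tighter than '||', as in most languages.
    return any(
        all(holds(c, config) for c in clause.split("&&"))
        for clause in conditions.split("||")
    )


# Hypothetical rule: gamma is only active for two kernels and a large C.
rule = "gamma | kernel in {rbf, poly} && C > 0.5"
print(is_active(rule, {"kernel": "rbf", "C": 1.0}))     # True
print(is_active(rule, {"kernel": "linear", "C": 1.0}))  # False
```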

Forbidden constraints specify combinations of parameter values that are not allowed. Please note that SMAC uses a simple rejection sampling strategy. Therefore, SMAC cannot efficiently handle highly constrained spaces.

# Forbiddens:
{parameter_name_1=value_1, ..., parameter_name_N=value_N}
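
A rough sketch of rejection sampling with forbidden clauses is given below. It only illustrates the general idea and is not SMAC's sampler. The parameters solver and penalty, their values, the helper names violates and sample_valid, and the max_tries limit are assumptions for this example.

```python
import random

# Hypothetical categorical parameters and one forbidden combination.
space = {
    "solver": ["lbfgs", "sgd"],
    "penalty": ["l1", "l2"],
}
forbidden = [{"solver": "lbfgs", "penalty": "l1"}]


def violates(config, clause):
    """A clause is violated only if every listed assignment matches."""
    return all(config[name] == value for name, value in clause.items())


def sample_valid(rng, max_tries=1000):
    """Rejection sampling: draw at random, discard forbidden configurations."""
    for _ in range(max_tries):
        config = {name: rng.choice(values) for name, values in space.items()}
        if not any(violates(config, clause) for clause in forbidden):
            return config
    # In a highly constrained space most draws are rejected and this is reached.
    raise RuntimeError("no valid configuration found")


rng = random.Random(1)
samples = [sample_valid(rng) for _ in range(5)]
print(samples)
```

The more of the space the forbidden clauses exclude, the more draws are thrown away before a valid configuration turns up, which is why heavily constrained spaces are inefficient under this strategy.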

Instances and Features

To specify instances and features, provide text-files in the following formats and give the paths to these files in the scenario.

Instance-files are text-files with one instance per line. If you want to use training- and test-sets, separate files are expected.

Feature-files are files following the comma-separated-value-format, as can also be seen in the SPEAR-example:

instance              name of feature 1     name of feature 2
name of instance 1    value of feature 1    value of feature 2
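
As a hedged illustration of this layout, the sketch below reads a small feature file with Python's csv module. The instance names, feature names and values are invented, read_features is a hypothetical helper, and converting every feature value to float is an assumption; SMAC's own loader may behave differently.

```python
import csv
import io

# Hypothetical feature file: header row, then one row per instance.
feature_text = """instance,n_vars,n_clauses
inst_a.cnf,100,420
inst_b.cnf,250,1065
"""


def read_features(handle):
    """Return (feature_names, {instance: [feature values as floats]})."""
    reader = csv.reader(handle)
    header = next(reader)
    feature_names = header[1:]  # the first column holds the instance name
    features = {row[0]: [float(v) for v in row[1:]] for row in reader if row}
    return feature_names, features


names, features = read_features(io.StringIO(feature_text))
print(names)                   # ['n_vars', 'n_clauses']
print(features["inst_a.cnf"])  # [100.0, 420.0]
```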