smac.runhistory.runhistory module

class smac.runhistory.runhistory.DataOrigin(value)[source]

Bases: enum.Enum

Definition of how data in the runhistory is used.

  • INTERNAL: internal data which was gathered during the current optimization run. It will be saved to disk, used for building EPMs and during intensify.

  • EXTERNAL_SAME_INSTANCES: external data which was gathered by running another program on the same instances as the current optimization run (for example pSMAC). It will not be saved to disk, but is used both for building EPMs and during intensify.

  • EXTERNAL_DIFFERENT_INSTANCES: external data which was gathered on a different instance set than the one currently used, but which can still provide useful information because the instances share the same instance features. It will not be saved to disk and is only used for building EPMs.

EXTERNAL_DIFFERENT_INSTANCES = 3
EXTERNAL_SAME_INSTANCES = 2
INTERNAL = 1
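
The three origins differ in whether the data is saved to disk and whether it is used during intensify. The following minimal sketch restates that description as code. The enum is a stand-in that only mirrors the member names and values listed above, and the helpers is_saved_to_disk and is_used_for_intensify are hypothetical names that are not part of SMAC.

```python
from enum import Enum


class DataOrigin(Enum):
    """Stand-in mirroring the three members documented above."""
    INTERNAL = 1
    EXTERNAL_SAME_INSTANCES = 2
    EXTERNAL_DIFFERENT_INSTANCES = 3


def is_saved_to_disk(origin: DataOrigin) -> bool:
    # Hypothetical helper: only internal data is described as saved to disk.
    return origin is DataOrigin.INTERNAL


def is_used_for_intensify(origin: DataOrigin) -> bool:
    # Hypothetical helper: data from a different instance set is described
    # as being used for EPM building only.
    return origin is not DataOrigin.EXTERNAL_DIFFERENT_INSTANCES


for origin in DataOrigin:
    print(origin.name, is_saved_to_disk(origin), is_used_for_intensify(origin))
```
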
class smac.runhistory.runhistory.EnumEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: json.encoder.JSONEncoder

Custom encoder for enum serialization (implemented for StatusType from tae). Using this encoder implies using the object_hook defined in StatusType to deserialize from JSON.

Constructor for JSONEncoder, with sensible defaults.

If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.

If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.

If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.

If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.

If specified, separators should be an (item_separator, key_separator) tuple. The default is (‘, ‘, ‘: ‘) if indent is None and (‘,’, ‘: ‘) otherwise. To get the most compact JSON representation, you should specify (‘,’, ‘:’) to eliminate whitespace.

If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.

default(obj: object) → Any[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
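
An enum encoder and its matching object_hook could be paired roughly as follows. This is only a sketch of the general technique using the standard json module. The Status enum, the SketchEnumEncoder class, the enum_hook function and the "__enum__" tag are all hypothetical stand-ins and do not reproduce the format SMAC actually writes.

```python
import json
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for StatusType."""
    SUCCESS = 1
    TIMEOUT = 2


class SketchEnumEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Status):
            # Tag the member name so that the decoder can recognise it.
            return {"__enum__": obj.name}
        # Let the base class raise the TypeError for unsupported objects.
        return super().default(obj)


def enum_hook(obj: dict):
    # Reverse of SketchEnumEncoder.default: tagged dicts become members again.
    if "__enum__" in obj:
        return Status[obj["__enum__"]]
    return obj


encoded = json.dumps({"status": Status.TIMEOUT}, cls=SketchEnumEncoder)
decoded = json.loads(encoded, object_hook=enum_hook)
print(encoded)
print(decoded)
```

Without the hook, json.loads would return the tagged dictionary rather than the enum member, which is why the encoder and the hook have to be used together.
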
class smac.runhistory.runhistory.InstSeedBudgetKey(instance, seed, budget)

Bases: tuple

Create new instance of InstSeedBudgetKey(instance, seed, budget)

_asdict()

Return a new OrderedDict which maps field names to their values.

_field_defaults = {}
_fields = ('instance', 'seed', 'budget')
_fields_defaults = {}
classmethod _make(iterable)

Make a new InstSeedBudgetKey object from a sequence or iterable

_replace(**kwds)

Return a new InstSeedBudgetKey object replacing specified fields with new values

property budget

Alias for field number 2

property instance

Alias for field number 0

property seed

Alias for field number 1
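
The helper methods listed above are the standard ones that collections.namedtuple provides. A short sketch, assuming a locally defined tuple with the same field names (the values are made up for illustration):

```python
from collections import namedtuple

# Local stand-in with the same fields as the documented class.
InstSeedBudgetKey = namedtuple("InstSeedBudgetKey", ["instance", "seed", "budget"])

key = InstSeedBudgetKey(instance="inst-a", seed=3, budget=0.0)

as_dict = key._asdict()                               # field name -> value
bumped = key._replace(budget=10.0)                    # new tuple, original unchanged
rebuilt = InstSeedBudgetKey._make(["inst-b", 7, 5.0])  # build from any iterable

print(key.budget, key[2])  # the property is an alias for field number 2
```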

class smac.runhistory.runhistory.InstSeedKey(instance, seed)

Bases: tuple

Create new instance of InstSeedKey(instance, seed)

_asdict()

Return a new OrderedDict which maps field names to their values.

_field_defaults = {}
_fields = ('instance', 'seed')
_fields_defaults = {}
classmethod _make(iterable)

Make a new InstSeedKey object from a sequence or iterable

_replace(**kwds)

Return a new InstSeedKey object replacing specified fields with new values

property instance

Alias for field number 0

property seed

Alias for field number 1

class smac.runhistory.runhistory.RunHistory(overwrite_existing_runs: bool = False)[source]

Bases: object

Container for target algorithm run information.

Most importantly, the runhistory contains an efficient mapping from each evaluated configuration to the empirical cost observed on either the full instance set or a subset. The cost is the average over all observed costs for one configuration:

  • If using budgets for a single instance, only the cost on the highest observed budget is returned.

  • If using instances as the budget, the average cost over all evaluated instances is returned.

  • Theoretically, the runhistory object can handle instances and budgets at the same time. This is neither used nor tested.

  • Capped runs are not included in this cost.
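
The first two rules can be illustrated with plain dictionaries. This is a sketch of the described behaviour only, not the RunHistory implementation. The function names and the input layout (budget -> cost and instance -> cost) are assumptions chosen for the example.

```python
from statistics import mean
from typing import Dict


def cost_with_budgets(cost_by_budget: Dict[float, float]) -> float:
    # Budgets on a single instance: report the cost seen on the highest budget.
    highest_budget = max(cost_by_budget)
    return cost_by_budget[highest_budget]


def cost_with_instances(cost_by_instance: Dict[str, float]) -> float:
    # Instances as the budget: report the mean over all evaluated instances.
    return mean(list(cost_by_instance.values()))


print(cost_with_budgets({1.0: 0.9, 3.0: 0.5, 9.0: 0.2}))
print(cost_with_instances({"inst-a": 1.0, "inst-b": 3.0}))
```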

Note

Guaranteed to be picklable.

data

TODO

Type

collections.OrderedDict()

config_ids

Maps config -> id

Type

dict

ids_config

Maps id -> config

Type

dict

num_runs_per_config

Maps config_id -> number of runs

Type

dict

Parameters

overwrite_existing_runs (bool (default=False)) – If set to True and a run of a configuration on an instance-budget-seed combination already exists, it is overwritten.

Constructor

Parameters

overwrite_existing_runs (bool) – allows old results to be overwritten if an algorithm-instance-seed combination was measured multiple times

_add(k: smac.runhistory.runhistory.RunKey, v: smac.runhistory.runhistory.RunValue, status: smac.tae.StatusType, origin: smac.runhistory.runhistory.DataOrigin) → None[source]

Actual function to add new entry to data structures

TODO

_cost(config: ConfigSpace.configuration_space.Configuration, instance_seed_budget_keys: Optional[Iterable[smac.runhistory.runhistory.InstSeedBudgetKey]] = None) → List[float][source]

Return array of all costs for the given config for further calculations.

Parameters
  • config (Configuration) – Configuration to calculate objective for

  • instance_seed_budget_keys (list, optional (default=None)) – List of tuples of instance-seeds-budget keys. If None, the run_history is queried for all runs of the given configuration.

Returns

Costs – Array of all costs

Return type

list

add(config: ConfigSpace.configuration_space.Configuration, cost: float, time: float, status: smac.tae.StatusType, instance_id: Optional[str] = None, seed: Optional[int] = None, budget: float = 0.0, starttime: float = 0.0, endtime: float = 0.0, additional_info: Optional[Dict] = None, origin: smac.runhistory.runhistory.DataOrigin = <DataOrigin.INTERNAL: 1>, force_update: bool = False) → None[source]

Adds the data of a new target algorithm (TA) run; existing data is updated if the same key values (config, instance_id, seed, budget) are used.

Parameters
  • config (dict (or other type -- depending on config space module)) – Parameter configuration

  • cost (float) – Cost of TA run (will be minimized)

  • time (float) – Runtime of TA run

  • status (StatusType) – Status in {SUCCESS, TIMEOUT, CRASHED, ABORT, MEMOUT}

  • instance_id (str) – String representing an instance (default: None)

  • seed (int) – Random seed used by TA (default: None)

  • budget (float) – budget (cutoff) used in intensifier to limit TA (default: 0)

  • starttime (float) – starting timestamp of TA evaluation

  • endtime (float) – ending timestamp of TA evaluation

  • additional_info (dict) – Additional run information (could include further information returned by the TA or fields such as start time and host_id)

  • origin (DataOrigin) – Defines how data will be used.

  • force_update (bool (default: False)) – Forces the addition of a config to the history
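
How overwrite_existing_runs and force_update interact with an existing key can be sketched with a small dictionary-backed class. This is a simplified illustration under stated assumptions: TinyHistory is a hypothetical class, only the cost is stored, and any special handling the real add() may apply (for example to capped runs) is left out.

```python
from collections import OrderedDict, namedtuple

# Local stand-in with the same fields as the documented RunKey.
RunKey = namedtuple("RunKey", ["config_id", "instance_id", "seed", "budget"])


class TinyHistory:
    """Hypothetical, heavily simplified container; not the SMAC class."""

    def __init__(self, overwrite_existing_runs: bool = False):
        self.overwrite_existing_runs = overwrite_existing_runs
        self.data = OrderedDict()

    def add(self, key: RunKey, cost: float, force_update: bool = False) -> bool:
        # Store the run if the key is new, or if overwriting is allowed.
        if key not in self.data or self.overwrite_existing_runs or force_update:
            self.data[key] = cost
            return True
        return False


history = TinyHistory()
key = RunKey(config_id=1, instance_id="inst-a", seed=0, budget=0.0)
print(history.add(key, 1.0))                     # new key: stored
print(history.add(key, 2.0))                     # duplicate: ignored
print(history.add(key, 3.0, force_update=True))  # forced: overwritten
```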

average_cost(config: ConfigSpace.configuration_space.Configuration, instance_seed_budget_keys: Optional[Iterable[smac.runhistory.runhistory.InstSeedBudgetKey]] = None) → float[source]

Return the average cost of a configuration.

This is the mean of costs of all instance-seed pairs.

Parameters
  • config (Configuration) – Configuration to calculate objective for

  • instance_seed_budget_keys (list, optional (default=None)) – List of tuples of instance-seeds-budget keys. If None, the run_history is queried for all runs of the given configuration.

Returns

Cost – Average cost

Return type

float

compute_all_costs(instances: Optional[List[str]] = None) → None[source]

Computes the cost of all configurations from scratch and overwrites self.cost_per_config and self.runs_per_config accordingly.

Note

This method is only used for merge_foreign_data and should be removed.

Parameters

instances (typing.List[str]) – list of instances; if given, the cost is only computed with respect to this instance set

empty() → bool[source]

Check whether or not the RunHistory is empty.

Returns

emptiness – True if no runs have been added to the RunHistory, False otherwise

Return type

bool

get_all_configs() → List[ConfigSpace.configuration_space.Configuration][source]

Return all configurations in this RunHistory object

Returns

parameter configurations

Return type

list

get_all_configs_per_budget(budget_subset: Optional[List] = None) → List[ConfigSpace.configuration_space.Configuration][source]

Return all configs in this RunHistory object that have been run on one of these budgets

Parameters

budget_subset (list) –

Returns

parameter configurations

Return type

list

get_cost(config: ConfigSpace.configuration_space.Configuration) → float[source]

Returns empirical cost for a configuration.

See the class docstring for how the costs are computed. The costs are not re-computed, but are read from cache.

Parameters

config (Configuration) –

Returns

cost – Computed cost for configuration

Return type

float

get_instance_costs_for_config(config: ConfigSpace.configuration_space.Configuration) → Dict[str, List[float]][source]

Returns the average cost per instance (across seeds) for a configuration

If the runhistory contains budgets, only the highest budget for a configuration is returned.

Note

This is used by the pSMAC facade to determine the incumbent after the evaluation.

Parameters

config (Configuration from ConfigSpace) – Parameter configuration

Returns

cost_per_inst

Return type

dict<instance name<str>, cost<float>>
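
Averaging per instance across seeds can be sketched as a group-by followed by a mean. The function name instance_costs and the (instance, seed, cost) input layout are assumptions for the example. The sketch does not cover the budget handling mentioned above.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def instance_costs(runs: Iterable[Tuple[str, int, float]]) -> Dict[str, float]:
    # Collect the costs of all seeds under their instance name.
    grouped = defaultdict(list)
    for instance, _seed, cost in runs:
        grouped[instance].append(cost)
    # Reduce each group to its mean cost.
    return {instance: mean(costs) for instance, costs in grouped.items()}


runs = [("inst-a", 1, 1.0), ("inst-a", 2, 3.0), ("inst-b", 1, 5.0)]
print(instance_costs(runs))
```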

get_min_cost(config: ConfigSpace.configuration_space.Configuration) → float[source]

Returns the lowest empirical cost for a configuration, across all runs (budgets)

See the class docstring for how the costs are computed. The costs are not re-computed, but are read from cache.

Parameters

config (Configuration) –

Returns

min_cost – Computed cost for configuration

Return type

float

get_runs_for_config(config: ConfigSpace.configuration_space.Configuration, only_max_observed_budget: bool) → List[smac.runhistory.runhistory.InstSeedBudgetKey][source]

Return all runs (instance-seed-budget keys) for a configuration.

Note

This method ignores capped runs.

Parameters
  • config (Configuration from ConfigSpace) – Parameter configuration

  • only_max_observed_budget (bool) – Select only the maximally observed budget run for this configuration

Returns

instance_seed_budget_pairs

Return type

list<tuples of instance, seed, budget>
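
One plausible reading of only_max_observed_budget is that, for every instance-seed pair, only the run with the highest budget is kept. The sketch below implements that reading on a plain list. The function select_runs is hypothetical and the selection rule is an assumption rather than a statement about the SMAC source.

```python
from collections import namedtuple
from typing import List

# Local stand-in with the same fields as the documented class.
InstSeedBudgetKey = namedtuple("InstSeedBudgetKey", ["instance", "seed", "budget"])


def select_runs(runs: List[InstSeedBudgetKey],
                only_max_observed_budget: bool) -> List[InstSeedBudgetKey]:
    if not only_max_observed_budget:
        return list(runs)
    best = {}
    for run in runs:
        pair = (run.instance, run.seed)
        # Keep the run with the largest budget seen for this pair.
        if pair not in best or run.budget > best[pair].budget:
            best[pair] = run
    return list(best.values())


runs = [
    InstSeedBudgetKey("inst-a", 1, 1.0),
    InstSeedBudgetKey("inst-a", 1, 3.0),
    InstSeedBudgetKey("inst-b", 1, 1.0),
]
print(select_runs(runs, only_max_observed_budget=True))
```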

incremental_update_cost(config: ConfigSpace.configuration_space.Configuration, cost: float) → None[source]

Incrementally updates the performance of a configuration by using a moving average.

Parameters
  • config (Configuration) – configuration to update cost based on all runs in runhistory

  • cost (float) – cost of new run of config
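
A moving average can be updated from the previous mean and the number of runs seen so far, without revisiting earlier costs. A minimal sketch of that arithmetic follows. The function name incremental_mean is hypothetical, and the bookkeeping of run counts is assumed to be done by the caller.

```python
def incremental_mean(old_mean: float, n_runs: int, new_cost: float) -> float:
    # Undo the previous division, add the new cost, divide by the new count.
    return (old_mean * n_runs + new_cost) / (n_runs + 1)


running_mean, n_runs = 0.0, 0
for cost in [2.0, 4.0, 9.0]:
    running_mean = incremental_mean(running_mean, n_runs, cost)
    n_runs += 1

print(running_mean, n_runs)
```

The result equals the plain mean of all costs seen so far, up to floating-point rounding.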

load_json(fn: str, cs: ConfigSpace.configuration_space.ConfigurationSpace) → None[source]

Load a runhistory in JSON representation from disk.

Overwrites current runhistory!

Parameters
  • fn (str) – file name to load from

  • cs (ConfigSpace) – instance of configuration space

min_cost(config: ConfigSpace.configuration_space.Configuration, instance_seed_budget_keys: Optional[Iterable[smac.runhistory.runhistory.InstSeedBudgetKey]] = None) → float[source]

Return the minimum cost of a configuration

This is the minimum cost of all instance-seed pairs.

Parameters
  • config (Configuration) – Configuration to calculate objective for

  • instance_seed_budget_keys (list, optional (default=None)) – List of tuples of instance-seeds-budget keys. If None, the run_history is queried for all runs of the given configuration.

Returns

min_cost – minimum cost of config

Return type

float

save_json(fn: str = 'runhistory.json', save_external: bool = False) → None[source]

Saves the runhistory to disk.

Parameters
  • fn (str) – file name

  • save_external (bool) – Whether to save external data in the runhistory file.
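
The effect of save_external can be sketched as a filter applied before writing. Everything in this example is an assumption made for illustration: the save_runs function, the stand-in DataOrigin enum and especially the JSON layout, which is not the layout of a real runhistory.json file.

```python
import json
import os
import tempfile
from enum import Enum


class DataOrigin(Enum):
    """Stand-in mirroring the documented members."""
    INTERNAL = 1
    EXTERNAL_SAME_INSTANCES = 2
    EXTERNAL_DIFFERENT_INSTANCES = 3


def save_runs(runs, fn: str, save_external: bool = False) -> int:
    # Keep internal runs always; keep external runs only on request.
    kept = [
        {"cost": cost, "origin": origin.name}
        for cost, origin in runs
        if save_external or origin is DataOrigin.INTERNAL
    ]
    with open(fn, "w") as fh:
        json.dump({"data": kept}, fh)
    return len(kept)


runs = [(0.3, DataOrigin.INTERNAL), (0.5, DataOrigin.EXTERNAL_SAME_INSTANCES)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "runhistory.json")
    n_default = save_runs(runs, path)
    n_all = save_runs(runs, path, save_external=True)
    with open(path) as fh:
        reloaded = json.load(fh)

print(n_default, n_all)
```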

sum_cost(config: ConfigSpace.configuration_space.Configuration, instance_seed_budget_keys: Optional[Iterable[smac.runhistory.runhistory.InstSeedBudgetKey]] = None) → float[source]

Return the sum of costs of a configuration.

This is the sum of costs of all instance-seed pairs.

Parameters
  • config (Configuration) – Configuration to calculate objective for

  • instance_seed_budget_keys (list, optional (default=None)) – List of tuples of instance-seeds-budget keys. If None, the run_history is queried for all runs of the given configuration.

Returns

sum_cost – Sum of costs of config

Return type

float
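
average_cost, sum_cost and min_cost are documented as reductions over the same list of per-run costs. The relationship can be sketched on a plain list. The list stands in for whatever _cost() would return for one configuration, and the values are made up.

```python
from typing import List

# Hypothetical per-run costs of a single configuration.
costs: List[float] = [1.0, 2.0, 6.0]

total = sum(costs)             # what sum_cost is described as returning
average = total / len(costs)   # what average_cost is described as returning
lowest = min(costs)            # what min_cost is described as returning

print(total, average, lowest)
```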

update(runhistory: smac.runhistory.runhistory.RunHistory, origin: smac.runhistory.runhistory.DataOrigin = <DataOrigin.EXTERNAL_SAME_INSTANCES: 2>) → None[source]

Update the current runhistory by adding new runs from a RunHistory.

Parameters
  • runhistory (RunHistory) – Runhistory with additional data to be added to self

  • origin (DataOrigin) – If set to INTERNAL or EXTERNAL_SAME_INSTANCES, the data will be added to the internal data structure self._configid_to_inst_seed_budget and be available through get_runs_for_config().

update_cost(config: ConfigSpace.configuration_space.Configuration) → None[source]

Stores the performance of a configuration across the instances in self.cost_per_config and also updates self.runs_per_config.

Note

This method ignores capped runs.

Parameters

config (Configuration) – configuration to update cost based on all runs in runhistory

update_from_json(fn: str, cs: ConfigSpace.configuration_space.ConfigurationSpace, origin: smac.runhistory.runhistory.DataOrigin = <DataOrigin.EXTERNAL_SAME_INSTANCES: 2>) → None[source]

Update the current runhistory by adding new runs from a json file.

Parameters
  • fn (str) – File name to load from.

  • cs (ConfigSpace) – Instance of configuration space.

  • origin (DataOrigin) – What to store as data origin.

class smac.runhistory.runhistory.RunInfo(config: ConfigSpace.configuration_space.Configuration, instance: Optional[str], instance_specific: str, seed: int, cutoff: Optional[float], capped: bool, budget: float = 0.0, source_id: int = 0)[source]

Bases: smac.runhistory.runhistory.RunInfo

Create new instance of RunInfo(config, instance, instance_specific, seed, cutoff, capped, budget, source_id)

class smac.runhistory.runhistory.RunKey(config_id: int, instance_id: Optional[str], seed: Optional[int], budget: float = 0.0)[source]

Bases: smac.runhistory.runhistory.RunKey

Create new instance of RunKey(config_id, instance_id, seed, budget)

class smac.runhistory.runhistory.RunValue(cost, time, status, starttime, endtime, additional_info)

Bases: tuple

Create new instance of RunValue(cost, time, status, starttime, endtime, additional_info)

_asdict()

Return a new OrderedDict which maps field names to their values.

_field_defaults = {}
_fields = ('cost', 'time', 'status', 'starttime', 'endtime', 'additional_info')
_fields_defaults = {}
classmethod _make(iterable)

Make a new RunValue object from a sequence or iterable

_replace(**kwds)

Return a new RunValue object replacing specified fields with new values

property additional_info

Alias for field number 5

property cost

Alias for field number 0

property endtime

Alias for field number 4

property starttime

Alias for field number 3

property status

Alias for field number 2

property time

Alias for field number 1