smac.runhistory.runhistory module

class smac.runhistory.runhistory.DataOrigin[source]

Bases: enum.Enum

Definition of how data in the runhistory is used.

  • INTERNAL: internal data gathered during the current optimization run. It is saved to disk and used both for building EPMs and during intensification.
  • EXTERNAL_SAME_INSTANCES: external data gathered by another program (for example pSMAC) on the same instances as the current optimization run. It is not saved to disk, but is used both for EPM building and during intensification.
  • EXTERNAL_DIFFERENT_INSTANCES: external data gathered on a different instance set than the one currently used, which can still provide useful information because the instances have the same instance features. It is not saved to disk and is used only for EPM building.
EXTERNAL_DIFFERENT_INSTANCES = 3
EXTERNAL_SAME_INSTANCES = 2
INTERNAL = 1
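
The three rules above can be sketched in a few lines. The enum below is a stand-in that mirrors the member values listed here, and the helper functions (saved_to_disk, used_for_epm, used_for_intensify) are hypothetical names introduced only for illustration; they are not part of SMAC.

```python
from enum import Enum


class DataOrigin(Enum):
    """Stand-in mirroring the member values documented above."""
    INTERNAL = 1
    EXTERNAL_SAME_INSTANCES = 2
    EXTERNAL_DIFFERENT_INSTANCES = 3


# Hypothetical helpers (not SMAC API) encoding the documented usage rules.
def saved_to_disk(origin: DataOrigin) -> bool:
    # Only internal data is written to disk.
    return origin is DataOrigin.INTERNAL


def used_for_epm(origin: DataOrigin) -> bool:
    # All three origins are described as usable for EPM building.
    return isinstance(origin, DataOrigin)


def used_for_intensify(origin: DataOrigin) -> bool:
    # Data from a different instance set is excluded from intensification.
    return origin in (DataOrigin.INTERNAL, DataOrigin.EXTERNAL_SAME_INSTANCES)
```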
class smac.runhistory.runhistory.EnumEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: json.encoder.JSONEncoder

Custom encoder for enum serialization (implemented for StatusType from tae/execute_ta_run). Using this encoder implies using the object_hook defined in StatusType to deserialize from JSON.

Constructor for JSONEncoder, with sensible defaults.

If skipkeys is false, then it is a TypeError to attempt encoding of keys that are not str, int, float or None. If skipkeys is True, such items are simply skipped.

If ensure_ascii is true, the output is guaranteed to be str objects with all incoming non-ASCII characters escaped. If ensure_ascii is false, the output can contain non-ASCII characters.

If check_circular is true, then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an OverflowError). Otherwise, no such check takes place.

If allow_nan is true, then NaN, Infinity, and -Infinity will be encoded as such. This behavior is not JSON specification compliant, but is consistent with most JavaScript based encoders and decoders. Otherwise, it will be a ValueError to encode such floats.

If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis.

If indent is a non-negative integer, then JSON array elements and object members will be pretty-printed with that indent level. An indent level of 0 will only insert newlines. None is the most compact representation.

If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace.

If specified, default is a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError.

default(obj)[source]
encode(o)

Return a JSON string representation of a Python data structure.

>>> from json.encoder import JSONEncoder
>>> JSONEncoder().encode({"foo": ["bar", "baz"]})
'{"foo": ["bar", "baz"]}'
item_separator = ', '
iterencode(o, _one_shot=False)

Encode the given object and yield each string representation as available.

For example:

for chunk in JSONEncoder().iterencode(bigobject):
    mysocket.write(chunk)
key_separator = ': '
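
The pairing of a custom default() with an object_hook, as described for EnumEncoder, might look roughly like the sketch below. Status, SketchEnumEncoder, enum_hook and the "__enum__" tag are all hypothetical names chosen for this illustration; the actual on-disk representation used by SMAC is not shown in this reference and may differ.

```python
import json
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for StatusType."""
    SUCCESS = 1
    TIMEOUT = 2


class SketchEnumEncoder(json.JSONEncoder):
    def default(self, obj):
        # default() is only called for objects json cannot serialize itself.
        if isinstance(obj, Status):
            return {"__enum__": obj.name}  # tag format is an assumption
        return super().default(obj)


def enum_hook(obj: dict):
    # object_hook: turn the tagged dict back into the enum member.
    if "__enum__" in obj:
        return Status[obj["__enum__"]]
    return obj


text = json.dumps({"status": Status.SUCCESS}, cls=SketchEnumEncoder)
restored = json.loads(text, object_hook=enum_hook)
```

Without a matching object_hook on the loading side, the enum would come back as a plain dict, which is why the encoder and the hook need to be used together.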
class smac.runhistory.runhistory.InstSeedKey(instance, seed)

Bases: tuple

Create new instance of InstSeedKey(instance, seed)

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

instance

Alias for field number 0

seed

Alias for field number 1
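
The field aliases and the tuple base suggest a namedtuple. A minimal sketch, assuming that is how InstSeedKey is built (the instance names and seeds are made-up values):

```python
from collections import namedtuple

# Assumed definition; field order matches "field number 0" and "1" above.
InstSeedKey = namedtuple("InstSeedKey", ["instance", "seed"])

key = InstSeedKey(instance="inst-01", seed=42)
instance, seed = key  # unpacks like a plain tuple

# Keys compare and hash by value, so duplicates collapse in a set.
pairs = {key, InstSeedKey("inst-01", 42), InstSeedKey("inst-02", 42)}
```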

class smac.runhistory.runhistory.RunHistory(aggregate_func: typing.Callable, overwrite_existing_runs: bool = False)[source]

Bases: object

Container for target algorithm run information.

Note: Guaranteed to be picklable.

data

collections.OrderedDict() – TODO

config_ids

dict – Maps config -> id

ids_config

dict – Maps id -> config

cost_per_config

dict – Maps config_id -> cost

runs_per_config

dict – Maps config_id -> number of runs

aggregate_func
overwrite_existing_runs

Constructor

Parameters:
  • aggregate_func (callable) – function to aggregate perf across instances
  • overwrite_existing_runs (bool) – allows old results to be overwritten if the same algorithm-instance-seed combination was measured multiple times
add(config: ConfigSpace.configuration_space.Configuration, cost: float, time: float, status: smac.tae.execute_ta_run.StatusType, instance_id: str = None, seed: int = None, additional_info: dict = None, origin: smac.runhistory.runhistory.DataOrigin = <DataOrigin.INTERNAL: 1>)[source]

Adds the data of a new target algorithm (TA) run; existing data is updated if the same key values (config, instance_id, seed) are used.

Parameters:
  • config (dict (or other type -- depending on config space module)) – Parameter configuration
  • cost (float) – Cost of TA run (will be minimized)
  • time (float) – Runtime of TA run
  • status (str) – Status in {SUCCESS, TIMEOUT, CRASHED, ABORT, MEMOUT}
  • instance_id (str) – String representing an instance (default: None)
  • seed (int) – Random seed used by TA (default: None)
  • additional_info (dict) – Additional run infos (could include further returned information from TA or fields such as start time and host_id)
  • origin (DataOrigin) – Defines how data will be used.
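
To illustrate how runs keyed by (config, instance_id, seed) can collide, here is a toy stand-in. ToyRunHistory is not SMAC's implementation; in particular, the way overwrite_existing_runs gates the update is an assumption, and a hashable tuple stands in for a Configuration object.

```python
from collections import OrderedDict, namedtuple

RunKey = namedtuple("RunKey", ["config_id", "instance_id", "seed"])
RunValue = namedtuple("RunValue", ["cost", "time", "status", "additional_info"])


class ToyRunHistory:
    """Illustrative stand-in, not SMAC's RunHistory."""

    def __init__(self, overwrite_existing_runs=False):
        self.data = OrderedDict()
        self.config_ids = {}
        self.overwrite_existing_runs = overwrite_existing_runs

    def add(self, config, cost, time, status, instance_id=None, seed=None,
            additional_info=None):
        # Assign a fresh integer id the first time a configuration is seen.
        config_id = self.config_ids.setdefault(config, len(self.config_ids) + 1)
        key = RunKey(config_id, instance_id, seed)
        # Assumption: an existing entry is replaced only if the flag is set.
        if key not in self.data or self.overwrite_existing_runs:
            self.data[key] = RunValue(cost, time, status, additional_info)


cfg = (("alpha", 0.1),)  # hashable placeholder for a Configuration

overwriting = ToyRunHistory(overwrite_existing_runs=True)
overwriting.add(cfg, cost=2.0, time=0.5, status="SUCCESS", instance_id="i1", seed=0)
overwriting.add(cfg, cost=1.0, time=0.4, status="SUCCESS", instance_id="i1", seed=0)

keeping = ToyRunHistory()
keeping.add(cfg, cost=2.0, time=0.5, status="SUCCESS", instance_id="i1", seed=0)
keeping.add(cfg, cost=1.0, time=0.4, status="SUCCESS", instance_id="i1", seed=0)
```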
compute_all_costs(instances: typing.List[str] = None)[source]

Computes the cost of all configurations from scratch and overwrites self.cost_per_config and self.runs_per_config accordingly.

Parameters:instances (typing.List[str]) – list of instances; if given, cost is only computed with respect to this instance set
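
A minimal sketch of recomputing costs from scratch, optionally restricted to an instance subset. The nested runs layout and the use of a plain mean as aggregate are assumptions made for this example; SMAC's aggregate_func may take different arguments.

```python
from statistics import mean

# Hypothetical run data: {config_id: {(instance, seed): cost}}
runs = {
    1: {("i1", 0): 2.0, ("i2", 0): 4.0},
    2: {("i1", 0): 1.0, ("i2", 0): 9.0},
}


def compute_all_costs(runs, aggregate=mean, instances=None):
    """Recompute cost and run count per configuration from scratch."""
    cost_per_config, runs_per_config = {}, {}
    for config_id, by_inst_seed in runs.items():
        # Keep only runs on the requested instances (all runs if None).
        costs = [cost for (inst, _seed), cost in by_inst_seed.items()
                 if instances is None or inst in instances]
        if costs:
            cost_per_config[config_id] = aggregate(costs)
            runs_per_config[config_id] = len(costs)
    return cost_per_config, runs_per_config


all_costs, all_counts = compute_all_costs(runs)
subset_costs, subset_counts = compute_all_costs(runs, instances=["i1"])
```

Restricting the instance set can change which configuration looks best: over both instances configuration 1 has the lower cost, and it is configuration 2 on "i1" alone.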
empty()[source]

Check whether or not the RunHistory is empty.

Returns:emptiness – True if no runs have been added to the RunHistory, False otherwise
Return type:bool
get_all_configs()[source]

Return all configurations in this RunHistory object

Returns:parameter configurations
Return type:list
get_cost(config: ConfigSpace.configuration_space.Configuration)[source]

Returns empirical cost for a configuration; uses self.cost_per_config

Parameters:config (Configuration) –
Returns:cost – Computed cost for configuration
Return type:float
get_runs_for_config(config: ConfigSpace.configuration_space.Configuration)[source]

Return all runs (instance seed pairs) for a configuration.

Parameters:config (Configuration from ConfigSpace) – Parameter configuration
Returns:instance_seed_pairs
Return type:list<tuples of instance, seed>
incremental_update_cost(config: ConfigSpace.configuration_space.Configuration, cost: float)[source]

Incrementally updates the performance of a configuration by using a moving average.

Parameters:
  • config (Configuration) – configuration whose cost is updated
  • cost (float) – cost of new run of config
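
One way to read "moving average" here is as a running mean over all runs of the configuration; that reading is an assumption, and incremental_mean is a hypothetical helper rather than SMAC code. With the previous mean and the number of runs behind it, a new cost can be folded in without revisiting earlier runs:

```python
def incremental_mean(old_cost: float, n_runs: int, new_cost: float) -> float:
    """Mean of n_runs + 1 costs, given the mean of the first n_runs."""
    return (old_cost * n_runs + new_cost) / (n_runs + 1)


costs = [4.0, 2.0, 6.0]
running, n = 0.0, 0
for c in costs:
    running = incremental_mean(running, n, c)
    n += 1
```

After the loop, running equals the plain average of costs, which is what a full recomputation with a mean aggregate would give.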
load_json(fn: str, cs: ConfigSpace.configuration_space.ConfigurationSpace)[source]

Load a runhistory in JSON representation from disk.

Overwrites current runhistory!

Parameters:
  • fn (str) – file name to load from
  • cs (ConfigSpace) – instance of configuration space
save_json(fn: str = 'runhistory.json', save_external: bool = False)[source]

Saves the runhistory to disk.

Parameters:
  • fn (str) – file name
  • save_external (bool) – Whether to save external data in the runhistory file.
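
The effect of save_external can be sketched with a toy save and load pair. The record layout, the "data" key and the string origin labels are assumptions for illustration only and do not describe SMAC's actual file format.

```python
import json
import os
import tempfile

# Hypothetical records: (config_id, instance_id, seed, cost, origin)
records = [
    (1, "i1", 0, 2.0, "INTERNAL"),
    (1, "i2", 0, 4.0, "EXTERNAL_SAME_INSTANCES"),
]


def save_json(records, fn="runhistory.json", save_external=False):
    """Write records to fn, dropping external ones unless save_external."""
    kept = [r for r in records if save_external or r[4] == "INTERNAL"]
    with open(fn, "w") as fh:
        json.dump({"data": kept}, fh)


def load_json(fn):
    """Read records back; JSON arrays are converted to tuples again."""
    with open(fn) as fh:
        return [tuple(r) for r in json.load(fh)["data"]]


path = os.path.join(tempfile.mkdtemp(), "runhistory.json")
save_json(records, path)
internal_only = load_json(path)
save_json(records, path, save_external=True)
everything = load_json(path)
```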
update(runhistory: smac.runhistory.runhistory.RunHistory, origin: smac.runhistory.runhistory.DataOrigin = <DataOrigin.EXTERNAL_SAME_INSTANCES: 2>)[source]

Update the current runhistory by adding new runs from a RunHistory.

Parameters:
  • runhistory (RunHistory) – Runhistory with additional data to be added to self
  • origin (DataOrigin) – If set to INTERNAL or EXTERNAL_SAME_INSTANCES, the data will be added to the internal data structure self._configid_to_inst_seed and be available through get_runs_for_config().
update_cost(config: ConfigSpace.configuration_space.Configuration)[source]

Stores the performance of a configuration across the instances in self.cost_per_config and also updates self.runs_per_config; uses self.aggregate_func.

Parameters:config (Configuration) – configuration to update cost based on all runs in runhistory
update_from_json(fn: str, cs: ConfigSpace.configuration_space.ConfigurationSpace, origin: smac.runhistory.runhistory.DataOrigin = <DataOrigin.EXTERNAL_SAME_INSTANCES: 2>)[source]

Update the current runhistory by adding new runs from a json file.

Parameters:
  • fn (str) – File name to load from.
  • cs (ConfigSpace) – Instance of configuration space.
  • origin (DataOrigin) – What to store as data origin.
class smac.runhistory.runhistory.RunKey(config_id, instance_id, seed)

Bases: tuple

Create new instance of RunKey(config_id, instance_id, seed)

config_id

Alias for field number 0

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

instance_id

Alias for field number 1

seed

Alias for field number 2

class smac.runhistory.runhistory.RunValue(cost, time, status, additional_info)

Bases: tuple

Create new instance of RunValue(cost, time, status, additional_info)

additional_info

Alias for field number 3

cost

Alias for field number 0

count(value) → integer -- return number of occurrences of value
index(value[, start[, stop]]) → integer -- return first index of value.

Raises ValueError if the value is not present.

status

Alias for field number 2

time

Alias for field number 1