cave.analyzer.performance_table module

class cave.analyzer.performance_table.PerformanceTable(runscontainer)

Bases: cave.analyzer.base_analyzer.BaseAnalyzer

If the run objective is ‘runtime’, performance is reported as PAR (Penalized Average Runtime). If the scenario defines a timeout, runs that were cut off can be penalized by a factor, because it is unknown how long they would have run. PAR1 applies no penalty; PAR10 counts every cut-off run as 10 times the cutoff.
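The PAR definition above can be illustrated with a minimal sketch. The helper `par_score` is hypothetical and not part of CAVE; it assumes a run counts as cut off when its runtime reaches the cutoff.

```python
from typing import Sequence


def par_score(runtimes: Sequence[float], cutoff: float, par: int = 10) -> float:
    """Penalized average runtime (hypothetical helper, not CAVE's API).

    Assumption: a run is treated as a timeout if its runtime >= cutoff.
    """
    if not runtimes:
        raise ValueError("need at least one runtime")
    # Timed-out runs are replaced by par * cutoff, all others keep their runtime.
    penalized = [par * cutoff if t >= cutoff else t for t in runtimes]
    return sum(penalized) / len(penalized)


runtimes = [1.0, 2.0, 5.0]  # the third run hit the 5-second cutoff
par1 = par_score(runtimes, cutoff=5.0, par=1)    # no penalty
par10 = par_score(runtimes, cutoff=5.0, par=10)  # timeout counted as 50 seconds
```

With `par=1` the timed-out run simply contributes the cutoff itself, which is why PAR1 is described as "no penalty".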

For timeouts: if a configuration-instance pair has multiple runs (with different seeds), some of which resulted in timeouts and some of which did not, the majority decides whether the pair counts as a timeout.
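A minimal sketch of such a majority decision follows. The function `majority_timeout` is hypothetical, and its tie handling (a tie does not count as a timeout) is an assumption, since the text above does not specify it.

```python
from typing import Sequence


def majority_timeout(timed_out: Sequence[bool]) -> bool:
    """Return True if more than half of the runs on one pair timed out.

    Hypothetical helper; ties are resolved as "no timeout" (assumption).
    """
    if not timed_out:
        raise ValueError("need at least one run")
    return sum(timed_out) > len(timed_out) / 2


# Three seeds on the same configuration-instance pair, two of them timed out.
pair_is_timeout = majority_timeout([True, True, False])
```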

The p-value (between 0 and 1) results from comparing default and incumbent with a paired permutation test using 10000 iterations (permuting instances). It tests the null hypothesis that the mean performance of default and incumbent is equal.
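For illustration, a paired permutation test can be sketched as below. This is one common formulation (randomly swapping the two labels within each instance), not necessarily what `_permutation_test` does internally; the function name, the two-sided statistic and the `seed` parameter are assumptions.

```python
import random
from typing import Sequence


def paired_permutation_test(default: Sequence[float],
                            incumbent: Sequence[float],
                            num_permutations: int = 10000,
                            seed: int = 0) -> float:
    """Two-sided paired permutation test on the mean difference (sketch)."""
    if len(default) != len(incumbent) or not default:
        raise ValueError("need two non-empty cost lists of equal length")
    rng = random.Random(seed)
    # One paired difference per instance.
    diffs = [d - i for d, i in zip(default, incumbent)]
    observed = abs(sum(diffs)) / len(diffs)
    hits = 0
    for _ in range(num_permutations):
        # Under the null hypothesis the labels within a pair are exchangeable,
        # so each difference keeps or flips its sign with probability 0.5.
        permuted = sum(x if rng.random() < 0.5 else -x for x in diffs)
        if abs(permuted) / len(diffs) >= observed:
            hits += 1
    return hits / num_permutations


default_costs = [3.0] * 10
incumbent_costs = [2.0] * 10  # incumbent is better on every instance
p_value = paired_permutation_test(default_costs, incumbent_costs)
```

A small p-value suggests that the difference in mean performance is unlikely under the null hypothesis of equal means.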

Oracle performance takes the best single run per instance (that is, the best seed/configuration pair seen for that instance) and aggregates over all instances.

Parameters

runscontainer (RunsContainer) – contains all important information about the configurator runs

_paired_t_test(epm_rh, default, incumbent, num_permutations)
_permutation_test(epm_rh, default, incumbent, num_permutations, par=1)
classmethod check_for_bokeh(d)
create_performance_table(default, incumbent, epm_rh, oracle)

Create a table comparing default against incumbent on train, test and combined instances, listing PAR10, PAR1 and timeouts. Distinguishes between train and test instances, if available.

get_html(d=None, tooltip=None) → Tuple[str, str]

General reports in HTML format, to be easily integrated into HTML code. Also used for bokeh output.

Parameters

d (Dictionary) – a dictionary that will later be turned into a website

Returns

script, div – header and body part of the HTML code

Return type

str, str


Depending on analysis, this creates jupyter-notebook compatible output.

get_oracle(instances, rh)

Estimation of oracle performance. Collects best performance seen for each instance in any run.

Parameters

  • instances (List[str]) – list of instances in question

  • rh (RunHistory or List[RunHistory]) – runhistory or list of runhistories (will be combined)

Returns

oracle (dict[str->float]) – best seen performance per instance {inst : performance}
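The idea behind the oracle can be sketched without a RunHistory object. The function `oracle_from_runs` below is hypothetical; it assumes runs are given as plain (instance, cost) pairs and that lower cost is better.

```python
from typing import Dict, Iterable, Tuple


def oracle_from_runs(runs: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Collect the best (lowest) cost seen per instance (hypothetical helper)."""
    oracle: Dict[str, float] = {}
    for inst, cost in runs:
        # Keep the minimum cost observed for this instance across all runs.
        if inst not in oracle or cost < oracle[inst]:
            oracle[inst] = cost
    return oracle


# Runs from any configuration and seed, flattened to (instance, cost) pairs.
runs = [("inst_a", 4.0), ("inst_b", 2.5), ("inst_a", 1.5), ("inst_b", 3.0)]
oracle = oracle_from_runs(runs)
oracle_mean = sum(oracle.values()) / len(oracle)
```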

get_parX(cost_dict, par=10)

Calculate parX values from the given cost_dict. First, apply the PAR penalty to timeouts for each run on each instance. Second, average over train and test instances separately if available, otherwise average over all instances.

  • cost_dict (Dict[inst->cost]) – mapping instances to costs

  • par (int) – par-factor to use


Returns

(train, test) OR average – parX values (PAR10 by default) for train and test instances as a tuple if available, else the general average

Return type

tuple<float, float> OR float
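The two steps (penalize, then average per split) might look like the following sketch. `par_by_split` and its `cutoff`, `train` and `test` arguments are hypothetical; the real method takes only `cost_dict` and `par` and presumably obtains the cutoff and the instance sets elsewhere, for example from the scenario.

```python
from typing import Dict, Optional, Sequence, Tuple, Union


def par_by_split(cost_dict: Dict[str, float],
                 cutoff: float,
                 par: int = 10,
                 train: Optional[Sequence[str]] = None,
                 test: Optional[Sequence[str]] = None,
                 ) -> Union[Tuple[float, float], float]:
    """Penalize timeouts, then average per split (hypothetical sketch)."""
    # Step 1: replace costs at or above the cutoff by par * cutoff (assumption).
    penalized = {inst: (par * cutoff if cost >= cutoff else cost)
                 for inst, cost in cost_dict.items()}

    def mean_over(insts: Sequence[str]) -> float:
        values = [penalized[i] for i in insts if i in penalized]
        if not values:
            raise ValueError("no costs available for the given instances")
        return sum(values) / len(values)

    # Step 2: average per split if both splits are known, else over everything.
    if train and test:
        return mean_over(train), mean_over(test)
    return mean_over(list(penalized))


costs = {"a": 1.0, "b": 5.0, "c": 3.0, "d": 5.0}  # "b" and "d" hit the cutoff
split_result = par_by_split(costs, cutoff=5.0, train=["a", "b"], test=["c", "d"])
overall_result = par_by_split(costs, cutoff=5.0)
```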

get_performance_table(instances: List[str], validated_rh: smac.runhistory.runhistory.RunHistory, default: ConfigSpace.configuration_space.Configuration, incumbent: ConfigSpace.configuration_space.Configuration, epm_rh: smac.runhistory.runhistory.RunHistory, scenario: smac.scenario.scenario.Scenario)

This function needs to be called if bokeh-plots are to be displayed in notebook AND saved to webpage.


Get the number of timeouts for a configuration.

Parameters

timeouts (dict[i -> bool]) – mapping from instances to whether a timeout occurred on that instance

Returns

timeouts – tuple (timeouts, total runs)

Return type

tuple(int, int)
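Converting such a mapping into the (timeouts, total runs) tuple can be sketched as follows. The function name `count_timeouts` is hypothetical and only illustrates the described input and output shapes.

```python
from typing import Dict, Tuple


def count_timeouts(timeouts: Dict[str, bool]) -> Tuple[int, int]:
    """Return (number of instances with a timeout, total number of instances)."""
    return sum(1 for hit in timeouts.values() if hit), len(timeouts)


timeout_counts = count_timeouts({"inst_a": True, "inst_b": False, "inst_c": True})
```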