cave.plot.algorithm_footprint module

class cave.plot.algorithm_footprint.AlgorithmFootprintPlotter(rh: smac.runhistory.runhistory.RunHistory, train_inst_feat, test_inst_feat, algorithms, cutoff=inf, output_dir=None, rng=None)

Bases: object

Class that provides algorithm footprints, following “Measuring algorithm footprints in instance space” (Kate Smith-Miles, Thiam Tan).

General procedure:
  • for each algorithm, label every instance using the same metric

  • map the instances onto a plane using PCA
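The two steps above can be sketched roughly as follows. This is a minimal illustration, not CAVE's implementation: the helper names `label_instances` and `pca_2d`, the "within 5% of the best cost" labelling rule, and the toy data are all assumptions made for the example.

```python
import numpy as np


def label_instances(costs, tolerance=0.05):
    """Label every (algorithm, instance) pair as good (True) or bad (False).

    costs has shape (n_algorithms, n_instances); lower is better.
    Assumed rule: an algorithm is 'good' on an instance if its cost is
    within `tolerance` (relative) of the best cost on that instance.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=0)
    return costs <= best * (1.0 + tolerance)


def pca_2d(features):
    """Project feature vectors onto their first two principal components."""
    X = np.asarray(features, dtype=float)
    X = X - X.mean(axis=0)  # PCA requires centered data
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:2].T


# Toy data (hypothetical): 2 algorithms, 4 instances, 3 features each.
costs = np.array([[1.0, 5.0, 2.0, 9.0],
                  [1.2, 3.0, 2.05, 4.0]])
features = np.array([[0.0, 1.0, 3.0],
                     [1.0, 0.5, 2.0],
                     [2.0, 0.0, 1.0],
                     [3.0, 2.0, 0.0]])

labels = label_instances(costs)  # one row of good/bad flags per algorithm
points = pca_2d(features)        # one 2D position per instance
```

Because every algorithm is labelled with the same rule, the resulting good/bad regions in the 2D plane are directly comparable across algorithms.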

NOTE: The terms ‘algorithm’ and ‘config/configuration’ are used synonymously throughout the class.

Parameters
  • rh (RunHistory) – runhistory to take cost from

  • train_inst_feat, test_inst_feat – instance names mapped to features

  • algorithms (List[Tuple(Configuration, str)]) – list with configs and descriptive names

  • cutoff (int) – cutoff applied to runs (if available); defaults to inf

  • output_dir (str) – output directory

footprint(a, density_threshold, purity_threshold)

Calculate the footprint within a portfolio using convex hulls that depend on density and purity thresholds (Algorithm 1 in Smith-Miles 2014).

We use 3 ways to refer to an instance here:
  • name: the name (unique!) of the instance

  • feat2d: the position as np.array

  • tup: the tuple-version of feat2d (hashable)

Parameters
  • a (Configuration) – configuration to get footprint of

  • density_threshold (float) – minimum density that regions must show to be merged

  • purity_threshold (float) – minimum purity (percentage of good instances) that regions must show to be merged

Returns

footprint – the size of all resulting convex hulls

Return type

float
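To make the roles of the two thresholds concrete, here is a simplified pure-Python sketch for a single region. It is not the method's actual code and it omits the region-merging step of the referenced algorithm. The helpers `convex_hull`, `hull_area`, `inside` and `region_stats` are hypothetical names, and the definitions used here (density = instances inside the hull divided by hull area, purity = share of good instances inside the hull) are assumptions for illustration. Points are tuples, matching the hashable `tup` representation mentioned above.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(points))  # tuples are hashable, so duplicates drop out
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(hull):
    """Area of a simple polygon via the shoelace formula."""
    n = len(hull)
    if n < 3:
        return 0.0
    twice = sum(hull[i][0] * hull[(i + 1) % n][1]
                - hull[(i + 1) % n][0] * hull[i][1] for i in range(n))
    return abs(twice) / 2.0


def inside(hull, p):
    """True if p lies inside or on a counter-clockwise convex hull."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))


def region_stats(good, bad):
    """Return (area, density, purity) of the hull around the good instances."""
    hull = convex_hull(good)
    area = hull_area(hull)
    if area == 0:
        raise ValueError("good instances are degenerate (fewer than 3 or collinear)")
    n_bad = sum(inside(hull, p) for p in bad)  # bad instances inside the hull
    n_all = len(good) + n_bad
    return area, n_all / area, len(good) / n_all


# Toy region (hypothetical): a 2x2 square of good instances, one bad one inside.
good = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
bad = [(1, 0.5), (3, 3)]
area, density, purity = region_stats(good, bad)

# The region only counts towards the footprint if it passes both thresholds.
density_threshold, purity_threshold = 1.0, 0.8
passes = density >= density_threshold and purity >= purity_threshold
footprint = area if passes else 0.0
```

In this toy case the hull has area 4, contains six instances (five good, one bad), and passes both thresholds, so its whole area counts towards the footprint.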

get_clusters(features_2d)

Map instances to clusters, using silhouette scores to determine the number of clusters.

Returns

paths – paths to plots

Return type

List[str]
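The idea of choosing the number of clusters by silhouette score can be illustrated with a small NumPy-only sketch. This is not the code behind `get_clusters`: the k-means variant with farthest-point initialisation, the helper names (`farthest_point_init`, `kmeans`, `silhouette`), the candidate range of k, and the toy data are all assumptions. A library implementation would normally be preferred in practice.

```python
import numpy as np


def farthest_point_init(X, k):
    """Deterministic init: start at X[0], then repeatedly take the point
    farthest from all centers chosen so far."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers)


def kmeans(X, k, n_iter=50):
    """Plain Lloyd iterations; returns one cluster label per row of X."""
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def silhouette(X, labels):
    """Mean silhouette coefficient over all points (higher is better)."""
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton clusters get a score of 0 by convention
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        if max(a, b) > 0:
            scores[i] = (b - a) / max(a, b)
    return scores.mean()


# Toy 2D features (hypothetical): three well-separated 3x3 grids of points.
offsets = [(dx, dy) for dx in (-0.3, 0.0, 0.3) for dy in (-0.3, 0.0, 0.3)]
blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
features_2d = np.array([(cx + dx, cy + dy) for cx, cy in blobs for dx, dy in offsets])

# Try several cluster counts and keep the one with the best silhouette score.
scores = {k: silhouette(features_2d, kmeans(features_2d, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
clusters = kmeans(features_2d, best_k)
```

With three clearly separated groups, the silhouette score peaks at three clusters, since both merging two groups and splitting one lowers the average score.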

plot3d()

Plot a 3D version of the algorithm footprint from four different angles.

plot_interactive_footprint()

Use bokeh to create an interactive algorithm footprint with zoom and hover tooltips. This avoids problems with overplotting (since the plot can be zoomed) and provides more information about individual instances.

plot_points_per_cluster()

Plot good versus bad instances for the passed config, per cluster.

Parameters
  • conf (Configuration) – configuration for which to plot good vs bad

  • out (str) – output path

Returns

outpaths – output paths per cluster

Return type

List[str]