cave.plot.configurator_footprint module

class cave.plot.configurator_footprint.ConfiguratorFootprintPlotter(scenario: smac.scenario.scenario.Scenario, rhs: smac.runhistory.runhistory.RunHistory, incs: list = None, final_incumbent=None, rh_labels=None, max_plot: int = -1, contour_step_size=0.2, use_timeslider: bool = False, num_quantiles: int = 10, timeslider_log: bool = True, rng=None, output_dir: str = None)[source]

Bases: object

Creates an interactive plot that visualizes the configuration search space. The runhistories are matched to the individual configurator runs: each run consists of a runhistory (in SMAC format) and a list of incumbents. If the dict “additional_info” in the RunValues of the runhistory contains a nested dict with additional_info[“timestamps”][“finished”], those timestamps are used to sort the data.

Parameters
  • scenario (Scenario) – scenario

  • rhs (List[RunHistory]) – runhistories from configurator runs, only data collected during optimization (no validation!)

  • incs (List[List[Configuration]]) – incumbents per run, last entry is final incumbent

  • final_incumbent (Configuration) – final configuration (best of all runs)

  • max_plot (int) – maximum number of configs to plot, if -1 plot all

  • contour_step_size (float) – step size of meshgrid to compute contour of fitness landscape

  • use_timeslider (bool) – whether to add a time_slider-widget to the configurator footprint plot (this increases the file size dramatically)

  • num_quantiles (int) – number of quantiles for the slider/ number of static pictures

  • timeslider_log (bool) – whether to use a logarithmic scale for the timeslider/quantiles

  • rng (np.random.RandomState) – random number generator

  • output_dir (str) – output directory
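
The timestamp-based ordering mentioned in the class description can be illustrated with a minimal sketch. The record layout below (pairs of a config id and an additional_info dict) and the fallback to insertion order when a timestamp is missing are assumptions made for illustration, not the library's actual data structures or behaviour.

```python
# Hypothetical run records: (config_id, additional_info) pairs standing in
# for the RunValue entries of a runhistory.
records = [
    ("c1", {"timestamps": {"finished": 30.0}}),
    ("c2", {"timestamps": {"finished": 10.0}}),
    ("c3", {}),  # no timestamps recorded
    ("c4", {"timestamps": {"finished": 20.0}}),
]


def finished_time(additional_info):
    """Return the nested 'finished' timestamp, or None if it is missing."""
    return (additional_info or {}).get("timestamps", {}).get("finished")


def order_records(records):
    """Sort by finish time only if every record carries a timestamp."""
    if all(finished_time(info) is not None for _, info in records):
        return sorted(records, key=lambda rec: finished_time(rec[1]))
    return list(records)  # assumed fallback: keep insertion order


ordered = order_records(records)
timed_only = order_records([rec for rec in records if rec[1]])
```

Here `ordered` keeps the original order because one record lacks a timestamp, while `timed_only` is sorted by finish time.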

_contour_radiobuttongroup(contour_data, color_mapper)[source]
Returns

  • radiobuttongroup (RadioButtonGroup) – radiobuttongroup widget to select one of the elements

  • title (Div) – text-element to “show title” of widget

_create_figure(x_range, y_range)[source]
_create_views(source, used_configs)[source]

Create views in order of plotting, so that more interesting views are plotted on top. Order of interest:

  • default > final-incumbent > incumbent > candidate

  • local > random

  • num_runs (ascending, more evaluated -> more interesting)

Individual views are necessary, since bokeh can only plot one marker-type (circle, triangle, …) per ‘scatter’-call

Parameters
  • source (ColumnDataSource) – containing relevant information for plotting

  • used_configs (List[Configuration]) – configs that are contained in this source; needed to plot glyphs for the independent runs so that they can be toggled. For efficiency, not every config is in every source: configs with 0 runs are left out

Returns

  • views (List[CDSView]) – views in order of plotting

  • views_by_run (Dict[ConfiguratorRun -> List[int]]) – maps each run to a list of indices of the related glyphs in the returned ‘views’-list

  • markers (List[string]) – markers (to the view with the same index)
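
One way to read the order of interest above is as a lexicographic sort key, with the least interesting group plotted first so that it ends up underneath. The sketch below makes that assumption; the tuple layout `(type, origin, num_runs)` and the rank tables are hypothetical and do not use bokeh or the plotter's internals.

```python
# Lower rank = less interesting = plotted earlier (ends up underneath).
TYPE_RANK = {"candidate": 0, "incumbent": 1, "final-incumbent": 2, "default": 3}
ORIGIN_RANK = {"random": 0, "local": 1}


def plotting_order(groups):
    """Sort (type, origin, num_runs) groups so the most interesting come last."""
    return sorted(
        groups,
        key=lambda g: (TYPE_RANK[g[0]], ORIGIN_RANK[g[1]], g[2]),
    )


groups = [
    ("default", "random", 5),
    ("candidate", "local", 1),
    ("incumbent", "local", 3),
    ("candidate", "random", 2),
    ("candidate", "local", 4),
]
ordered = plotting_order(groups)
```

Each entry of `ordered` would then correspond to one view and one marker type, drawn in list order.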

_get_color(types)[source]

Determine appropriate color for all configurations

Parameters

types (List[str]) – type of configuration

Returns

colors – list of color per config

Return type

list

_get_runs_per_config_quantiled(rh, conf_list, quantiles)[source]

Returns a list of lists, each sublist representing the current state at that timestep (quantile). The current state is the number of times each config had been evaluated by that timestep.

Parameters
  • rh (RunHistory) – rh to be split up

  • conf_list (list) – list of all Configuration objects that appear in runhistory

  • quantiles (int) – number of fractions to split rh into

Returns

  • labels (List[str]) – labels for timeslider (i.e. wallclock-times)

  • runs_per_quantile (np.array) – numpy array of runs per configuration per quantile
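
The idea of a cumulative state per quantile can be sketched without SMAC objects. The function name, the plain list of config identifiers, and the split into equally sized chunks of runs are assumptions; the actual method may split by wallclock time (possibly on a logarithmic scale) instead.

```python
def cumulative_runs_per_quantile(run_sequence, conf_list, quantiles):
    """Cumulative run counts per config after each of `quantiles` steps.

    run_sequence: config identifiers in the order their runs finished.
    """
    index = {conf: i for i, conf in enumerate(conf_list)}
    counts = [0] * len(conf_list)
    states, start = [], 0
    for q in range(1, quantiles + 1):
        end = q * len(run_sequence) // quantiles  # integer split point
        for conf in run_sequence[start:end]:
            counts[index[conf]] += 1
        states.append(list(counts))  # snapshot of the state at this step
        start = end
    return states


states = cumulative_runs_per_quantile(
    ["a", "b", "a", "c", "a", "b"], conf_list=["a", "b", "c"], quantiles=3
)
```

Each sublist of `states` follows the order of `conf_list`, and the last sublist holds the total number of runs per config.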

_get_size(r_p_c)[source]

Returns the size of the scattered points as a function of the runs per config

Parameters

r_p_c (list[int]) – list with runs per config in order of self.conf_list

Returns

sizes – list with appropriate sizes for dots

Return type

list[int]
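
The exact scaling rule is not documented here. As a rough sketch, assuming a linear mapping between a smallest and a largest marker size (the name `marker_sizes` and the `min_size` and `max_size` parameters are hypothetical), the computation could look like this:

```python
def marker_sizes(r_p_c, min_size=3, max_size=12):
    """Map runs per config to marker sizes, linearly between min and max."""
    most = max(r_p_c, default=0)
    if most == 0:
        # No runs at all: every config gets the smallest marker.
        return [min_size] * len(r_p_c)
    return [int(min_size + (max_size - min_size) * r / most) for r in r_p_c]


sizes = marker_sizes([0, 5, 10])
```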

_get_widgets(all_glyphs, overtime_groups, run_groups, slider_labels=None)[source]

Combine timeslider for quantiles and checkboxes for individual runs in a single javascript-snippet

Parameters
  • all_glyphs (List[Glyph]) – togglable bokeh-glyphs

  • overtime_groups, run_groups – mappings from labels to indices of the all_glyphs-list

  • slider_labels (Union[None, List[str]]) – if provided, used as labels for timeslider-widget

Returns

  • time_slider, checkbox, select_all, select_none (Widget) – desired interlinked bokeh-widgets

  • checkbox_title (Div) – text-element to “show title” of checkbox

_plot_contour(p, contour_data, x_range, y_range)[source]

Plot contour data.

Parameters
  • p (bokeh.plotting.figure) – figure to be drawn upon

  • contour_data (Dict[str -> np.array]) – dict from labels to array with contour data

  • x_range (List[float, float]) – min and max of x-axis

  • y_range (List[float, float]) – min and max of y-axis

Returns

handles – mapping from label to image glyph and min/max-tuple

Return type

dict[str -> tuple(ImageGlyph, tuple(float, float))]

_plot_get_source(conf_list, runs, X, inc_list, hp_names)[source]

Create a ColumnDataSource with all the necessary data. For each configuration evaluated in any run, it contains:

  • all parameters and values

  • origin (if conflicting, origin from best run counts)

  • type (default, incumbent or candidate)

  • # of runs

  • size

  • color

Parameters
  • conf_list (list[Configuration]) – configurations

  • runs (list[int]) – runs per configuration (same order as conf_list)

  • X (np.array) – configuration-parameters as 2-dimensional array

  • inc_list (list[Configuration]) – incumbents for this conf-run

  • hp_names (list[str]) – names of hyperparameters

Returns

  • source (ColumnDataSource) – source with attributes as requested

  • conf_list (List[Configuration]) – filtered conf_list with only configs we actually plot (i.e. > 0 runs)

_scatter(p, source, views, markers)[source]
Parameters
  • p (bokeh.plotting.figure) – figure

  • source (ColumnDataSource) – data container

  • views (List[CDSView]) – list with views to be plotted (in order!)

  • markers (List[str]) – corresponding markers to the views

Returns

scatter_handles – glyph renderer per view

Return type

List[GlyphRenderer]

get_conf_matrix(rh, incs)[source]

Iterates through runhistory to get a matrix of configurations (in vector representation), a list of configurations and the number of runs per configuration in a quantiled manner.

Parameters
  • rh (RunHistory) – smac.runhistory

  • incs (List[List[Configuration]]) – incumbents of configurator runs, last entry is final incumbent

Returns

  • conf_matrix (np.array) – matrix of configurations in vector representation

  • conf_list (np.array) – list of all Configuration objects that appeared in the runhistory. The order of this list determines many properties of the plot, but is itself arbitrary

  • runs_per_quantile (np.array) – numpy array of runs per configuration per quantile

  • labels (List[str]) – labels for timeslider (i.e. wallclock-times)

get_depth(cs: ConfigSpace.configuration_space.ConfigurationSpace, param: str)[source]

Get the depth of a given parameter in the configuration space, using breadth-first search until a leaf is reached for the first time

Parameters
  • cs (ConfigurationSpace) – ConfigurationSpace to get parents of a parameter

  • param (str) – name of parameter to inspect
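
The breadth-first search can be sketched on a plain dict instead of a ConfigurationSpace. The `parents` mapping (parameter name to list of parent names) and the convention that an unconditional parameter has depth 0 are assumptions for illustration; the real method queries the ConfigurationSpace for parents.

```python
from collections import deque


def depth_of(parents, param):
    """Breadth-first search upwards from `param`.

    Returns the number of edges to the nearest parameter without parents.
    """
    queue = deque([(param, 0)])
    seen = {param}
    while queue:
        name, depth = queue.popleft()
        ancestors = parents.get(name, [])
        if not ancestors:
            return depth  # first parameter without parents ends the search
        for parent in ancestors:
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return None  # only reachable if the parent relation contains a cycle


# Hypothetical conditional structure: degree depends on kernel, and
# coef0 depends on both degree and kernel.
parents = {"kernel": [], "degree": ["kernel"], "coef0": ["degree", "kernel"]}
depths = {name: depth_of(parents, name) for name in parents}
```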

get_distance(conf_matrix, cs: ConfigSpace.configuration_space.ConfigurationSpace)[source]

Computes the distance between all pairs of configurations.

Parameters
  • conf_matrix (np.array) – numpy array with cols as parameter values

  • cs (ConfigurationSpace) – ConfigurationSpace to get conditionalities

Returns

dists – np.array with distances between configurations i,j in dists[i,j] or dists[j,i]

Return type

np.array
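
The shape of the result, a symmetric matrix in which dists[i][j] equals dists[j][i], can be shown with a small sketch. It uses a plain L1 distance on nested lists and ignores conditionalities entirely, so it is an assumption-laden stand-in rather than the distance the method actually computes.

```python
def pairwise_distances(conf_matrix):
    """Symmetric matrix of L1 distances between the rows of conf_matrix."""
    n = len(conf_matrix)
    dists = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum(abs(a - b) for a, b in zip(conf_matrix[i], conf_matrix[j]))
            dists[i][j] = dists[j][i] = d  # fill both triangles
    return dists


dists = pairwise_distances([[0.0, 0.0], [1.0, 0.0], [1.0, 0.5]])
```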

get_mds(dists)[source]

Compute multi-dimensional scaling (using sklearn MDS) – nonlinear scaling

Parameters

dists (np.array) – full matrix of distances between all configurations

Returns

scaled coordinates in 2-dimensional space

Return type

np.array
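
To show what scaling a distance matrix down to two dimensions means, the sketch below swaps in classical (Torgerson) MDS implemented with NumPy. This is a different, deterministic technique from the iterative SMACOF algorithm behind sklearn's MDS, and it is used here only because it needs no extra dependency; the function name is hypothetical.

```python
import numpy as np


def classical_mds(dists, n_components=2):
    """Embed points so that Euclidean distances approximate `dists`."""
    dists = np.asarray(dists, dtype=float)
    n = dists.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dists ** 2) @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale


# Four corners of a unit square, described only by their distances.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
coords = classical_mds(dists)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

The embedding is only determined up to rotation and reflection, so the check that matters is that `recovered` reproduces `dists`, not that `coords` equals `points`.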

get_pred_surface(rh, X_scaled, conf_list: list, contour_step_size)[source]

Fit an EPM (empirical performance model) on the scaled input dimensions and return the data needed for a contour plot of the empirical performance

Parameters
  • rh (RunHistory) – runhistory

  • X_scaled (np.array) – configurations in scaled 2dim

  • conf_list (list) – list of Configuration objects

  • contour_step_size (float) – step-size for contour

Returns

contour_data – x, y, Z for contour plots

Return type

(np.array, np.array, np.array)
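
The role of contour_step_size can be illustrated by building the evaluation grid with NumPy. In this sketch the grid spans the bounding box of the scaled configurations, and a toy function stands in for the fitted model's predictions; both choices, as well as the names `contour_grid` and `toy_model`, are assumptions.

```python
import numpy as np


def contour_grid(X_scaled, contour_step_size):
    """Return meshgrid arrays covering the bounding box of X_scaled."""
    X_scaled = np.asarray(X_scaled, dtype=float)
    x_min, y_min = X_scaled.min(axis=0)
    x_max, y_max = X_scaled.max(axis=0)
    # Derive point counts explicitly to avoid float drift in np.arange;
    # the effective step equals contour_step_size when it divides the range.
    nx = int(round((x_max - x_min) / contour_step_size)) + 1
    ny = int(round((y_max - y_min) / contour_step_size)) + 1
    xx, yy = np.meshgrid(
        np.linspace(x_min, x_max, nx), np.linspace(y_min, y_max, ny)
    )
    return xx, yy


def toy_model(xx, yy):
    """Stand-in for model predictions: squared distance from the origin."""
    return xx ** 2 + yy ** 2


xx, yy = contour_grid([[0.0, 0.0], [1.0, 2.0]], contour_step_size=0.5)
Z = toy_model(xx, yy)
```

A smaller step size gives a finer surface at the cost of more predictions, since the number of grid points grows quadratically as the step shrinks.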

plot(X, conf_list: list, runs_per_quantile, inc_list: list = None, contour_data=None, use_timeslider=False, use_checkbox=True, timeslider_labels=None)[source]

Plots the sampled configurations in 2D space, using bokeh for the interactive plot. Saves results in self.output, if set

Parameters
  • X (np.array) – np.array with 2-d coordinates for each configuration

  • conf_list (list) – list of ALL configurations in the same order as X

  • runs_per_quantile (list[np.array]) – configurator-run to be analyzed, as a np.array with the number of target-algorithm-runs per config per quantile.

  • inc_list (list) – list of incumbents (Configuration)

  • contour_data (list) – contour data (xx,yy,Z)

  • use_timeslider (bool) – whether to add a time_slider-widget to the configurator footprint plot (this increases the file size dramatically)

  • use_checkbox (bool) – have checkboxes to toggle individual runs

Returns

  • (script, div) (str) – script and div of the bokeh-figure

  • over_time_paths (List[str]) – list with paths to the different quantiled timesteps of the configurator run (for static evaluation)

reduce_runhistory(rh: smac.runhistory.runhistory.RunHistory, max_configs: int, keep=None)[source]

Reduce configs to desired number, by default just drop the configs with the fewest runs.

Parameters
  • rh (RunHistory) – runhistory that is to be reduced

  • max_configs (int) – if > -1 reduce runhistory to at most max_configs

  • keep (List[Configuration]) – list of configs that should be kept for sure (e.g. default, incumbents)

Returns

rh – reduced runhistory

Return type

RunHistory
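
The default reduction strategy, dropping the configs with the fewest runs while protecting the ones in keep, can be sketched on a plain dict of run counts. The dict stands in for a RunHistory, and letting protected configs exceed max_configs when there are too many of them is an assumption.

```python
def reduce_configs(runs_per_config, max_configs, keep=None):
    """Keep at most max_configs configs, dropping those with the fewest runs.

    runs_per_config: dict mapping a config identifier to its number of runs.
    """
    keep = list(keep or [])
    if max_configs <= -1 or len(runs_per_config) <= max_configs:
        return dict(runs_per_config)  # nothing to reduce
    # Protected configs first, then the rest by descending run count.
    others = sorted(
        (c for c in runs_per_config if c not in keep),
        key=lambda c: runs_per_config[c],
        reverse=True,
    )
    kept = [c for c in keep if c in runs_per_config]
    kept += others[: max(0, max_configs - len(kept))]
    return {c: runs_per_config[c] for c in kept}


runs = {"default": 1, "a": 5, "b": 2, "c": 7}
reduced = reduce_configs(runs, max_configs=2, keep=["default"])
unchanged = reduce_configs(runs, max_configs=-1)
```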

run()[source]

Uses the available configurator data to perform an MDS, estimate performance data and plot the configurator footprint.