cave.reader.csv2rh module

class cave.reader.csv2rh.CSV2RH[source]

Bases: object

create_cs_from_pandaframe(data)[source]
extract_configs(data, cs: ConfigSpace.configuration_space.ConfigurationSpace, id_to_config=None)[source]

After completion, every unique configuration in the data has a corresponding id in the data-frame. The data-frame is expected to contain either a config-id column or one column per hyperparameter. Parameter names are taken from the provided configspace. If a mapping of ids to configurations already exists, it is used.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing either a column called config_id or a column for every individual parameter
  • cs (ConfigurationSpace) – optional, if provided the parameters-argument will be ignored
  • id_to_config (dict[int:Configuration]) – optional, mapping ids to Configurations (necessary when using config_id-column)
Returns:

  • data (pd.DataFrame) – if no config_id column was present before, one has been added.
  • id_to_config (dict) – mapping every id to a configuration
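
The id assignment described above can be illustrated with plain pandas. This is a minimal sketch, not CAVE's implementation: the helper name assign_config_ids is hypothetical, plain dicts stand in for Configuration objects, and only the case of one column per hyperparameter (no existing config_id column) is covered.

```python
import pandas as pd


def assign_config_ids(data, param_names):
    """Hypothetical helper: give each unique parameter combination an id.

    Returns a copy of `data` with an added 'config_id' column and a dict
    mapping each id to a plain dict of parameter values (a stand-in for
    a Configuration object).
    """
    data = data.copy()
    config_to_id = {}
    id_to_config = {}
    ids = []
    # Iterate over rows as plain tuples so they can be used as dict keys.
    for row in data[param_names].itertuples(index=False, name=None):
        if row not in config_to_id:
            new_id = len(config_to_id)
            config_to_id[row] = new_id
            id_to_config[new_id] = dict(zip(param_names, row))
        ids.append(config_to_id[row])
    data["config_id"] = ids
    return data, id_to_config


runs = pd.DataFrame({
    "alpha": [0.1, 0.1, 0.5],
    "beta": ["a", "a", "b"],
    "cost": [1.0, 2.0, 3.0],
})
with_ids, id_to_config = assign_config_ids(runs, ["alpha", "beta"])
print(with_ids["config_id"].tolist())  # [0, 0, 1]
```

The first two rows share the same parameter values and therefore the same id; the third row gets a new one.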

extract_instances(data, feature_names, features)[source]

After completion, every unique instance in the data has a corresponding id in the data-frame. The data-frame is expected to contain either an instance-id column or one column per instance feature. Feature names are taken from feature_names, if provided. If a mapping of ids to instance features already exists (features), it is used.

Parameters:
  • data (pd.DataFrame) – pandas dataframe containing either a column called instance_id or a column for every individual instance feature
  • feature_names (list[str]) – optional, list of feature-names
  • features (dict[int:np.array]) – optional, mapping ids to instance-feature vectors (necessary when using instance_id-column)
Returns:

  • data (pd.DataFrame) – if no instance_id column was present before, one has been added.
  • id_to_inst_feats (dict) – mapping every id to instance-features
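
The same idea applies to instances: rows with identical feature values share one id, and each id maps to a feature vector. The sketch below is an illustration only, not CAVE's code. The helper name assign_instance_ids is hypothetical, and it assumes the data-frame has one numeric column per instance feature and no existing instance_id column.

```python
import numpy as np
import pandas as pd


def assign_instance_ids(data, feature_names):
    """Hypothetical helper: give each unique feature vector an id.

    Returns a copy of `data` with an added 'instance_id' column and a
    dict mapping each id to a numpy array of feature values.
    """
    data = data.copy()
    # Number the groups of identical feature rows; sort=False keeps
    # the groups in order of first appearance.
    data["instance_id"] = data.groupby(feature_names, sort=False).ngroup()
    unique = data.drop_duplicates("instance_id")
    feature_rows = unique[feature_names].to_numpy(dtype=float)
    id_to_inst_feats = {
        int(inst_id): feats
        for inst_id, feats in zip(unique["instance_id"], feature_rows)
    }
    return data, id_to_inst_feats


inst_runs = pd.DataFrame({
    "f1": [1.0, 2.0, 1.0],
    "f2": [0.5, 0.5, 0.5],
    "cost": [3.0, 4.0, 5.0],
})
with_inst_ids, id_to_inst_feats = assign_instance_ids(inst_runs, ["f1", "f2"])
first_id = int(with_inst_ids["instance_id"].iloc[0])
print(id_to_inst_feats[first_id])
```

Rows one and three describe the same instance, so they end up with the same instance_id and a single entry in the mapping.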

read_csv_to_rh(data, cs: Union[None, str, ConfigSpace.configuration_space.ConfigurationSpace] = None, id_to_config: Union[None, dict] = None, train_inst: Union[None, str, list] = None, test_inst: Union[None, str, list] = None, instance_features: Union[None, str, dict] = None, logger=None, seed=42)[source]

Interprets a .csv-file as a runhistory. Valid values for the header of the csv-file/DataFrame are: ['seed', 'cost', 'time', 'status', 'config_id', 'instance_id'] or any parameter or instance-feature names.

Parameters:
  • data (str or pd.DataFrame) – either a path (string) to a csv-formatted runhistory-file or a DataFrame containing the same information
  • cs (str or ConfigurationSpace) – config-space to use for this runhistory
  • id_to_config (dict) – mapping ids to Configuration-objects
  • train_inst (str or list[str]) – train instances or path to file
  • test_inst (str or list[str]) – test instances or path to file
  • instance_features (str or dict) – instance features as dict mapping instance-ids to feature-array or file to appropriately formatted instance-feature-file
Returns:

  • rh (RunHistory) – runhistory with all the runs from the csv-file
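
To make the expected header layout concrete, the following sketch builds a small csv-formatted runhistory in memory and separates the reserved column names listed above from the remaining columns, which would be read as parameter or instance-feature names. The parameter names alpha and beta and all values are made up for illustration, and the constant RESERVED_COLUMNS is not part of CAVE.

```python
import io

import pandas as pd

# Reserved header values, taken from the description of read_csv_to_rh.
RESERVED_COLUMNS = ["seed", "cost", "time", "status", "config_id", "instance_id"]

# Illustrative data: 'alpha' and 'beta' are hypothetical parameter names.
csv_text = """alpha,beta,cost,time,seed
0.1,a,1.5,0.3,42
0.5,b,0.9,0.2,42
"""

csv_runs = pd.read_csv(io.StringIO(csv_text))
reserved = [c for c in csv_runs.columns if c in RESERVED_COLUMNS]
other = [c for c in csv_runs.columns if c not in RESERVED_COLUMNS]
print(reserved)  # ['cost', 'time', 'seed']
print(other)     # ['alpha', 'beta']
```

According to the parameter list above, a DataFrame like csv_runs (or the path to the equivalent file) could be passed as data, together with a cs that defines the parameters appearing in the header.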