smac.model.random_forest.random_forest

Classes

RandomForest(configspace[, n_trees, ...])

Random forest that takes instance features into account.

Interfaces

class smac.model.random_forest.random_forest.RandomForest(configspace, n_trees=10, n_points_per_tree=-1, ratio_features=0.8333333333333334, min_samples_split=3, min_samples_leaf=3, max_depth=1048576, eps_purity=1e-08, max_nodes=1048576, bootstrapping=True, log_y=False, instance_features=None, pca_components=7, seed=0)

Bases: AbstractRandomForest

Random forest that takes instance features into account.

Parameters:
  • n_trees (int, defaults to N_TREES, i.e. 10) – The number of trees in the random forest.

  • n_points_per_tree (int, defaults to -1) – Number of points per tree. If the value is smaller than 0, the number of training samples is used instead.

  • ratio_features (float, defaults to 5.0 / 6.0) – The ratio of features that are considered for splitting.

  • min_samples_split (int, defaults to 3) – The minimum number of data points to perform a split.

  • min_samples_leaf (int, defaults to 3) – The minimum number of data points in a leaf.

  • max_depth (int, defaults to 2**20) – The maximum depth of a single tree.

  • eps_purity (float, defaults to 1e-8) – The minimum difference between two target values for them to be considered different.

  • max_nodes (int, defaults to 2**20) – The maximum total number of nodes in a tree.

  • bootstrapping (bool, defaults to True) – Enables bootstrapping.

  • log_y (bool, defaults to False) – Whether the y values passed to this random forest are expected to be log(y) transformed. If so, this is taken into account during prediction.

  • instance_features (dict[str, list[int | float]] | None, defaults to None) – Features (lists of ints or floats) of the instances (str). The features are incorporated into the X data on which the model is trained.

  • pca_components (float, defaults to 7) – Number of components to keep when using PCA to reduce dimensionality of instance features.

  • seed (int, defaults to 0) – The random seed.
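The description of instance_features says the features are incorporated into the X data. The following is a minimal sketch of what such a column layout could look like, assuming each row is a configuration vector followed by the feature vector of one instance. The helper `append_instance_features` and the instance names are hypothetical and not part of SMAC; the library's own preprocessing may differ.

```python
# Hypothetical helper (not part of SMAC): builds one row per instance by
# appending that instance's features to the configuration vector.
def append_instance_features(config_vector, instance_features):
    rows = []
    # Sort the instance names so the row order is deterministic.
    for instance_name in sorted(instance_features):
        features = instance_features[instance_name]
        rows.append(list(config_vector) + list(features))
    return rows


# Same shape as the instance_features parameter: dict[str, list[int | float]].
instance_features = {"inst_a": [0.1, 3], "inst_b": [0.7, 5]}

# Two hyperparameter values followed by two instance features per row.
rows = append_instance_features([0.25, 1.0], instance_features)
print(rows)
```

Each resulting row has #hyperparameter + #features columns, which matches the input shape documented for predict_marginalized.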

property meta: dict[str, Any]

Returns the meta data of the created object.

predict_marginalized(X)

Predicts mean and variance marginalized over all instances.

Note

The method is random forest specific and follows the SMAC2 implementation. It requires no distribution assumption to marginalize the uncertainty estimates.

Parameters:

X (np.ndarray [#samples, #hyperparameter + #features]) – Input data points.

Return type:

tuple[ndarray, ndarray]

Returns:

  • means (np.ndarray [#samples, 1]) – The predictive mean.

  • vars (np.ndarray [#samples, 1]) – The predictive variance.
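To illustrate the idea behind marginalizing over instances, here is a simplified sketch for a single configuration. It assumes that each tree yields one prediction per instance, that predictions are first averaged over instances within each tree, and that the mean and (population) variance are then taken across trees. The function `marginalize_over_instances`, its input layout, and the handling of log_y are assumptions for illustration; this is not the library's implementation, and it uses plain lists rather than NumPy arrays to stay dependency-free.

```python
import math
from statistics import fmean, pvariance


def marginalize_over_instances(tree_predictions, log_y=False):
    """Hypothetical sketch: tree_predictions[t][i] is the prediction of
    tree t for instance i, all for the same configuration."""
    per_tree = []
    for preds in tree_predictions:
        if log_y:
            # Assumption: average in the original space, then go back to log space.
            per_tree.append(math.log(fmean(math.exp(p) for p in preds)))
        else:
            # Step 1: average over instances within one tree.
            per_tree.append(fmean(preds))
    # Step 2: mean and population variance across trees.
    return fmean(per_tree), pvariance(per_tree)


# Three trees, two instances each.
tree_predictions = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]
mean, var = marginalize_over_instances(tree_predictions)
print(mean, var)
```

Because the variance is computed directly from the spread of the per-tree averages, no distributional assumption is needed, which is the property the note above highlights.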