smac.model.random_forest.random_forest
Classes
RandomForest – Random forest that takes instance features into account.
Interfaces
- class smac.model.random_forest.random_forest.RandomForest(configspace, n_trees=10, n_points_per_tree=-1, ratio_features=0.8333333333333334, min_samples_split=3, min_samples_leaf=3, max_depth=1048576, eps_purity=1e-08, max_nodes=1048576, bootstrapping=True, log_y=False, instance_features=None, pca_components=7, seed=0)
Bases: AbstractRandomForest
Random forest that takes instance features into account.
- Parameters:
n_trees (int, defaults to N_TREES = 10) – The number of trees in the random forest.
n_points_per_tree (int, defaults to -1) – Number of data points used to build each tree. If the value is smaller than 0, the number of training samples is used.
ratio_features (float, defaults to 5.0 / 6.0) – The ratio of features that are considered for splitting.
min_samples_split (int, defaults to 3) – The minimum number of data points to perform a split.
min_samples_leaf (int, defaults to 3) – The minimum number of data points in a leaf.
max_depth (int, defaults to 2**20) – The maximum depth of a single tree.
eps_purity (float, defaults to 1e-8) – The minimum difference between two target values for them to be considered different.
max_nodes (int, defaults to 2**20) – The maximum total number of nodes in a tree.
bootstrapping (bool, defaults to True) – Enables bootstrapping.
log_y (bool, defaults to False) – If True, the y values passed to this random forest are expected to be log(y) transformed. This is taken into account during prediction.
instance_features (dict[str, list[int | float]] | None, defaults to None) – Features (list of ints or floats) of the instances (str). The features are incorporated into the X data on which the model is trained.
pca_components (float, defaults to 7) – Number of components to keep when using PCA to reduce dimensionality of instance features.
seed (int, defaults to 0) – The random seed.
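The following is a minimal sketch, not SMAC's actual code, of two behaviours described in the parameter list: instance features being appended to the configuration vectors that form X, and a negative n_points_per_tree falling back to the number of samples. The helper names `build_training_rows` and `resolve_points_per_tree`, as well as the instance names and values, are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Sequence, Union

Number = Union[int, float]


def build_training_rows(
    configs: Sequence[Sequence[Number]],
    instances: Sequence[str],
    instance_features: Dict[str, List[Number]],
) -> List[List[Number]]:
    """Append each instance's feature vector to its configuration vector.

    Hypothetical helper: each resulting row has
    #hyperparameter + #features entries.
    """
    if len(configs) != len(instances):
        raise ValueError("configs and instances must have the same length")
    return [
        list(config) + list(instance_features[name])
        for config, name in zip(configs, instances)
    ]


def resolve_points_per_tree(n_points_per_tree: int, n_samples: int) -> int:
    """Use the number of samples when the setting is smaller than 0."""
    return n_samples if n_points_per_tree < 0 else n_points_per_tree


# Two instances, each described by two features (illustrative values).
instance_features = {"inst-a": [0.1, 3], "inst-b": [0.7, 5]}

rows = build_training_rows(
    configs=[[0.25, 1.0], [0.75, 0.0], [0.25, 1.0]],
    instances=["inst-a", "inst-b", "inst-b"],
    instance_features=instance_features,
)

print(rows[0])                                  # [0.25, 1.0, 0.1, 3]
print(resolve_points_per_tree(-1, len(rows)))   # 3
```

With the default n_points_per_tree=-1, every tree in this sketch would be built from as many points as there are rows in X.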
- property meta: dict[str, Any]
Returns the metadata of the created object.
- predict_marginalized(X)
Predicts mean and variance marginalized over all instances.
Note
The method is random forest specific and follows the SMAC2 implementation. It requires no distribution assumption to marginalize the uncertainty estimates.
- Parameters:
X (np.ndarray [#samples, #hyperparameter + #features]) – Input data points.
- Return type:
tuple[ndarray, ndarray]
- Returns:
means (np.ndarray [#samples, 1]) – The predictive mean.
vars (np.ndarray [#samples, 1]) – The predictive variance.
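One way to picture marginalization without a distribution assumption is sketched below: for each configuration, every tree's prediction is averaged over all instances, and the mean and variance are then taken across the per-tree averages. This is a pure-Python illustration under stated assumptions, not the library's implementation. The `Tree` callables, the function `predict_marginalized_sketch`, the use of population variance, and the handling of log_y (averaging on the original scale, then returning to log space) are all assumptions made for this sketch. For simplicity it takes hyperparameter-only rows and appends the instance features itself, which says nothing about the input shape the real method expects.

```python
import math
from statistics import fmean, pvariance
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a fitted tree: one input row -> one prediction.
Tree = Callable[[Sequence[float]], float]


def predict_marginalized_sketch(
    trees: Sequence[Tree],
    configs: Sequence[Sequence[float]],
    instance_features: Sequence[Sequence[float]],
    log_y: bool = False,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (means, variances), each shaped [#samples, 1] as nested lists."""
    means: List[List[float]] = []
    variances: List[List[float]] = []
    for config in configs:
        per_tree: List[float] = []
        for tree in trees:
            # Predict this configuration on every instance with this tree.
            preds = [tree(list(config) + list(feat)) for feat in instance_features]
            if log_y:
                # Assumption: average on the original scale, then take the log.
                per_tree.append(math.log(fmean([math.exp(p) for p in preds])))
            else:
                per_tree.append(fmean(preds))
        # Statistics across trees give the marginalized mean and variance.
        means.append([fmean(per_tree)])
        variances.append([pvariance(per_tree)])
    return means, variances


def tree_one(row: Sequence[float]) -> float:
    # Toy tree: first hyperparameter plus the last instance feature.
    return row[0] + row[-1]


def tree_two(row: Sequence[float]) -> float:
    # Toy tree: weights the first hyperparameter twice as strongly.
    return 2 * row[0] + row[-1]


means, variances = predict_marginalized_sketch(
    trees=[tree_one, tree_two],
    configs=[[1.0], [2.0]],
    instance_features=[[0.0], [2.0]],
)
print(means)      # [[2.5], [4.0]]
print(variances)  # [[0.25], [1.0]]
```

In this toy run, the variance reflects only the disagreement between trees after the instances have been averaged out, which is why no distributional form has to be assumed for the per-instance predictions.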