amltk.pipeline.parsers.configspace

ConfigSpace is a library for representing and sampling configurations for hyperparameter optimization. It features a straightforward API for defining hyperparameters, their ranges and even conditional dependencies.

It is flexible enough for complex use cases, including the pipelines of AutoSklearn and AutoPyTorch, which define large-scale hyperparameter spaces for optimizing entire pipelines at once.


This requires ConfigSpace, which can be installed with:

pip install "amltk[configspace]"

# Or directly
pip install ConfigSpace

In general, you should have the ConfigSpace documentation ready to consult for a full understanding of how to construct hyperparameter spaces with AMLTK.

Basic Usage

You can use the parser() function directly and pass it to the search_space() method of a Node. Alternatively, you can simply call search_space(parser="configspace", ...).

from amltk.pipeline import Component, Choice, Sequential
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

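# Several lines of this example were lost; the Component(...) wrappers are
# reconstructed from the imports above, and the node names "Pipeline" and
# "estimator" are placeholders.
my_pipeline = (
    Sequential(name="Pipeline")
    >> Component(PCA, space={"n_components": (1, 3)})
    >> Choice(
        Component(SVC, space={"C": (0.1, 10.0)}),
        Component(
            RandomForestClassifier,
            space={"n_estimators": (10, 100), "criterion": ["gini", "log_loss"]},
        ),
        Component(
            MLPClassifier,
            space={
                "activation": ["identity", "logistic", "relu"],
                "alpha": (0.0001, 0.1),
                "learning_rate": ["constant", "invscaling", "adaptive"],
            },
        ),
        name="estimator",
    )
)

space = my_pipeline.search_space("configspace")
print(space)

Here we have an example of a few different kinds of hyperparameters: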

  • PCA:n_components is an integer with a range of 1 to 3, uniform distribution, as specified by its integer bounds in a tuple.
  • SVC:C is a float with a range of 0.1 to 10.0, uniform distribution, as specified by its float bounds in a tuple.
  • RandomForestClassifier:criterion is a categorical hyperparameter, with two choices, "gini" and "log_loss".
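
The shorthand rules in the list above can be sketched in plain Python. This is only an illustration of the convention: describe_shorthand is a hypothetical helper, not a function from amltk or ConfigSpace, and treating mixed int/float bounds as a float range is an assumption.

```python
from typing import Any


def describe_shorthand(value: Any) -> str:
    """Classify a shorthand space entry following the convention described above.

    Hypothetical helper for illustration; not part of amltk or ConfigSpace.
    """
    if isinstance(value, tuple) and len(value) == 2:
        low, high = value
        # bool is a subclass of int, so it is excluded explicitly.
        is_number = [isinstance(v, (int, float)) and not isinstance(v, bool) for v in value]
        if all(is_number):
            if all(isinstance(v, int) for v in value):
                return f"uniform integer in [{low}, {high}]"
            # Assumption: any float bound makes the whole range a float range.
            return f"uniform float in [{low}, {high}]"
    if isinstance(value, list):
        return f"categorical over {value}"
    raise ValueError(f"Unrecognised shorthand: {value!r}")


space = {"n_components": (1, 3), "C": (0.1, 10.0), "criterion": ["gini", "log_loss"]}
descriptions = {name: describe_shorthand(value) for name, value in space.items()}
for name, text in descriptions.items():
    print(f"{name}: {text}")
```

The point of the sketch is that the Python type of the bounds, not an explicit annotation, decides which kind of hyperparameter is created.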

There is also a Choice node, which is a special node that indicates that we could choose from one of these estimators. This leads to the conditionals that you can see in the printed out space.
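
To make the idea of a conditional concrete, here is a minimal plain-Python sketch of which hyperparameters are active once an estimator has been chosen. active_hyperparameters is a hypothetical helper; amltk and ConfigSpace express this through conditions on the parsed space rather than through a function like this one.

```python
def active_hyperparameters(chosen: str, spaces: dict[str, dict[str, object]]) -> list[str]:
    """Return the names of the hyperparameters that are active for the chosen estimator.

    Hypothetical helper for illustration only.
    """
    if chosen not in spaces:
        raise KeyError(f"Unknown estimator: {chosen!r}")
    # Only the chosen estimator's hyperparameters are active; all others are ignored.
    return [f"{chosen}:{param}" for param in spaces[chosen]]


estimator_spaces = {
    "SVC": {"C": (0.1, 10.0)},
    "RandomForestClassifier": {
        "n_estimators": (10, 100),
        "criterion": ["gini", "log_loss"],
    },
    "MLPClassifier": {
        "activation": ["identity", "logistic", "relu"],
        "alpha": (0.0001, 0.1),
        "learning_rate": ["constant", "invscaling", "adaptive"],
    },
}

svc_active = active_hyperparameters("SVC", estimator_spaces)
forest_active = active_hyperparameters("RandomForestClassifier", estimator_spaces)
print(svc_active)
print(forest_active)
```

An optimizer that does not understand conditionals would instead see every hyperparameter as always active, which is the situation that removing conditionals caters to.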

You may wish to remove all conditionals, for example because an Optimizer does not support them. You can do this by passing conditionals=False to the parser() function.

print(my_pipeline.search_space("configspace", conditionals=False))

Likewise, you can remove all hierarchy from the space, which may make downstream tasks easier, by passing flat=True to the parser() function.

print(my_pipeline.search_space("configspace", flat=True))

More Specific Hyperparameters

You'll often want to be more specific with your hyperparameters. Here are a few examples of how you can couple your pipelines more closely to ConfigSpace.

from ConfigSpace import Float, Categorical, Normal
from amltk.pipeline import Searchable

# The lines opening and closing the space dict were lost; the name is a placeholder.
s = Searchable(
    space={
        "lr": Float("lr", bounds=(1e-5, 1.), log=True, default=0.3),
        "balance": Float("balance", bounds=(-1.0, 1.0), distribution=Normal(0.0, 0.5)),
        "color": Categorical("color", ["red", "green", "blue"], weights=[2, 1, 1], default="blue"),
    },
    name="Searchable",
)
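
As a rough illustration of two of the options used above, the following standard-library sketch shows what sampling on a log scale (log=True) and weighting categorical choices (weights=[2, 1, 1]) amount to. The helper names are hypothetical and this is not how ConfigSpace implements sampling internally.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value uniformly in log space, so each order of magnitude is equally likely."""
    value = math.exp(rng.uniform(math.log(low), math.log(high)))
    # Clamp to guard against floating-point rounding at the bounds.
    return min(max(value, low), high)


def normalise_weights(weights: list[float]) -> list[float]:
    """Turn relative weights into probabilities that sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
lr_samples = [sample_log_uniform(1e-5, 1.0, rng) for _ in range(1000)]
color_probabilities = normalise_weights([2, 1, 1])

print(min(lr_samples), max(lr_samples))
print(color_probabilities)
```

With these weights, "red" is twice as likely to be drawn as "green" or "blue". Sampling on a log scale is a common choice for learning rates, since values such as 1e-4 and 1e-2 then receive comparable attention.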

Conditionals and Advanced Usage

We refer you to the ConfigSpace documentation for how to construct these. However, once you've constructed a ConfigurationSpace and added any forbiddens and conditionals, you can simply set it as the .space attribute of a node.

from amltk.pipeline import Component, Choice, Sequential
from ConfigSpace import ConfigurationSpace, EqualsCondition, InCondition

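myspace = ConfigurationSpace({"A": ["red", "green", "blue"], "B": (1, 10), "C": (-100.0, 0.0)})
# The call that registers the conditions was lost; add_conditions() is assumed here.
# Newer ConfigSpace releases also provide myspace.add([...]).
myspace.add_conditions([
    EqualsCondition(myspace["B"], myspace["A"], "red"),  # B is active when A is red
    InCondition(myspace["C"], myspace["A"], ["green", "blue"]),  # C is active when A is green or blue
])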

component = Component(object, space=myspace, name="MyThing")

parsed_space = component.search_space("configspace")

parser

parser(
    node: Node,
    seed: int | None = None,
    flat: bool = False,
    conditionals: bool = True,
    delim: str = ":"
) -> ConfigurationSpace

Parse a Node and its children into a ConfigurationSpace.

node

The Node to parse

TYPE: Node

seed

The seed to use for the ConfigurationSpace

TYPE: int | None DEFAULT: None

flat

Whether to remove the hierarchical naming scheme for nodes and their children. Passing True yields a flat space.

TYPE: bool DEFAULT: False

conditionals

Whether to include conditionals in the space from a Choice. If this is False, all forbidden clauses and other conditional clauses are removed as well. This is mainly useful because some optimizers do not support these features.

TYPE: bool DEFAULT: True

delim

The delimiter to use for the names of the hyperparameters

TYPE: str DEFAULT: ':'
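
The interaction between flat and delim can be pictured with a small plain-Python sketch. hyperparameter_name is a hypothetical helper, the node names "Pipeline" and "estimator" are placeholders, and the exact prefix kept when flat=True is an assumption made for illustration rather than a statement of what parser() produces.

```python
def hyperparameter_name(
    path: list[str],
    param: str,
    *,
    delim: str = ":",
    flat: bool = False,
) -> str:
    """Build a hyperparameter name from the path of node names and the parameter name.

    Hypothetical helper. Assumption: with flat=True only the owning node's name
    is kept as a prefix; otherwise every ancestor node is included.
    """
    if not path:
        raise ValueError("path must contain at least the owning node's name")
    prefix = path[-1:] if flat else path
    return delim.join([*prefix, param])


path_to_svc = ["Pipeline", "estimator", "SVC"]  # placeholder node names

hierarchical = hyperparameter_name(path_to_svc, "C")
flattened = hyperparameter_name(path_to_svc, "C", flat=True)
slash_delimited = hyperparameter_name(path_to_svc, "C", delim="/")

print(hierarchical)
print(flattened)
print(slash_delimited)
```

Shorter, flat names can be easier to handle downstream, while hierarchical names avoid clashes when two nodes share a name deeper in the pipeline.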

