Quickstart

A ConfigurationSpace is a data structure to describe the configuration space of an algorithm to tune. Possible hyperparameter types are numerical, categorical, conditional and ordinal hyperparameters.

AutoML tools, such as SMAC3 and BOHB are using the configuration space module to sample hyperparameter configurations. Also, auto-sklearn, an automated machine learning toolkit, which frees the machine learning user from algorithm selection and hyperparameter tuning, makes heavy use of the ConfigSpace package.

This simple quickstart tutorial will show you, how to set up your own ConfigurationSpace, and will demonstrate what you can realize with it. This Basic Usage will include the following:

The Advanced Usage will cover:

  • Creating two sets of possible model configs, using Conditions

  • Create two subspaces from these and add them to a parent ConfigurationSpace

  • Turn these configs into actual models!

These will not show the following and you should refer to the user guide for more:

Basic Usage

We take a look at a simple ridge regression, which has only one floating hyperparameter \(\alpha\).

The first step is always to create a ConfigurationSpace object. All the hyperparameters and constraints will be added to this object.

>>> from ConfigSpace import ConfigurationSpace, Float
>>>
>>> cs = ConfigurationSpace(
...     seed=1234,
...     space={ "alpha": (0.0, 1.0) }
... )

The hyperparameter \(\alpha\) is chosen to have floating point values from 0 to 1. For demonstration purpose, we sample a configuration from the ConfigurationSpace object.

>>> config = cs.sample_configuration()
>>> print(config)
Configuration(values={
  'alpha': 0.1915194503788923,
})

You can use this configuration just like you would a regular old python dictionary!

>>> for key, value in config.items():
...     print(key, value)
alpha 0.1915194503788923

And that’s it!

Advanced Usage

Lets create a more complex example where we have two models, model A and model B. Model B is some kernel based algorithm and A just needs a simple float hyperparamter.

We’re going to create a config space that will let us correctly build a randomly selected model.

 class ModelA:

     def __init__(self, alpha: float):
         """
         Parameters
         ----------
         alpha: float
             Some value between 0 and 1
         """
         ...

class ModelB:

    def __init__(self, kernel: str, kernel_floops: int | None = None):
        """
        Parameters
        ----------
        kernel: "rbf" or "flooper"
            If the kernel is set to "flooper", kernel_floops must be set.

        kernel_floops: int | None = None
            Floop factor of the kernel
        """
        ...

First, lets start with building the two individual subspaces where for A, we want to sample alpha from a normal distribution and for B we have the conditioned parameter and we slightly weight one kernel over another.

from ConfigSpace import ConfigSpace, Categorical, Integer, Float, Normal

class ModelA:

    def __init__(self, alpha: float):
        ...

    @staticmethod
    def space(self) -> ConfigSpace:
        return ConfigurationSpace({
            "alpha": Float("alpha", bounds=(0, 1), distribution=Normal(mu=0.5, sigma=0.2)
        })

class ModelB:

    def __init__(self, kernel: str, kernel_floops: int | None = None):
        ...

    @staticmethod
    def space(self) -> ConfigSpace:
        cs = ConfigurationSpace(
            {
                "kernel": Categorical("kernel", ["rbf", "flooper"], default="rbf", weights=[.75, .25]),
                "kernel_floops": Integer("kernel_floops", bounds=(1, 10)),
            }
        )

        # We have to make sure "kernel_floops" is only active when the kernel is "floops"
        cs.add_condition(EqualsCondition(cs_B["kernel_floops"], cs_B["kernel"], "flooper"))

        return cs

Finally, we need add these two a parent space where we condition each subspace to only be active depending on a parent. We’ll have the default configuration be A but we put more emphasis when sampling on B

cs = ConfigurationSpace(
    seed=1234,
    space={
        "model": Categorical("model", ["A", "B"], default="A", weights=[1, 2]),
    }
)

# We set the prefix and delimiter to be empty string "" so that we don't have to do
# any extra parsing once sampling
cs.add_configuration_space(
    prefix="",
    delimiter="",
    configuration_space=ModelA.space(),
    parent_hyperparameter={"parent": cs["model"], "value": "A"},
)

cs.add_configuration_space(
    prefix="",
    delimiter="",
    configuration_space=modelB.space(),
    parent_hyperparameter={"parent": cs["model"], "value": "B"}
)

And that’s it!

However for completness, lets examine how this works by first sampling from our config space.

configs = cs.sample_configuration(4)
print(configs)

# [Configuration(values={
#  'model': 'A',
#  'alpha': 0.7799758081188035,
# })
# , Configuration(values={
#   'model': 'B',
#   'kernel': 'flooper',
#   'kernel_floops': 8,
# })
# , Configuration(values={
#   'model': 'B',
#   'kernel': 'rbf',
# })
# , Configuration(values={
#   'model': 'B',
#   'kernel': 'rbf',
# })
# ]

We can see the three different kinds of models we have, our basic A model as well as our B model with the two kernels.

Next, we do some processing of these configs to generate valid params to pass to these models

models = []

for config in configs:
    model_type = config.pop("model")

    model = ModelA(**config) if model_type == "A" else ModelB(**config)

    models.append(model)

To continue reading, visit the user guide section. There are more information about hyperparameters, as well as an introduction to the powerful concepts of Conditions and Forbidden Clauses.