Quickstart
A ConfigurationSpace is a data structure to describe the configuration space of an algorithm to tune. Possible hyperparameter types are numerical, categorical, conditional and ordinal hyperparameters.
AutoML tools such as SMAC3 and BOHB use the ConfigSpace module to sample hyperparameter configurations. auto-sklearn, an automated machine learning toolkit that frees the machine learning user from algorithm selection and hyperparameter tuning, also makes heavy use of the ConfigSpace package.
This quickstart tutorial shows how to set up your own ConfigurationSpace and demonstrates what you can do with it. The Basic Usage section covers:
- Create a ConfigurationSpace
- Define a simple hyperparameter with a float value
The Advanced Usage will cover:
- Create two sets of possible model configs, using Conditions
- Use a different distribution for one of the hyperparameters
- Create two subspaces from these and add them to a parent ConfigurationSpace
- Turn these configs into actual models!
This quickstart does not show the following; refer to the user guide for more:
- Add Forbidden clauses
- Add Conditions
- Serialize
Basic Usage
We take a look at a simple ridge regression, which has only one floating point hyperparameter, alpha.
The first step is always to create a ConfigurationSpace object. All the hyperparameters and constraints will be added to this object.
from ConfigSpace import ConfigurationSpace, Float
cs = ConfigurationSpace(space={"alpha": (0.0, 1.0)}, seed=1234)
print(cs)
The hyperparameter alpha is chosen to have floating point values from 0 to 1.
For demonstration purposes, we sample a configuration from the ConfigurationSpace object.
You can use this configuration just like you would a regular old Python dictionary!
And that's it!
Advanced Usage
Let's create a more complex example where we have two models, model A and model B.
Model B is some kernel-based algorithm and A just needs a simple float hyperparameter.
We're going to create a config space that will let us correctly build a randomly selected model.
from typing import Literal
from dataclasses import dataclass

@dataclass
class ModelA:
    alpha: float
    """Some value between 0 and 1"""

@dataclass
class ModelB:
    kernel: Literal["rbf", "flooper"]
    """Kernel type."""
    kernel_floops: int | None = None
    """Number of floops for the flooper kernel, only used if kernel == "flooper"."""
First, let's start by building the two individual subspaces. For A, we want to sample alpha from a normal distribution. For B, we have the conditioned parameter and we weight one kernel over the other.
from dataclasses import dataclass
from typing import Literal
from ConfigSpace import ConfigurationSpace, Categorical, Integer, Float, Normal, EqualsCondition

@dataclass
class ModelA:
    alpha: float
    """Some value between 0 and 1"""

    @staticmethod
    def space() -> ConfigurationSpace:
        return ConfigurationSpace({
            "alpha": Float("alpha", bounds=(0, 1), distribution=Normal(mu=0.5, sigma=0.2))
        })

@dataclass
class ModelB:
    kernel: Literal["rbf", "flooper"]
    """Kernel type."""
    kernel_floops: int | None = None
    """Number of floops for the flooper kernel, only used if kernel == "flooper"."""

    @staticmethod
    def space() -> ConfigurationSpace:
        cs = ConfigurationSpace(
            {
                "kernel": Categorical("kernel", ["rbf", "flooper"], default="rbf", weights=[.75, .25]),
                "kernel_floops": Integer("kernel_floops", bounds=(1, 10)),
            }
        )
        # We have to make sure "kernel_floops" is only active when the kernel is "flooper"
        cs.add(EqualsCondition(cs["kernel_floops"], cs["kernel"], "flooper"))
        return cs
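To picture what a normal distribution restricted to bounds means for alpha, here is a plain-Python sketch using rejection sampling. It is an illustration only and makes no claim about how ConfigSpace samples internally; the helper name `sample_bounded_normal` is hypothetical.

```python
import random


def sample_bounded_normal(mu, sigma, lower, upper, rng):
    """Draw from Normal(mu, sigma), retrying until the value lies in [lower, upper]."""
    while True:
        value = rng.gauss(mu, sigma)
        if lower <= value <= upper:
            return value


rng = random.Random(0)
samples = [sample_bounded_normal(0.5, 0.2, 0.0, 1.0, rng) for _ in range(1000)]
mean = sum(samples) / len(samples)

# Every draw respects the bounds, and the draws cluster around mu.
print(min(samples), max(samples))
print(mean)
```

Compared with a uniform draw over the same bounds, values near 0.5 are sampled more often and values near 0 or 1 only rarely.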
Finally, we need to add these two to a parent space, where each subspace is only active depending on the value of a parent hyperparameter.
We'll have the default configuration be A, but we put more emphasis on B when sampling.
from ConfigSpace import ConfigurationSpace, Categorical

cs = ConfigurationSpace(
    seed=123456,
    space={
        "model": Categorical("model", ["A", "B"], default="A", weights=[1, 2]),
    }
)
# We set the prefix and delimiter to be the empty string "" so that we don't have to do
# any extra parsing after sampling
cs.add_configuration_space(
    prefix="",
    delimiter="",
    configuration_space=ModelA.space(),
    parent_hyperparameter={"parent": cs["model"], "value": "A"},
)
cs.add_configuration_space(
    prefix="",
    delimiter="",
    configuration_space=ModelB.space(),
    parent_hyperparameter={"parent": cs["model"], "value": "B"}
)
print(cs)
Configuration space object:
  Hyperparameters:
    alpha, Type: NormalFloat, Mu: 0.5, Sigma: 0.2, Range: [0.0, 1.0], Default: 0.5
    kernel, Type: Categorical, Choices: {rbf, flooper}, Default: rbf, Probabilities: [0.75 0.25]
    kernel_floops, Type: UniformInteger, Range: [1, 10], Default: 6
    model, Type: Categorical, Choices: {A, B}, Default: A, Probabilities: [0.33333333 0.66666667]
  Conditions:
    alpha | model == 'A'
    kernel | model == 'B'
    kernel_floops | kernel == 'flooper'
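The Probabilities shown in the printed space are consistent with the given weights being normalized to sum to one. A small plain-Python check of that arithmetic (the helper `normalize` is hypothetical and not part of ConfigSpace):

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


model_probs = normalize([1, 2])         # weights given for "model"
kernel_probs = normalize([0.75, 0.25])  # weights given for "kernel"

print(model_probs)
print(kernel_probs)
```

With weights of 1 and 2, model B is drawn about twice as often as model A, even though A remains the default.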
And that's it!
However, for completeness, let's examine how this works by first sampling from our config space.
[Configuration(values={
  'model': np.str_('A'),
  'alpha': 0.4165543657281,
}), Configuration(values={
  'model': np.str_('B'),
  'kernel': np.str_('rbf'),
}), Configuration(values={
  'model': np.str_('A'),
  'alpha': 0.6956537933613,
}), Configuration(values={
  'model': np.str_('B'),
  'kernel': np.str_('flooper'),
  'kernel_floops': 5,
})]
We can see the three different kinds of models we have: our basic A model, as well as our B model with each of the two kernels.
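The conditions listed in the printed space determine which keys appear in each sample. A plain-Python sketch of those rules, checked against the four samples shown above (values copied as plain Python types; the helper `active_keys` is hypothetical):

```python
def active_keys(values):
    """Return the hyperparameter names that should be present under the conditions:
    alpha | model == 'A', kernel | model == 'B', kernel_floops | kernel == 'flooper'.
    """
    keys = {"model"}
    if values["model"] == "A":
        keys.add("alpha")
    else:
        keys.add("kernel")
        if values["kernel"] == "flooper":
            keys.add("kernel_floops")
    return keys


samples = [
    {"model": "A", "alpha": 0.4165543657281},
    {"model": "B", "kernel": "rbf"},
    {"model": "A", "alpha": 0.6956537933613},
    {"model": "B", "kernel": "flooper", "kernel_floops": 5},
]

for sample in samples:
    # Each sample holds exactly the active hyperparameters, nothing more.
    print(sorted(sample), set(sample) == active_keys(sample))
```

Inactive hyperparameters are absent from a sample rather than set to a placeholder value.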
Next, we do some processing of these configs to generate valid params to pass to these models.
# `configs` is the list of sampled configurations printed above
models = []
for config in configs:
    config_as_dict = dict(config)
    # "model" selects the class and is not a constructor argument
    model_type = config_as_dict.pop("model")
    model = ModelA(**config_as_dict) if model_type == "A" else ModelB(**config_as_dict)
    models.append(model)
print(models)
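For reference, a self-contained sketch of the same processing step, using plain dictionaries as stand-ins for sampled configurations so that it runs without ConfigSpace. The dataclasses mirror ModelA and ModelB from above, the alpha value is made up for illustration, and the helper `build_model` is hypothetical.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class ModelA:
    alpha: float


@dataclass
class ModelB:
    kernel: Literal["rbf", "flooper"]
    kernel_floops: Optional[int] = None


def build_model(config):
    """Turn a dict-like config into a ModelA or ModelB instance."""
    params = dict(config)
    model_type = params.pop("model")  # selects the class, not a constructor argument
    return ModelA(**params) if model_type == "A" else ModelB(**params)


# Stand-ins for sampled configurations.
configs = [
    {"model": "A", "alpha": 0.42},
    {"model": "B", "kernel": "rbf"},
    {"model": "B", "kernel": "flooper", "kernel_floops": 5},
]
models = [build_model(c) for c in configs]
print(models)
```

Because the subspaces were added with an empty prefix and delimiter, the remaining keys match the dataclass field names directly. An inactive kernel_floops is simply absent, so the dataclass default of None applies.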
To continue reading, visit the user guide section. It contains more information about hyperparameters, as well as an introduction to the powerful concepts of Conditions and Forbidden clauses.