Optuna
Optuna parser for parsing out a search_space() from a pipeline.
Requirements
This requires Optuna
which can be installed with:
Limitations
Optuna features a very dynamic, define-by-run search space, where users typically sample from a trial object and use ordinary Python control flow to define conditionality.
This means we cannot trivially represent this conditionality in a static search space. While workarounds are possible, they do not sit well with the static output of a parser.
As such, our parser does not support conditionals or choices. Users may still use define-by-run within their optimization function itself.
If you have experience with Optuna and have any suggestions, please feel free to open an issue or PR on GitHub!
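To illustrate why define-by-run conditionality resists a static representation, here is a minimal sketch. `ToyTrial` is a hypothetical stand-in written for this example, not an Optuna class; it only mimics the call shape of the trial methods `suggest_categorical` and `suggest_float`. The parameter names `kernel` and `gamma` are likewise illustrative assumptions.

```python
import random


class ToyTrial:
    """Hypothetical stand-in for a trial object (not part of Optuna)."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self.params = {}

    def suggest_categorical(self, name, choices):
        # Pick one of the choices and record it under the given name.
        value = self._rng.choice(list(choices))
        self.params[name] = value
        return value

    def suggest_float(self, name, low, high):
        # Sample uniformly in [low, high] and record it under the given name.
        value = self._rng.uniform(low, high)
        self.params[name] = value
        return value


def objective(trial):
    # Plain Python control flow decides which parameters exist at all.
    kernel = trial.suggest_categorical("kernel", ["linear", "rbf"])
    if kernel == "rbf":
        gamma = trial.suggest_float("gamma", 0.001, 1.0)
        return gamma  # placeholder score
    return 0.5  # placeholder score


# Collect the sets of parameter names that different trials end up sampling.
sampled_names = set()
for seed in range(20):
    trial = ToyTrial(seed)
    objective(trial)
    sampled_names.add(tuple(sorted(trial.params)))
```

Whether `gamma` is sampled depends on the value drawn for `kernel`, so no single fixed dictionary of distributions describes every trial. With Optuna installed, an objective written this way can be passed to `study.optimize` and receive a real trial object instead of the stand-in.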
Usage
The typical way to represent a search space for Optuna is just to use a dictionary,
where the keys are the names of the hyperparameters and the values are either
integer/float tuples indicating boundaries or some discrete set of values.
It is possible to have the value directly be a BaseDistribution, an Optuna type, when you need to customize the distribution more.
from amltk.pipeline import Component
from optuna.distributions import FloatDistribution
c = Component(
    object,
    space={
        "myint": (1, 10),
        "myfloat": (1.0, 10.0),
        "mycategorical": ["a", "b", "c"],
        "log-scale-custom": FloatDistribution(1e-10, 1e-2, log=True),
    },
    name="name",
)
space = c.search_space(parser="optuna")
{
'name:myint': IntDistribution(high=10, log=False, low=1, step=1),
'name:myfloat': FloatDistribution(high=10.0, log=False, low=1.0, step=None),
'name:mycategorical': CategoricalDistribution(choices=('a', 'b', 'c')),
'name:log-scale-custom': FloatDistribution(high=0.01, log=True, low=1e-10,
step=None)
}
You may also pass the parser function directly to parser= if preferred:
from amltk.pipeline.parsers.optuna import parser as optuna_parser
space = c.search_space(parser=optuna_parser)
{
'name:myint': IntDistribution(high=10, log=False, low=1, step=1),
'name:myfloat': FloatDistribution(high=10.0, log=False, low=1.0, step=None),
'name:mycategorical': CategoricalDistribution(choices=('a', 'b', 'c')),
'name:log-scale-custom': FloatDistribution(high=0.01, log=True, low=1e-10,
step=None)
}
When using search_space() on nested structures, you may want to flatten the names of the hyperparameters. For this you can use the flat= parameter.
from amltk.pipeline import Searchable, Sequential
seq = Sequential(
    Searchable({"myint": (1, 10)}, name="nested_1"),
    Searchable({"myfloat": (1.0, 10.0)}, name="nested_2"),
    name="seq",
)
hierarchical_space = seq.search_space(parser="optuna", flat=False)  # Default
flat_space = seq.search_space(parser="optuna", flat=True)  # Names are not prefixed with the parent's name
{
'seq:nested_1:myint': IntDistribution(high=10, log=False, low=1, step=1),
'seq:nested_2:myfloat': FloatDistribution(high=10.0, log=False, low=1.0,
step=None)
}
{
'nested_1:myint': IntDistribution(high=10, log=False, low=1, step=1),
'nested_2:myfloat': FloatDistribution(high=10.0, log=False, low=1.0,
step=None)
}
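The naming scheme can be sketched in plain Python. This is an illustration under assumptions, not the library's implementation: `ToyNode` and `collect_names` are hypothetical names, and the rule shown (children are prefixed with the parent's name unless `flat` is set, with parts joined by `delim`) is inferred from the parameter descriptions in this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Hypothetical stand-in for a pipeline node (not an amltk class)."""

    name: str
    space: dict[str, object] = field(default_factory=dict)
    children: list[ToyNode] = field(default_factory=list)


def collect_names(node, *, flat=False, delim=":"):
    # A node's own hyperparameters are always prefixed with its name.
    names = [f"{node.name}{delim}{key}" for key in node.space]
    for child in node.children:
        child_names = collect_names(child, flat=flat, delim=delim)
        if not flat:
            # Hierarchical naming: prepend the parent's name as well.
            child_names = [f"{node.name}{delim}{n}" for n in child_names]
        names.extend(child_names)
    return names


seq = ToyNode(
    "seq",
    children=[
        ToyNode("nested_1", {"myint": (1, 10)}),
        ToyNode("nested_2", {"myfloat": (1.0, 10.0)}),
    ],
)

hierarchical = collect_names(seq)           # parent name included
flat = collect_names(seq, flat=True)        # parent name omitted
slashed = collect_names(seq, delim="/")     # custom delimiter
```

Flat names are shorter but can collide if two nodes with the same name appear in different branches, which is one reason hierarchical naming is a sensible default.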
def parser(node, *, flat=False, conditionals=False, delim=':')
Parse a Node and its children into an Optuna search space.
PARAMETER | DESCRIPTION |
---|---|
node | The Node to parse. |
flat | Whether to flatten the hyperparameter names rather than use a hierarchical naming scheme for nodes and their children. Default: False. |
conditionals | Whether to include conditionals in the space. Not yet supported: this cannot be encoded into a static Optuna search space. Default: False. |
delim | The delimiter to use for the names of the hyperparameters. Default: ':'. |