# Declarative Usage

## Introduction

### Configuring with YAML
Configure your experiments using a YAML file, which serves as a central reference for setting up your project. This approach simplifies sharing, reproducing and modifying configurations.
> **Note:** You can partially define arguments in the YAML file and partially provide the arguments directly to `neps.run`. However, double referencing is not allowed: you cannot define the same argument in both places.
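As an illustration of this split, here is a hedged sketch that assumes a hypothetical YAML file defining `pipeline_space` and `root_directory` but omitting `max_evaluations_total`, which is then passed directly to `neps.run`. No argument appears in both places.

```python
import neps


def run_pipeline(learning_rate, optimizer, epochs, batch_size):
    # Placeholder body; see the full examples below.
    ...


if __name__ == "__main__":
    neps.run(
        run_pipeline,
        run_args="path/to/your/config.yaml",  # defines pipeline_space, root_directory, ...
        max_evaluations_total=20,             # not set in the YAML, so it may be passed here
    )
```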
#### Simple YAML Example
Below is a straightforward YAML configuration example for NePS covering the required arguments.
```yaml
# Basic NePS Configuration Example
pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
```python
import neps


def run_pipeline(learning_rate, optimizer, epochs, batch_size):
    model = initialize_model()
    training_loss = train_model(model, optimizer, learning_rate, epochs, batch_size)
    evaluation_loss = evaluate_model(model)
    return {"loss": evaluation_loss, "training_loss": training_loss}


if __name__ == "__main__":
    neps.run(run_pipeline, run_args="path/to/your/config.yaml")
```
#### Including `run_pipeline` in `run_args` for External Referencing
In addition to setting experimental parameters via YAML, this configuration example also specifies the pipeline function and its location, enabling more flexible project structures.
```yaml
# Simple NePS configuration including run_pipeline
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
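With `run_pipeline` declared in the YAML file, the Python entry point no longer needs to pass the function itself. A minimal sketch, assuming your NePS version accepts a call with only `run_args` once the pipeline is referenced in the configuration:

```python
import neps

if __name__ == "__main__":
    # All experiment settings, including the pipeline function, come from the YAML file.
    neps.run(run_args="path/to/your/config.yaml")
```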
#### Comprehensive YAML Configuration Template
This example showcases a more comprehensive YAML configuration, which includes not only the essential parameters but also advanced settings for more complex setups.
```yaml
# Full Configuration Template for NePS
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
max_cost_total:

# Debug and Monitoring
overwrite_working_directory: true
post_run_summary: false
development_stage_id:
task_id:

# Parallelization Setup
max_evaluations_per_run:
continue_until_max_evaluation_completed: false

# Error Handling
loss_value_on_error:
cost_value_on_error:
ignore_errors:

# Customization Options
searcher: hyperband  # Internal key to select a NePS optimizer.

# Hooks
pre_load_hooks:
```
The `searcher` key used in the YAML configuration corresponds to the same keys used for selecting an optimizer directly through `neps.run`. For a detailed list of integrated optimizers, see here.
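For comparison, a hedged sketch of the equivalent direct call, using a placeholder objective and search space purely for illustration:

```python
import neps


def run_pipeline(learning_rate, epochs):
    # Placeholder objective; replace with real training and evaluation.
    return {"loss": learning_rate * epochs}


pipeline_space = dict(
    learning_rate=neps.FloatParameter(lower=1e-5, upper=1e-1, log=True),
    epochs=neps.IntegerParameter(lower=5, upper=20, is_fidelity=True),
)

neps.run(
    run_pipeline,
    pipeline_space=pipeline_space,
    root_directory="path/to/results",
    max_evaluations_total=20,
    searcher="hyperband",  # same key as used in the YAML configuration
)
```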
> **Note on undefined keys in `run_args` (`config.yaml`):** Not all configurations are explicitly defined in this template. Any undefined key in the YAML file is mapped to the internal default settings of NePS. This ensures that your experiments can run even if certain parameters are omitted.
## Different Use Cases
### Customizing NePS optimizer
Customize an internal NePS optimizer by specifying its parameters directly under the `searcher` key in the `config.yaml` file.
> **Note:** For `searcher_kwargs` of `neps.run`, the optimizer arguments passed via the YAML file and those passed directly via `neps.run` are merged. In this special case, if the same argument is referenced in both places, the value from `searcher_kwargs` takes priority.
```yaml
# Customizing NePS Searcher
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  optimizer:
    choices: [adam, sgd, adamw]
  epochs: 50

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

searcher:
  strategy: bayesian_optimization  # key for neps searcher
  name: "my_bayesian"              # optional; changing the searcher_name for better recognition
  # Specific arguments depending on the searcher
  initial_design_size: 7
  surrogate_model: gp
  acquisition: EI
  acquisition_sampler: random
  random_interleave_prob: 0.1
```
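As a hedged sketch of the override described in the note above, the following call overwrites one of the optimizer arguments from the YAML file at call time. It assumes that additional keyword arguments to `neps.run` are collected as `searcher_kwargs`; check how your NePS version expects these to be passed.

```python
import neps

neps.run(
    run_args="path/to/your/config.yaml",  # sets initial_design_size: 7 under `searcher`
    initial_design_size=9,                # forwarded as a searcher_kwarg and takes priority
)
```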
For detailed information about the available optimizers and their parameters, please visit the optimizer page.
### Testing Multiple Optimizer Configurations
Simplify experiments with multiple optimizer settings by outsourcing the optimizer configuration to a separate YAML file.
```yaml
# Optimizer settings from YAML configuration
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  optimizer:
    choices: [adam, sgd, adamw]
  epochs: 50

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

searcher: path/to/your/searcher_setup.yaml
```
### Handling Large Search Spaces
Manage large search spaces by outsourcing the pipeline space configuration to a separate YAML file, which also makes it easier to keep track of your experiments.
```yaml
# Pipeline space settings from separate YAML
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space: path/to/your/pipeline_space.yaml

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
```yaml
# Pipeline_space including priors and fidelity
learning_rate:
  lower: 1e-5
  upper: 1e-1
  log: true  # Log scale for learning rate
  default: 1e-2
  default_confidence: "medium"
epochs:
  lower: 5
  upper: 20
  is_fidelity: true
dropout_rate:
  lower: 0.1
  upper: 0.5
  default: 0.2
  default_confidence: "high"
optimizer:
  choices: [adam, sgd, adamw]
  default: adam
  # if default confidence is not defined it gets its default 'low'
batch_size: 64
```
```python
def example_pipeline(learning_rate, optimizer, epochs, batch_size, dropout_rate):
    model = initialize_model(dropout_rate)
    training_loss = train_model(model, optimizer, learning_rate, epochs, batch_size)
    evaluation_loss = evaluate_model(model)
    return {"loss": evaluation_loss, "training_loss": training_loss}
```
### Using Architecture Search Spaces
Since the option for defining the search space via YAML is limited to HPO, grammar-based search spaces or architecture search spaces must be loaded via a dictionary, which is then referenced in the `config.yaml`.
```yaml
# Loading pipeline space from a python dict
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space:
  path: path/to/your/search_space.py  # Path to the dict file
  name: pipeline_space                # Name of the dict instance

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
```python
from __future__ import annotations

from torch import nn

import neps
from neps.search_spaces.architecture import primitives as ops
from neps.search_spaces.architecture import topologies as topos
from neps.search_spaces.architecture.primitives import AbstractPrimitive


class DownSampleBlock(AbstractPrimitive):
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__(locals())
        self.conv_a = ReLUConvBN(
            in_channels, out_channels, kernel_size=3, stride=2, padding=1
        )
        self.conv_b = ReLUConvBN(
            out_channels, out_channels, kernel_size=3, stride=1, padding=1
        )
        self.downsample = nn.Sequential(
            nn.AvgPool2d(kernel_size=2, stride=2, padding=0),
            nn.Conv2d(
                in_channels, out_channels, kernel_size=1, stride=1, padding=0, bias=False
            ),
        )

    def forward(self, inputs):
        basicblock = self.conv_a(inputs)
        basicblock = self.conv_b(basicblock)
        residual = self.downsample(inputs)
        return residual + basicblock


class ReLUConvBN(AbstractPrimitive):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super().__init__(locals())
        self.kernel_size = kernel_size
        self.op = nn.Sequential(
            nn.ReLU(inplace=False),
            nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size,
                stride=stride,
                padding=padding,
                dilation=1,
                bias=False,
            ),
            nn.BatchNorm2d(out_channels, affine=True, track_running_stats=True),
        )

    def forward(self, x):
        return self.op(x)


class AvgPool(AbstractPrimitive):
    def __init__(self, **kwargs):
        super().__init__(kwargs)
        self.op = nn.AvgPool2d(3, stride=1, padding=1, count_include_pad=False)

    def forward(self, x):
        return self.op(x)


primitives = {
    "Sequential15": topos.get_sequential_n_edge(15),
    "DenseCell": topos.get_dense_n_node_dag(4),
    "down": {"op": DownSampleBlock},
    "avg_pool": {"op": AvgPool},
    "id": {"op": ops.Identity},
    "conv3x3": {"op": ReLUConvBN, "kernel_size": 3, "stride": 1, "padding": 1},
    "conv1x1": {"op": ReLUConvBN, "kernel_size": 1, "stride": 1, "padding": 0},
}

structure = {
    "S": ["Sequential15(C, C, C, C, C, down, C, C, C, C, C, down, C, C, C, C, C)"],
    "C": ["DenseCell(OPS, OPS, OPS, OPS, OPS, OPS)"],
    "OPS": ["id", "conv3x3", "conv1x1", "avg_pool"],
}


def set_recursive_attribute(op_name, predecessor_values):
    in_channels = 16 if predecessor_values is None else predecessor_values["out_channels"]
    out_channels = in_channels * 2 if op_name == "DownSampleBlock" else in_channels
    return dict(in_channels=in_channels, out_channels=out_channels)


pipeline_space = dict(
    architecture=neps.ArchitectureParameter(
        set_recursive_attribute=set_recursive_attribute,
        structure=structure,
        primitives=primitives,
    ),
    optimizer=neps.CategoricalParameter(choices=["sgd", "adam"]),
    learning_rate=neps.FloatParameter(lower=10e-7, upper=10e-3, log=True),
)
```
```python
from torch import nn


def example_pipeline(architecture, optimizer, learning_rate):
    in_channels = 3
    base_channels = 16
    n_classes = 10
    out_channels_factor = 4

    # E.g., in shape = (N, 3, 32, 32) => out shape = (N, 10)
    model = architecture.to_pytorch()
    model = nn.Sequential(
        nn.Conv2d(in_channels, base_channels, 3, padding=1, bias=False),
        nn.BatchNorm2d(base_channels),
        model,
        nn.BatchNorm2d(base_channels * out_channels_factor),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(base_channels * out_channels_factor, n_classes),
    )
    training_loss = train_model(model, optimizer, learning_rate)
    evaluation_loss = evaluate_model(model)
    return {"loss": evaluation_loss, "training_loss": training_loss}
```
### Integrating Custom Optimizers
If you want to write your own optimizer class as a subclass of the base optimizer, you can load your custom optimizer class and define its arguments in `config.yaml`.

> **Note:** You can still overwrite arguments via `searcher_kwargs` of `neps.run`, just as for the internal searchers.
```yaml
# Loading Optimizer Class
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  optimizer:
    choices: [adam, sgd, adamw]
  epochs: 50

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

searcher:
  path: path/to/your/searcher.py  # Path to the class
  name: CustomOptimizer           # class name within the file
  # Specific arguments depending on your searcher
  initial_design_size: 7
  surrogate_model: gp
  acquisition: EI
```
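The `CustomOptimizer` referenced above might be structured roughly as follows. This is a hypothetical skeleton: the base-class import path and the method names (`load_results`, `get_config_and_ids`) are assumptions that should be checked against the NePS source for your installed version.

```python
# searcher.py: hypothetical skeleton of a custom optimizer (names are assumptions)
from neps.optimizers.base_optimizer import BaseOptimizer  # assumed import path


class CustomOptimizer(BaseOptimizer):
    def __init__(self, pipeline_space, initial_design_size=7, surrogate_model="gp",
                 acquisition="EI", **kwargs):
        super().__init__(pipeline_space=pipeline_space, **kwargs)
        # Arguments listed under `searcher:` in config.yaml arrive here as keyword arguments.
        self.initial_design_size = initial_design_size
        self.surrogate_model = surrogate_model
        self.acquisition = acquisition

    def load_results(self, previous_results, pending_evaluations):
        # Inspect finished and pending configurations before proposing a new one.
        ...

    def get_config_and_ids(self):
        # Return the next configuration to evaluate together with its identifiers.
        ...
```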
### Adding Custom Hooks to Your Configuration
Define hooks in your YAML configuration to extend the functionality of your experiment.
```yaml
# Hooks
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline              # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

pre_load_hooks:
  hook1: path/to/your/hooks.py  # (function_name: Path to the function's file)
  hook2: path/to/your/hooks.py  # Different function name 'hook2' from the same file source
```
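A hypothetical `hooks.py` matching the configuration above. The hook signature here is an assumption: the sketch treats each pre-load hook as a callable that receives the instantiated optimizer (sampler) before previous results are loaded and returns it; consult the NePS API for the precise contract.

```python
# hooks.py: hypothetical pre-load hooks (signature is an assumption)
import logging

logger = logging.getLogger("neps")


def hook1(sampler):
    # Log which optimizer is about to load previous results.
    logger.info("hook1: using optimizer %s", type(sampler).__name__)
    return sampler


def hook2(sampler):
    # Another hook from the same file, e.g. for inspecting the search space.
    logger.info("hook2: pipeline space %s", getattr(sampler, "pipeline_space", None))
    return sampler
```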