# Declarative Usage

## Introduction

### Configuring with YAML

Configure your experiments using a YAML file, which serves as a central reference for setting up your project. This approach simplifies sharing, reproducing, and modifying configurations.
**Argument Handling and Prioritization**

You can define arguments partially via `run_args` (the YAML file) and partially pass them directly to `neps.run`. Arguments provided directly to `neps.run` are prioritized over those defined in the YAML file. The exception is `searcher_kwargs`, where the two configurations are merged: values from both sources are combined, and for any argument defined in both places the directly provided value still takes priority.
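For example, a value passed directly to `neps.run` wins over the same key in the YAML file. A minimal sketch, assuming `run_pipeline` is defined as in the example below and that the YAML file also sets `max_evaluations_total`:

```python
import neps

# config.yaml also defines max_evaluations_total: 20; the value passed
# directly here (30) takes priority over the YAML value.
neps.run(
    run_pipeline,
    run_args="path/to/your/config.yaml",
    max_evaluations_total=30,
)
```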
### Simple YAML Example

Below is a straightforward YAML configuration example for NePS covering the required arguments.
```yaml
# Basic NePS Configuration Example
pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
```python
import neps


def run_pipeline(learning_rate, optimizer, epochs, batch_size):
    model = initialize_model()
    training_loss = train_model(model, optimizer, learning_rate, epochs, batch_size)
    evaluation_loss = evaluate_model(model)
    return {"loss": evaluation_loss, "training_loss": training_loss}


if __name__ == "__main__":
    neps.run(run_pipeline, run_args="path/to/your/config.yaml")
```
### Including `run_pipeline` in `run_args` for External Referencing

In addition to setting experimental parameters via YAML, this configuration example also specifies the pipeline function and its location, enabling more flexible project structures.
```yaml
# Simple NePS configuration including run_pipeline
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
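Since the pipeline function is referenced in the YAML file, the Python entry point only needs to pass `run_args`. A minimal sketch, with the file path as a placeholder:

```python
import neps

if __name__ == "__main__":
    neps.run(run_args="path/to/your/config.yaml")
```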
### Comprehensive YAML Configuration Template

This example showcases a more comprehensive YAML configuration, which includes not only the essential parameters but also advanced settings for more complex setups.
```yaml
# Full Configuration Template for NePS
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
max_cost_total:

# Debug and Monitoring
overwrite_working_directory: true
post_run_summary: false
development_stage_id:
task_id:

# Parallelization Setup
max_evaluations_per_run:
continue_until_max_evaluation_completed: false

# Error Handling
loss_value_on_error:
cost_value_on_error:
ignore_errors:

# Customization Options
searcher: hyperband  # Internal key to select a NePS optimizer.

# Hooks
pre_load_hooks:
```
The `searcher` key used in the YAML configuration corresponds to the same keys used for selecting an optimizer directly through `neps.run`. For a detailed list of integrated optimizers, see here.
**Note on undefined keys in `run_args` (config.yaml):** Not all configurations are explicitly defined in this template. Any undefined key in the YAML file is mapped to the internal default settings of NePS. This ensures that your experiments can run even if certain parameters are omitted.
## Different Use Cases

### Customizing NePS optimizer

Customize an internal NePS optimizer by specifying its parameters directly under the key `searcher` in the `config.yaml` file.
**Note:** For `searcher_kwargs` of `neps.run`, the optimizer arguments passed via the YAML file and those passed directly via `neps.run` will be merged. In this special case, if the same argument is referenced in both places, the value from `searcher_kwargs` is prioritized and set for this argument.
```yaml
# Customizing NePS Searcher
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  optimizer:
    choices: [adam, sgd, adamw]
  epochs: 50

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

searcher:
  strategy: bayesian_optimization  # Key for a NePS searcher
  name: "my_bayesian"  # Optional; changes the searcher name for better recognition
  # Specific arguments depending on the searcher
  initial_design_size: 7
```
For detailed information about the available optimizers and their parameters, please visit the optimizer page.
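As a sketch of the merge behavior described in the note above, assuming `searcher_kwargs` are collected as extra keyword arguments of `neps.run` (the exact form may differ between NePS versions):

```python
import neps

# config.yaml sets initial_design_size: 7 under searcher; the value passed
# here takes priority, while strategy and name are still read from the YAML.
neps.run(
    run_args="path/to/your/config.yaml",
    initial_design_size=5,  # assumed to be forwarded to the searcher
)
```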
### Testing Multiple Optimizer Configurations

Simplify experiments with multiple optimizer settings by outsourcing the optimizer configuration to a separate YAML file.
```yaml
# Optimizer settings from YAML configuration
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  optimizer:
    choices: [adam, sgd, adamw]
  epochs: 50

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

searcher: path/to/your/searcher_setup.yaml
```
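The referenced file contains the same keys that would otherwise sit under `searcher` in `config.yaml`. A hypothetical `searcher_setup.yaml`, mirroring the customization example above:

```yaml
# searcher_setup.yaml (hypothetical example)
strategy: bayesian_optimization
name: "my_bayesian"
initial_design_size: 7
```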
### Handling Large Search Spaces

Manage large search spaces by outsourcing the pipeline space configuration to a separate YAML file, which also makes it easier to keep track of your experiments.
```yaml
# Pipeline space settings from separate YAML
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space: path/to/your/pipeline_space.yaml

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
```yaml
# Pipeline_space including priors and fidelity
learning_rate:
  lower: 1e-5
  upper: 1e-1
  log: true  # Log scale for learning rate
  default: 1e-2
  default_confidence: "medium"
epochs:
  lower: 5
  upper: 20
  is_fidelity: true
dropout_rate:
  lower: 0.1
  upper: 0.5
  default: 0.2
  default_confidence: "high"
optimizer:
  choices: [adam, sgd, adamw]
  default: adam
  # If default_confidence is not defined, it gets its default 'low'
batch_size: 64
```
```python
def example_pipeline(learning_rate, optimizer, epochs, batch_size, dropout_rate):
    model = initialize_model(dropout_rate)
    training_loss = train_model(model, optimizer, learning_rate, epochs, batch_size)
    evaluation_loss = evaluate_model(model)
    return {"loss": evaluation_loss, "training_loss": training_loss}
```
### Using Architecture Search Spaces

Since the option for defining the search space via YAML is limited to HPO, grammar-based search spaces or architecture search spaces must be loaded via a dictionary, which is then referenced in the `config.yaml`.
```yaml
# Loading pipeline space from a python dict
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space:
  path: path/to/your/search_space.py  # Path to the dict file
  name: pipeline_space  # Name of the dict instance

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget
```
```python
from torch import nn

import neps
from neps.search_spaces.architecture import primitives as ops
from neps.search_spaces.architecture import topologies as topos
from neps.search_spaces.architecture.primitives import AbstractPrimitive


class DownSampleBlock(AbstractPrimitive):
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__(locals())
        self.conv_a = ReLUConvBN(
            in_channels, out_channels, kernel_size=3, stride=2, padding=1
        )
        self.conv_b = ReLUConvBN(
            out_channels, out_channels, kernel_size=3, stride=1, padding=1
        )
        self.downsample = nn.Sequential(
            nn.AvgPool2d(kernel_size=2, stride=2, padding=0),
            nn.Conv2d(
                in_channels, out_channels, kernel_size=1, stride=1, padding=0, bias=False
            ),
        )

    def forward(self, inputs):
        basicblock = self.conv_a(inputs)
        basicblock = self.conv_b(basicblock)
        residual = self.downsample(inputs)
        return residual + basicblock


class ReLUConvBN(AbstractPrimitive):
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding):
        super().__init__(locals())
        self.kernel_size = kernel_size
        self.op = nn.Sequential(
            nn.ReLU(inplace=False),
            nn.Conv2d(
                in_channels,
                out_channels,
                kernel_size,
                stride=stride,
                padding=padding,
                dilation=1,
                bias=False,
            ),
            nn.BatchNorm2d(out_channels, affine=True, track_running_stats=True),
        )

    def forward(self, x):
        return self.op(x)


class AvgPool(AbstractPrimitive):
    def __init__(self, **kwargs):
        super().__init__(kwargs)
        self.op = nn.AvgPool2d(3, stride=1, padding=1, count_include_pad=False)

    def forward(self, x):
        return self.op(x)


primitives = {
    "Sequential15": topos.get_sequential_n_edge(15),
    "DenseCell": topos.get_dense_n_node_dag(4),
    "down": {"op": DownSampleBlock},
    "avg_pool": {"op": AvgPool},
    "id": {"op": ops.Identity},
    "conv3x3": {"op": ReLUConvBN, "kernel_size": 3, "stride": 1, "padding": 1},
    "conv1x1": {"op": ReLUConvBN, "kernel_size": 1, "stride": 1, "padding": 0},
}

structure = {
    "S": ["Sequential15(C, C, C, C, C, down, C, C, C, C, C, down, C, C, C, C, C)"],
    "C": ["DenseCell(OPS, OPS, OPS, OPS, OPS, OPS)"],
    "OPS": ["id", "conv3x3", "conv1x1", "avg_pool"],
}


def set_recursive_attribute(op_name, predecessor_values):
    in_channels = 16 if predecessor_values is None else predecessor_values["out_channels"]
    out_channels = in_channels * 2 if op_name == "DownSampleBlock" else in_channels
    return dict(in_channels=in_channels, out_channels=out_channels)


pipeline_space = dict(
    architecture=neps.Architecture(
        set_recursive_attribute=set_recursive_attribute,
        structure=structure,
        primitives=primitives,
    ),
    optimizer=neps.Categorical(choices=["sgd", "adam"]),
    learning_rate=neps.Float(lower=10e-7, upper=10e-3, log=True),
)
```
```python
from torch import nn


def example_pipeline(architecture, optimizer, learning_rate):
    in_channels = 3
    base_channels = 16
    n_classes = 10
    out_channels_factor = 4

    # E.g., in shape = (N, 3, 32, 32) => out shape = (N, 10)
    model = architecture.to_pytorch()
    model = nn.Sequential(
        nn.Conv2d(in_channels, base_channels, 3, padding=1, bias=False),
        nn.BatchNorm2d(base_channels),
        model,
        nn.BatchNorm2d(base_channels * out_channels_factor),
        nn.ReLU(inplace=True),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(base_channels * out_channels_factor, n_classes),
    )
    training_loss = train_model(model, optimizer, learning_rate)
    evaluation_loss = evaluate_model(model)
    return {"loss": evaluation_loss, "training_loss": training_loss}
```
### Integrating Custom Optimizers

If you want to write your own optimizer class as a subclass of the base optimizer, you can load your custom optimizer class and define its arguments in `config.yaml`.

**Note:** You can still overwrite arguments via `searcher_kwargs` of `neps.run`, just like for the internal searchers.
```yaml
# Loading Optimizer Class
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  optimizer:
    choices: [adam, sgd, adamw]
  epochs: 50

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

searcher:
  path: path/to/your/searcher.py  # Path to the class
  name: CustomOptimizer  # Class name within the file
  # Specific arguments depending on your searcher
  initial_design_size: 7
```
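A hypothetical skeleton of such a class; the base-class import path, constructor signature, and required methods are assumptions and depend on your NePS version:

```python
# searcher.py -- hypothetical sketch; the base-class path and the methods it
# requires are assumptions and may differ between NePS versions.
from neps.optimizers.base_optimizer import BaseOptimizer


class CustomOptimizer(BaseOptimizer):
    def __init__(self, pipeline_space, initial_design_size=7, **kwargs):
        super().__init__(pipeline_space=pipeline_space, **kwargs)
        self.initial_design_size = initial_design_size
        # Implement the sampling/result-loading methods required by the base class.
```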
### Adding Custom Hooks to Your Configuration

Define hooks in your YAML configuration to extend the functionality of your experiment.
```yaml
# Hooks
run_pipeline:
  path: path/to/your/run_pipeline.py  # Path to the function file
  name: example_pipeline  # Function name within the file

pipeline_space:
  learning_rate:
    lower: 1e-5
    upper: 1e-1
    log: true  # Log scale for learning rate
  epochs:
    lower: 5
    upper: 20
    is_fidelity: true
  optimizer:
    choices: [adam, sgd, adamw]
  batch_size: 64

root_directory: path/to/results  # Directory for result storage
max_evaluations_total: 20  # Budget

pre_load_hooks:
  hook1: path/to/your/hooks.py  # (function_name: path to the function's file)
  hook2: path/to/your/hooks.py  # Different function name 'hook2' from the same file source
```
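The referenced file simply defines the hook functions by name. A hypothetical `hooks.py`; the hook signature (assumed here to receive and return the loaded optimizer) is an assumption and may differ between NePS versions:

```python
# hooks.py -- hypothetical sketch; the hook signature is an assumption.
def hook1(sampler):
    # Inspect or adjust the optimizer before the run starts.
    print(f"Pre-load hook called on: {sampler.__class__.__name__}")
    return sampler


def hook2(sampler):
    # Second hook from the same file, as referenced in the YAML above.
    return sampler
```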
## CLI Usage

This section provides a brief overview of the primary commands available in the NePS CLI. For additional command options, you can refer to the help documentation provided by each command using `--help`.
### `init` Command

Generates a default `run_args` YAML configuration file, providing a template that you can customize for your experiments.

Options:

- `--config-path <path>`: Optional. Specify a custom path for generating the configuration file. Defaults to `run_config.yaml` in the current working directory.
- `--template [basic|complete]`: Optional. Choose between a basic or complete template. The basic template includes only required settings, while the complete template includes all NePS configurations.
- `--state-machine`: Optional. If set, creates a NePS state, which requires an existing `run_config.yaml`.
Example Usage:
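A hypothetical invocation, assuming the CLI entry point is named `neps`:

```bash
neps init --template basic --config-path run_config.yaml
```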
### `run` Command

Executes the optimization based on the provided configuration. This command serves as a CLI wrapper around `neps.run`, effectively mapping each CLI argument to a parameter in `neps.run`. It offers a flexible interface that allows you to override the existing settings specified in the YAML configuration file, facilitating dynamic adjustments for managing your experiments.

Options:

- `--run-args <path>`: Path to the YAML configuration file containing the run arguments.
- `--run-pipeline <path/to/module:function_name>`: Optional. Specify the path to the Python module and the function to use for running the pipeline. Overrides any settings in the YAML file.
- `--pipeline-space <path/to/yaml>`: Path to the YAML file defining the search space for the optimization.
- `--root-directory <path>`: Optional. Directory for saving progress and synchronizing multiple processes. Defaults to the `root_directory` from `run_config.yaml` if not provided.
- `--overwrite-working-directory`: Optional. If set, deletes the working directory at the start of the run.
- `--development-stage-id <id>`: Optional. Identifier for the current development stage, useful for multi-stage projects.
- `--task-id <id>`: Optional. Identifier for the current task, useful for managing projects with multiple tasks.
- `--post-run-summary/--no-post-run-summary`: Optional. Provides a summary of the run after execution. Enabled by default.
- `--max-evaluations-total <int>`: Optional. Specifies the total number of evaluations to run.
- `--max-evaluations-per-run <int>`: Optional. Number of evaluations to run per call.
- `--continue-until-max-evaluation-completed`: Optional. If set, ensures the run continues until `max-evaluations-total` has been reached.
- `--max-cost-total <float>`: Optional. Specifies a cost threshold. No new evaluations will start if this cost is exceeded.
- `--ignore-errors`: Optional. If set, errors during the optimization will be ignored.
- `--loss-value-on-error <float>`: Optional. Specifies the loss value to assume in case of an error.
- `--cost-value-on-error <float>`: Optional. Specifies the cost value to assume in case of an error.
- `--searcher <key>`: Specifies the searcher algorithm for the optimization.
- `--searcher-kwargs <key=value>`: Optional. Additional keyword arguments for the searcher.
Example Usage:
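A hypothetical invocation, assuming the CLI entry point is named `neps`; the option shown overrides the corresponding value from the YAML file:

```bash
neps run --run-args run_config.yaml --max-evaluations-total 50
```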
### `status` Command

Reports the status of an existing NePS run, summarizing the progress and results stored in the run's root directory.
Example Usage:
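A hypothetical invocation, assuming the CLI entry point is named `neps` and that the run's root directory is passed via `--root-directory` (the exact option name is an assumption):

```bash
neps status --root-directory path/to/results
```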