# The run function
## Introduction
The `run_pipeline=` function is crucial for NePS. It encapsulates the objective function to be minimized, which could range from a simple equation to a full training and evaluation pipeline for a neural network.

This function receives a configuration sampled from the parameters defined in the search space and executes the same set of instructions or equations with that configuration in order to minimize the objective function.

Below we show some basic usages and the functionality this function requires for a successful implementation.
## Types of Returns
### 1. Single Value
Assume the `pipeline_space=` was already created (have a look at the pipeline space documentation for more details).
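For illustration only, such a space could look like the following (a minimal sketch using the NePS parameter classes; the bounds are arbitrary assumptions chosen to match the elements used below):

```python
import neps

# Hypothetical bounds; only the parameter names match the examples below
pipeline_space = dict(
    element_1=neps.FloatParameter(lower=0.0, upper=1.0),
    element_2=neps.FloatParameter(lower=0.0, upper=1.0),
    element_3=neps.IntegerParameter(lower=1, upper=10),
)
```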
A `run_pipeline=` function with the objective of minimizing the loss will resemble the following:
```python
def run_pipeline(
    **config,  # The hyperparameters to be used in the pipeline
):
    element_1 = config["element_1"]
    element_2 = config["element_2"]
    element_3 = config["element_3"]

    loss = element_1 - element_2 + element_3
    return loss
```
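Such a function can then be handed directly to `neps.run` (a minimal sketch; the evaluation budget and directory name are illustrative assumptions):

```python
import logging

import neps

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    neps.run(
        run_pipeline=run_pipeline,
        pipeline_space=pipeline_space,  # Assuming the pipeline space is defined
        root_directory="results/single_value",  # illustrative directory name
        max_evaluations_total=15,  # illustrative budget
    )
```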
### 2. Dictionary
In this section, we outline the special keys that are expected to be returned when the `run_pipeline=` function returns a dictionary.
#### Loss
One crucial return value is the `loss`. This metric serves as a fundamental indicator for the optimizer. One option is to return a dictionary with `loss` as a key, along with other user-chosen metrics.
> **Note:** The loss can be any value that is to be minimized by the objective function.
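For example, a metric that should be maximized, such as an accuracy, can simply be returned negated (a small sketch; the "accuracy" below is an arbitrary made-up quantity, not part of the examples in this section):

```python
def run_pipeline(
    **config,  # The hyperparameters to be used in the pipeline
):
    # Hypothetical quantity we would like to maximize
    accuracy = 1.0 - abs(config["element_1"] - config["element_2"])

    # NePS minimizes the loss, so return the negated metric
    return {"loss": -accuracy}
```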
```python
def run_pipeline(
    **config,  # The hyperparameters to be used in the pipeline
):
    element_1 = config["element_1"]
    element_2 = config["element_2"]
    element_3 = config["element_3"]

    loss = element_1 - element_2 + element_3
    reverse_loss = -loss

    return {
        "loss": loss,
        "info_dict": {
            "reverse_loss": reverse_loss,
            # ... any other user-chosen metrics
        },
    }
```
#### Cost
Along with the `loss`, the `run_pipeline=` function can optionally return a `cost`. This is required whenever the `max_cost_total` parameter is used in the `neps.run` function.
> **Note:** `max_cost_total` sums the cost from all returned configuration results and checks whether the maximum allowed cost has been reached (if so, the search will come to an end).
```python
import neps
import logging


def run_pipeline(
    **config,  # The hyperparameters to be used in the pipeline
):
    element_1 = config["element_1"]
    element_2 = config["element_2"]
    element_3 = config["element_3"]

    loss = element_1 - element_2 + element_3
    cost = 2

    return {
        "loss": loss,
        "cost": cost,
    }


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    neps.run(
        run_pipeline=run_pipeline,
        pipeline_space=pipeline_space,  # Assuming the pipeline space is defined
        root_directory="results/bo",
        max_cost_total=10,
        searcher="bayesian_optimization",
    )
```
Each evaluation carries a cost of 2. Hence, in this example, the Bayesian optimization search is set to perform 5 evaluations.
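The cost does not have to be a constant. A common choice (a sketch, not a requirement of NePS) is to report the measured runtime of each evaluation, so that `max_cost_total` effectively limits the total wall-clock time spent:

```python
import time


def run_pipeline(
    **config,  # The hyperparameters to be used in the pipeline
):
    start = time.time()

    loss = config["element_1"] - config["element_2"] + config["element_3"]

    return {
        "loss": loss,
        "cost": time.time() - start,  # seconds spent on this evaluation
    }
```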
## Arguments for Convenience
NePS also provides the `pipeline_directory` and the `previous_pipeline_directory` as arguments to the `run_pipeline=` function for user convenience.

Consider an example run with a multi-fidelity searcher: some checkpointing would be advantageous so that a configuration does not have to be trained from scratch when it qualifies for higher fidelity brackets.
```python
import torch


def run_pipeline(
    pipeline_directory,  # The directory where the config is saved
    previous_pipeline_directory,  # The directory of the immediate lower fidelity config
    **config,  # The hyperparameters to be used in the pipeline
):
    # Assume element_3 is our fidelity element
    element_1 = config["element_1"]
    element_2 = config["element_2"]
    element_3 = config["element_3"]

    # Load any saved checkpoints
    checkpoint_name = "checkpoint.pth"
    start_element_3 = 0

    if previous_pipeline_directory is not None:
        # Read in the state of the model after the previous fidelity rung
        checkpoint = torch.load(previous_pipeline_directory / checkpoint_name)
        prev_element_3 = checkpoint["element_3"]
    else:
        prev_element_3 = 0

    start_element_3 += prev_element_3

    loss = 0
    for i in range(start_element_3, element_3):
        loss += element_1 - element_2

    torch.save(
        {
            "element_3": element_3,
        },
        pipeline_directory / checkpoint_name,
    )

    return loss
```
This allows navigating to the previously trained models and continuing their training at higher fidelities without repeating the entire training process.
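For completeness, a run driving this function with a multi-fidelity searcher could look roughly like the following (a hedged sketch; the searcher name, budget, and fidelity bounds are illustrative assumptions, and `element_3` has to be declared as the fidelity in the search space):

```python
import logging

import neps

# Illustrative search space; element_3 is marked as the fidelity
pipeline_space = dict(
    element_1=neps.FloatParameter(lower=0.0, upper=1.0),
    element_2=neps.FloatParameter(lower=0.0, upper=1.0),
    element_3=neps.IntegerParameter(lower=1, upper=10, is_fidelity=True),
)

if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    neps.run(
        run_pipeline=run_pipeline,
        pipeline_space=pipeline_space,
        root_directory="results/multifidelity",  # illustrative directory name
        max_evaluations_total=20,  # illustrative budget
        searcher="hyperband",  # assumed multi-fidelity searcher name
    )
```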