arlbench.autorl.objectives
This module contains the objectives for the AutoRL environment.
Functions

| | Compute discounted rewards. |

Classes

| DiscountedReward | Discounted reward objective for the AutoRL environment. |
| DiscountedTrainReward | Discounted reward objective for the AutoRL environment. |
| Emissions | Emissions objective for the AutoRL environment. |
| Objective | An abstract optimization objective for the AutoRL environment. |
| Reward | Reward objective for the AutoRL environment. |
| Runtime | Runtime objective for the AutoRL environment. |
| TrainReward | Reward objective for the AutoRL environment. |
- class arlbench.autorl.objectives.DiscountedReward(*args, **kwargs)
Bases: Objective
Discounted reward objective for the AutoRL environment. It measures the last discounted evaluation rewards.
- static __call__(train_func, objectives, optimize_objectives, gamma=0.99, default_arg='mean')
Wraps the training function with the reward mean calculation.
- Return type:
Callable[[DQNRunnerState | PPORunnerState | SACRunnerState, PrioritisedTrajectoryBufferState, int | None, int | None, int | None], tuple[DQNState, DQNTrainingResult] | tuple[PPOState, PPOTrainingResult] | tuple[SACState, SACTrainingResult]]
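The gamma=0.99 default controls the discounting. As an illustration only (not the module's actual helper function), a discounted return over a reward sequence can be computed like this; last_eval_rewards is a made-up placeholder:

    import numpy as np

    def discounted_return(rewards, gamma=0.99):
        # G = r_0 + gamma * r_1 + gamma^2 * r_2 + ...
        discounts = gamma ** np.arange(len(rewards))
        return float(np.sum(discounts * np.asarray(rewards)))

    # Placeholder per-episode reward sequences from the last evaluation:
    last_eval_rewards = [[1.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
    returns = [discounted_return(ep, gamma=0.99) for ep in last_eval_rewards]
    objective_value = float(np.mean(returns))  # default_arg='mean' aggregates by the mean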
- class arlbench.autorl.objectives.DiscountedTrainReward(*args, **kwargs)
Bases: Objective
Discounted reward objective for the AutoRL environment. It measures the mean of the last discounted training rewards.
- static __call__(train_func, objectives, optimize_objectives, gamma=0.99, default_arg='mean')
Wraps the training function with the reward mean calculation.
- Return type:
Callable[[DQNRunnerState | PPORunnerState | SACRunnerState, PrioritisedTrajectoryBufferState, int | None, int | None, int | None], tuple[DQNState, DQNTrainingResult] | tuple[PPOState, PPOTrainingResult] | tuple[SACState, SACTrainingResult]]
- class arlbench.autorl.objectives.Emissions(*args, **kwargs)
Bases: Objective
Emissions objective for the AutoRL environment. It measures the carbon emissions produced during training using CodeCarbon.
- static __call__(train_func, objectives, optimize_objectives)
Wraps the training function with the emissions calculation.
- Return type:
Callable[[DQNRunnerState | PPORunnerState | SACRunnerState, PrioritisedTrajectoryBufferState, int | None, int | None, int | None], tuple[DQNState, DQNTrainingResult] | tuple[PPOState, PPOTrainingResult] | tuple[SACState, SACTrainingResult]]
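A rough sketch of the underlying pattern, using CodeCarbon's public EmissionsTracker API around a placeholder training call (this is not the library's actual wrapper code):

    from codecarbon import EmissionsTracker

    def train():
        ...  # placeholder for the wrapped training function

    objectives = {}
    tracker = EmissionsTracker()
    tracker.start()
    result = train()
    objectives["emissions"] = tracker.stop()  # emissions in kg CO2-equivalent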
- class arlbench.autorl.objectives.Objective(*args, **kwargs)
Bases: ABC
An abstract optimization objective for the AutoRL environment.
It can be wrapped around the training function to calculate the objective. We do this by overriding the __new__() function, which allows the class to imitate the behaviour of a plain function while keeping the advantages of a static class (a minimal sketch of this pattern follows the class entry below).
- abstract static __call__(train_func, objectives, optimize_objectives)
Wraps the training function with the objective calculation.
- Parameters:
train_func (TrainFunc) – Training function to wrap.
objectives (dict) – Dictionary to store the objective value in.
optimize_objectives (str) – Whether to minimize or maximize the objective.
- Returns:
Wrapped training function.
- Return type:
TrainFunc
- __lt__(other)
Implements “less-than” comparison between two objectives. Used for sorting based on objective rank.
- Parameters:
other (Objective) – Other Objective to compare to.
- Returns:
Whether this Objective is less than the other Objective.
- Return type:
bool
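To make the mechanism concrete, here is a minimal, self-contained sketch of the pattern the docstring describes (illustrative names only, not the library's code): calling the class returns the wrapped training function because __new__ forwards its arguments to the static __call__.

    class FunctionLikeObjective:
        def __new__(cls, *args, **kwargs):
            # No instance is created; the call is forwarded to __call__,
            # so the class can be used like a plain function.
            return cls.__call__(*args, **kwargs)

        @staticmethod
        def __call__(train_func, objectives, optimize_objectives):
            # optimize_objectives (min/max handling) is ignored in this sketch.
            def wrapped(*args, **kwargs):
                result, rewards = train_func(*args, **kwargs)
                objectives["reward_mean"] = sum(rewards) / len(rewards)
                return result, rewards
            return wrapped

    objectives = {}
    dummy_train = lambda: ("final_state", [1.0, 2.0, 3.0])
    wrapped_train = FunctionLikeObjective(dummy_train, objectives, "maximize")
    wrapped_train()
    print(objectives)  # {'reward_mean': 2.0}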
- class arlbench.autorl.objectives.Reward(*args, **kwargs)
Bases: Objective
Reward objective for the AutoRL environment. It applies an aggregation function to the last evaluation rewards.
- static __call__(train_func, objectives, optimize_objectives, default_arg='mean')
Wraps the training function with the reward mean calculation.
- Return type:
Callable[[DQNRunnerState | PPORunnerState | SACRunnerState, PrioritisedTrajectoryBufferState, int | None, int | None, int | None], tuple[DQNState, DQNTrainingResult] | tuple[PPOState, PPOTrainingResult] | tuple[SACState, SACTrainingResult]]
- class arlbench.autorl.objectives.Runtime(*args, **kwargs)
Bases: Objective
Runtime objective for the AutoRL environment. It measures the total training runtime.
- static __call__(train_func, objectives, optimize_objectives)
Wraps the training function with the runtime calculation.
- Return type:
Callable[[DQNRunnerState | PPORunnerState | SACRunnerState, PrioritisedTrajectoryBufferState, int | None, int | None, int | None], tuple[DQNState, DQNTrainingResult] | tuple[PPOState, PPOTrainingResult] | tuple[SACState, SACTrainingResult]]
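A minimal sketch of what such a runtime wrapper could look like (illustrative only; the actual objective stores its value in the objectives dictionary passed to __call__):

    import time

    def wrap_with_runtime(train_func, objectives):
        # Illustrative helper, not the library's implementation.
        def wrapped(*args, **kwargs):
            start = time.time()
            result = train_func(*args, **kwargs)
            objectives["runtime"] = time.time() - start  # wall-clock training time in seconds
            return result
        return wrapped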
- class arlbench.autorl.objectives.TrainReward(*args, **kwargs)
Bases: Objective
Reward objective for the AutoRL environment. It measures the mean of the last training rewards.
- static __call__(train_func, objectives, optimize_objectives, default_arg='mean')
Wraps the training function with the reward mean calculation.
- Return type:
Callable[[DQNRunnerState | PPORunnerState | SACRunnerState, PrioritisedTrajectoryBufferState, int | None, int | None, int | None], tuple[DQNState, DQNTrainingResult] | tuple[PPOState, PPOTrainingResult] | tuple[SACState, SACTrainingResult]]