arlbench.autorl¶
- class arlbench.autorl.AutoRLEnv(config=None)[source]¶
- Bases: Env

  Automated Reinforcement Learning (gymnasium-like) environment. With each reset, the algorithm state is (re-)initialized. If a checkpoint path is passed to reset, the agent state is initialized with the checkpointed state. In each step, one iteration of training is performed with the current hyperparameter configuration (= action).

  - property action_space: Space¶
- Returns the hyperparameter configuration space as a gymnasium space. - Returns:
- Hyperparameter configuration space. 
- Return type:
- gymnasium.spaces.Space 
 
 - property checkpoints: list[str]¶
- Returns a list of created checkpoints for this AutoRL environment. - Returns:
- List of checkpoint paths. 
- Return type:
- list[str] 
 
 - property config: dict¶
- Returns the AutoRL configuration. - Returns:
- AutoRL configuration. 
- Return type:
- dict 
 
 - property config_space: ConfigurationSpace¶
- Returns the hyperparameter configuration space as a ConfigurationSpace. - Returns:
- Hyperparameter configuration space. 
- Return type:
- ConfigurationSpace 
 
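The configuration space returned by config_space can be sampled to obtain an action for step. Below is a minimal stand-in sketch of that sampling step, using plain dict ranges instead of the real ConfigurationSpace; the hyperparameter names and ranges are hypothetical:

```python
import random

# Hypothetical stand-in for a ConfigurationSpace: each hyperparameter
# name maps to a (low, high) range of float values.
space = {"learning_rate": (1e-5, 1e-2), "gamma": (0.9, 0.999)}

def sample_configuration(space, rng):
    """Draw one value per hyperparameter, uniformly from its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}

rng = random.Random(0)
action = sample_configuration(space, rng)
```

With the real ConfigSpace library, ConfigurationSpace.sample_configuration() plays the role of this helper.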
 - eval(num_eval_episodes)[source]¶
- Evaluates the algorithm using its current training state. - Parameters:
- num_eval_episodes (int) – Number of evaluation episodes to run. 
- Returns:
- Array of evaluation returns, one per episode. 
- Return type:
- np.ndarray 
 
 - get_algorithm_init_kwargs(init_rng)[source]¶
- Returns the algorithm initialization parameters. - Parameters:
- init_rng – Random number generator state used to initialize the algorithm. 
- Returns:
- Dictionary of algorithm initialization parameters. 
- Return type:
- Dict 
 
 - property hpo_config: Configuration¶
- Returns the current hyperparameter configuration stored in the AutoRL environment. - Returns:
- Hyperparameter configuration. 
- Return type:
- Configuration 
 
 - property objectives: list[str]¶
- Returns configured objectives. - Returns:
- List of objectives. 
- Return type:
- list[str] 
 
 - property observation_space: Space¶
- Returns a gymnasium space of state features (observations). - Returns:
- Gymnasium space. 
- Return type:
- gymnasium.spaces.Space 
 
 - reset()[source]¶
- Resets the AutoRL environment and current algorithm state. - Returns:
- Empty observation and state information. 
- Return type:
- tuple[ObservationT, InfoT] 
 
 - step(action, checkpoint_path=None, n_total_timesteps=None, n_eval_steps=None, n_eval_episodes=None, seed=None)[source]¶
- Performs one iteration of RL training. - Parameters:
- action (Configuration | dict) – Hyperparameter configuration to use for training. 
- checkpoint_path (str | None, optional) – Path to a checkpoint from which the algorithm state is restored before training. Defaults to None. 
- n_total_timesteps (int | None, optional) – Number of total training steps. Defaults to None. 
- n_eval_steps (int | None, optional) – Number of evaluations during training. Defaults to None. 
- n_eval_episodes (int | None, optional) – Number of episodes to run per evaluation during training. Defaults to None. 
- seed (int | None, optional) – Random seed. Defaults to None. If None, seed of the AutoRL environment is used. 
 
- Raises:
- ValueError – Raised if step() is called before reset(). 
- Returns:
- State information, objectives, terminated, truncated, additional information. 
- Return type:
- tuple[ObservationT, ObjectivesT, bool, bool, InfoT] 
 
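The reset/step contract described above can be illustrated with a toy stand-in. StubAutoRLEnv below is a hypothetical sketch, not the arlbench implementation: reset returns empty observation and info, and step returns the five-tuple of state information, objectives, terminated, truncated, and additional information, raising ValueError when called before reset:

```python
class StubAutoRLEnv:
    """Toy stand-in illustrating the AutoRLEnv reset/step contract."""

    def __init__(self):
        self._initialized = False

    def reset(self):
        # (Re-)initialize the algorithm state; return empty observation and info.
        self._initialized = True
        return {}, {}

    def step(self, action):
        # One training iteration with the given hyperparameter configuration.
        if not self._initialized:
            raise ValueError("step() called before reset()")
        objectives = {"reward_mean": 0.0}  # placeholder objective values
        terminated, truncated = False, False
        return {}, objectives, terminated, truncated, {}

env = StubAutoRLEnv()
obs, info = env.reset()
obs, objectives, terminated, truncated, info = env.step({"learning_rate": 1e-3})
```

In actual use, the action passed to step would be a Configuration sampled from config_space rather than a hand-written dict.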
 
Modules
- Automated Reinforcement Learning Environment.
- Contains all checkpointing-related methods for the AutoRL environment.
- Contains the objectives for the AutoRL environment.
- State features for the AutoRL environment.