arlbench package¶
Subpackages¶
- arlbench.autorl package
- arlbench.core package
  - Subpackages
    - arlbench.core.algorithms package
    - arlbench.core.environments package
      - Submodules
        - arlbench.core.environments.autorl_env module
        - arlbench.core.environments.brax_env module
        - arlbench.core.environments.envpool_env module
        - arlbench.core.environments.gymnasium_env module
        - arlbench.core.environments.gymnax_env module
        - arlbench.core.environments.make_env module
        - arlbench.core.environments.xland_env module
      - Module contents
    - arlbench.core.wrappers package
  - Submodules
    - arlbench.core.running_statistics module
  - Module contents
- arlbench.utils package
Submodules¶
arlbench.arlbench module¶
This module provides a function to run ARLBench using a given config.
Module contents¶
Top-level package for ARLBench.
- class arlbench.AutoRLEnv(config=None)[source]¶
Bases:
Env
Automated Reinforcement Learning (gymnasium-like) Environment.
With each reset, the algorithm state is (re-)initialized. If a checkpoint path is passed to reset, the agent state is initialized with the checkpointed state.
In each step, one iteration of training is performed with the current hyperparameter configuration (= action).
- ALGORITHMS = {'dqn': <class 'arlbench.core.algorithms.dqn.dqn.DQN'>, 'ppo': <class 'arlbench.core.algorithms.ppo.ppo.PPO'>, 'sac': <class 'arlbench.core.algorithms.sac.sac.SAC'>}¶
- property action_space: Space¶
Returns the hyperparameter configuration space as a gymnasium space.
- Returns:
Hyperparameter configuration space.
- Return type:
gymnasium.spaces.Space
- property checkpoints: list[str]¶
Returns a list of created checkpoints for this AutoRL environment.
- Returns:
List of checkpoint paths.
- Return type:
list[str]
- property config: dict¶
Returns the AutoRL configuration.
- Returns:
AutoRL configuration.
- Return type:
dict
- property config_space: ConfigurationSpace¶
Returns the hyperparameter configuration space as a ConfigSpace ConfigurationSpace.
- Returns:
Hyperparameter configuration space.
- Return type:
ConfigurationSpace
- eval(num_eval_episodes)[source]¶
Evaluates the algorithm using its current training state.
- Parameters:
num_eval_episodes (int) – Number of evaluation episodes to run.
- Returns:
Array of evaluation returns, one per episode.
- Return type:
np.ndarray
- property hpo_config: Configuration¶
Returns the current hyperparameter configuration stored in the AutoRL environment.
- Returns:
Hyperparameter configuration.
- Return type:
Configuration
- property objectives: list[str]¶
Returns the configured objectives.
- Returns:
List of objectives.
- Return type:
list[str]
- property observation_space: Space¶
Returns the gymnasium space of state features (observations).
- Returns:
Gymnasium space.
- Return type:
gymnasium.spaces.Space
- reset()[source]¶
Resets the AutoRL environment and current algorithm state.
- Returns:
Empty observation and state information.
- Return type:
tuple[ObservationT, InfoT]
- step(action, checkpoint_path=None, n_total_timesteps=None, n_eval_steps=None, n_eval_episodes=None, seed=None)[source]¶
Performs one iteration of RL training.
- Parameters:
action (Configuration | dict) – Hyperparameter configuration to use for training.
checkpoint_path (str | None, optional) – Path to a checkpoint to restore the training state from. Defaults to None.
n_total_timesteps (int | None, optional) – Number of total training steps. Defaults to None.
n_eval_steps (int | None, optional) – Number of evaluations during training. Defaults to None.
n_eval_episodes (int | None, optional) – Number of episodes to run per evaluation during training. Defaults to None.
seed (int | None, optional) – Random seed. Defaults to None. If None, the seed of the AutoRL environment is used.
- Raises:
ValueError – Raised if step() is called before reset().
- Returns:
State information, objectives, terminated, truncated, additional information.
- Return type:
tuple[ObservationT, ObjectivesT, bool, bool, InfoT]