Objectives & Features in ARLBench

ARLBench lets you configure the objectives to use for your AutoRL methods. You select them as a list of keywords in the configuration of the AutoRL Environment, for example:

python arlbench.py 'autorl.objectives=["reward_mean", "discounted_train_reward_mean_gamma_0.9"]'

The following objectives are available at the moment:

reward_mean: the mean evaluation reward across a number of evaluation episodes
discounted_reward_mean: the discounted mean evaluation reward across a number of evaluation episodes. The default gamma here is 0.99, but you can specify your own by appending “_gamma_<value>” to the objective name (e.g. discounted_reward_mean_gamma_0.8)
reward_std: the standard deviation of the evaluation rewards across a number of evaluation episodes
train_reward_mean: the mean training reward across a number of training episodes
discounted_train_reward_mean: the discounted mean training reward across a number of training episodes. The default gamma here is 0.99, but you can specify your own by appending “_gamma_<value>” to the objective name (e.g. discounted_train_reward_mean_gamma_0.8)
train_reward_std: the standard deviation of the training rewards across a number of training episodes
runtime: the runtime of the training process
emissions: the CO2 emissions of the training process, tracked using CodeCarbon.
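
To make the "_gamma_<value>" suffix of the discounted objectives concrete, here is a minimal sketch of how such a keyword could be split into a base name and a discount factor, and how a discounted mean reward could then be computed. The helpers `parse_objective`, `discounted_return` and `discounted_reward_mean` are hypothetical and written for illustration only; they are not ARLBench's API. The sketch also assumes that rewards are discounted per step from the start of each episode, which ARLBench's own implementation may handle differently.

```python
from typing import Optional, Sequence, Tuple

import numpy as np

DEFAULT_GAMMA = 0.99  # default discount factor stated in the objective list


def parse_objective(name: str) -> Tuple[str, Optional[float]]:
    """Split an objective keyword into (base name, gamma).

    Hypothetical helper: gamma is None for objectives that are not discounted.
    """
    base, sep, value = name.partition("_gamma_")
    if sep:
        # An explicit suffix such as "_gamma_0.8" overrides the default.
        return base, float(value)
    # No suffix: discounted objectives fall back to the default gamma.
    return base, DEFAULT_GAMMA if base.startswith("discounted_") else None


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over one episode (assumed discounting scheme)."""
    rewards = np.asarray(rewards, dtype=float)
    discounts = gamma ** np.arange(len(rewards))
    return float(np.sum(discounts * rewards))


def discounted_reward_mean(episodes: Sequence[Sequence[float]], gamma: float) -> float:
    """Mean of the discounted returns across a number of episodes."""
    return float(np.mean([discounted_return(ep, gamma) for ep in episodes]))


if __name__ == "__main__":
    base, gamma = parse_objective("discounted_reward_mean_gamma_0.8")
    episodes = [[1.0, 1.0, 1.0], [0.0, 2.0]]  # made-up per-step rewards
    print(base, gamma, discounted_reward_mean(episodes, gamma))
```

A lower gamma weights early rewards more heavily, so the same set of episodes can rank configurations differently depending on the suffix you choose.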

Features work similarly and are intended to provide additional information about the training run. You select them via the ‘state_features’ key in the configuration of the AutoRL Environment, for example:

python arlbench.py 'autorl.state_features=["loss_info", "grad_info"]'

The following features are available at the moment:

grad_info: information about the gradients during training, i.e. their norm and variance
loss_info: information about the loss during training, i.e. its mean and standard deviation
weight_info: information about the weights of the neural networks used in the RL algorithm, i.e. the norm and variance of the weights and biases in each network
prediction_info: information about the predictions of the neural networks used in the RL algorithm, i.e. the mean and standard deviation of the outputs of each network (like Q-values or log-probs)
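
As a rough illustration of the kind of statistics that grad_info and loss_info describe, the sketch below computes a norm and variance over a set of gradient arrays and a mean and standard deviation over a list of loss values. The function names, the dictionary keys and the choice to flatten all gradients into one vector before computing the statistics are assumptions made for this example; ARLBench may aggregate these values differently.

```python
from typing import Dict, Sequence

import numpy as np


def grad_info(grads: Dict[str, np.ndarray]) -> Dict[str, float]:
    """Norm and variance of all gradient entries (hypothetical helper)."""
    # Assumption: treat every gradient array as part of one flat vector.
    flat = np.concatenate([np.ravel(g) for g in grads.values()])
    return {
        "grad_norm": float(np.linalg.norm(flat)),  # L2 norm
        "grad_var": float(np.var(flat)),           # population variance
    }


def loss_info(losses: Sequence[float]) -> Dict[str, float]:
    """Mean and standard deviation of the losses seen during training."""
    losses = np.asarray(losses, dtype=float)
    return {"loss_mean": float(np.mean(losses)), "loss_std": float(np.std(losses))}


if __name__ == "__main__":
    # Made-up gradients for a single layer with a weight matrix and a bias.
    grads = {"w": np.array([[3.0, 0.0]]), "b": np.array([4.0])}
    print(grad_info(grads))
    print(loss_info([1.0, 3.0]))
```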
