arlbench.core.environments.autorl_env

AutoRL Environment module.

Classes

Environment(env_name, env, n_envs[, seed])

An abstract environment class to support various kinds of RL environments.

class arlbench.core.environments.autorl_env.Environment(env_name, env, n_envs, seed=None)

Bases: ABC

An abstract environment class to support various kinds of RL environments.

Subclasses need to implement the following methods:
  • reset()
  • step()

Note: Both methods must be fully jittable to support JAX-based RL algorithms.

They must also implement the following properties:
  • action_space
  • observation_space

Note: These must be gymnax spaces, not gymnasium spaces.
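The contract above can be illustrated with a minimal sketch. It imports neither arlbench nor gymnax: `CounterEnv` is a hypothetical stand-in that only mirrors the documented `reset(rng)` and `step(env_state, action, rng)` signatures, and the `(env_state, observation)` tuple layout is an assumption based on the `tuple[Any, Any]` return type documented for both methods. A real subclass would inherit from `Environment` and also provide `action_space` and `observation_space` as gymnax spaces.

```python
import jax
import jax.numpy as jnp


class CounterEnv:
    """Hypothetical toy environment; NOT part of arlbench.

    The state is a single scalar. Both methods are pure functions of
    their inputs, which is what makes them compatible with jax.jit.
    """

    def reset(self, rng):
        # Draw a random start value in [0, 1) from the explicit PRNG key.
        start = jax.random.uniform(rng, shape=())
        return start, start  # (env_state, observation)

    def step(self, env_state, action, rng):
        # No Python side effects: the new state is returned, not stored.
        noise = 0.01 * jax.random.normal(rng, shape=())
        new_state = env_state + action + noise
        return new_state, new_state  # (env_state, observation)


env = CounterEnv()
reset_fn = jax.jit(env.reset)
step_fn = jax.jit(env.step)

rng = jax.random.PRNGKey(0)
rng, reset_key, step_key = jax.random.split(rng, 3)
env_state, obs = reset_fn(reset_key)
env_state, obs = step_fn(env_state, jnp.float32(1.0), step_key)
```

Keeping the state outside the object and passing the key explicitly is what lets `jax.jit` trace both methods without hidden side effects.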

abstract action_space()

The action space of the environment (gymnax space).

Returns:

Action space of the environment.

Return type:

gymnax.environments.spaces.Space

property env_name: str

The name/ID of the environment.

Returns:

Environment name.

Return type:

str

property n_envs: int

The number of environments.

Returns:

Number of environments.

Return type:

int

abstract observation_space()

The observation space of the environment (gymnax space).

Returns:

Observation space of the environment.

Return type:

gymnax.environments.spaces.Space

abstract reset(rng)

Environment reset() function. Resets the internal environment state.

Parameters:

rng (PRNGKey) – Random number generator key.

Returns:

A tuple containing the environment state as well as the actual return of the reset() function.

Return type:

tuple[Any, Any]

sample_actions(rng)

Samples a random action for each environment.

Parameters:

rng (PRNGKey) – Random number generator key.

Returns:

Array of sampled actions, one for each environment.

Return type:

jnp.ndarray
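One way to draw an independent action per environment is to split the key into one sub-key per environment and map a sampler over them. The sketch below is an illustration of that idea only, not the arlbench implementation: it assumes a discrete action space, and `n_envs`, `num_actions`, and `sample_one` are hypothetical names introduced for the example.

```python
import jax
import jax.numpy as jnp

n_envs = 4        # hypothetical number of parallel environments
num_actions = 3   # hypothetical size of a discrete action space


def sample_one(key):
    # Uniformly sample an integer action in [0, num_actions).
    return jax.random.randint(key, shape=(), minval=0, maxval=num_actions)


def sample_actions(rng):
    # One sub-key per environment keeps the samples independent.
    keys = jax.random.split(rng, n_envs)
    return jax.vmap(sample_one)(keys)


actions = sample_actions(jax.random.PRNGKey(42))
```

The result is an array with one entry per environment, matching the documented return type.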

abstract step(env_state, action, rng)

Environment step() function. Performs a step in the environment given an action.

Parameters:
  • env_state (Any) – Internal environment state.

  • action (Any) – Action to take.

  • rng (PRNGKey) – Random number generator key.

Returns:

A tuple containing the environment state as well as the actual return of the step() function.

Return type:

tuple[Any, Any]
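Because reset() and step() take the state and the key as arguments and return the new state, a whole rollout can be compiled as one function. The following sketch shows this pattern with `jax.lax.scan`; `toy_reset` and `toy_step` are hypothetical stand-ins that follow the documented signatures, and the `(env_state, observation)` return layout is an assumption.

```python
import jax
import jax.numpy as jnp


def toy_reset(rng):
    # Hypothetical stand-in for Environment.reset(): (env_state, obs).
    state = jax.random.uniform(rng, shape=())
    return state, state


def toy_step(env_state, action, rng):
    # Hypothetical stand-in for Environment.step(): (env_state, obs).
    new_state = env_state + action
    return new_state, new_state


@jax.jit
def rollout(rng, actions):
    rng, reset_key = jax.random.split(rng)
    env_state, _ = toy_reset(reset_key)

    def body(carry, action):
        env_state, rng = carry
        # Use a fresh sub-key for every step; never reuse a key.
        rng, step_key = jax.random.split(rng)
        env_state, obs = toy_step(env_state, action, step_key)
        return (env_state, rng), obs

    (env_state, _), observations = jax.lax.scan(
        body, (env_state, rng), actions
    )
    return env_state, observations


final_state, observations = rollout(jax.random.PRNGKey(0), jnp.ones(5))
```

Here `observations` holds one observation per step and `final_state` is the state after the last action.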