CARL Classic Control Environments

Classic Control is a problem suite included in OpenAI's Gym that consists of simple physics simulation tasks. The context features here are therefore also physics-based, e.g. friction, mass, or gravity.

CARL Pendulum Environment

Pendulum Environment

In Pendulum, the agent's task is to swing a pendulum up from a random starting position and balance it upright. The action is the direction and magnitude of the force (torque) the agent applies to the pendulum.

Defaults and Bounds

Context Feature    Default    Bounds
---------------    -------    ------------------------------
max_speed          8.0        (-inf, inf, <class 'float'>)
dt                 0.05       (0, inf, <class 'float'>)
g                  10.0       (0, inf, <class 'float'>)
m                  1.0        (1e-06, inf, <class 'float'>)
l                  1.0        (1e-06, inf, <class 'float'>)
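To illustrate how these context features can enter the dynamics, here is a minimal sketch of a single pendulum update. It loosely follows the update rule commonly used for Gym's Pendulum (a uniform rod pivoting at one end) and is not taken from the CARL source. The names `PENDULUM_DEFAULTS` and `pendulum_step` are hypothetical.

```python
import math

# Default context values copied from the table above.
PENDULUM_DEFAULTS = {"max_speed": 8.0, "dt": 0.05, "g": 10.0, "m": 1.0, "l": 1.0}


def pendulum_step(theta, theta_dot, torque, context=None):
    """Advance a simplified pendulum by one time step (illustrative only).

    theta is the angle measured from the upright position, in radians.
    Entries in `context` override the defaults.
    """
    c = {**PENDULUM_DEFAULTS, **(context or {})}
    # Angular acceleration from gravity plus the applied torque.
    accel = (
        3.0 * c["g"] / (2.0 * c["l"]) * math.sin(theta)
        + 3.0 / (c["m"] * c["l"] ** 2) * torque
    )
    new_theta_dot = theta_dot + accel * c["dt"]
    # max_speed caps the angular velocity in both directions.
    new_theta_dot = max(-c["max_speed"], min(c["max_speed"], new_theta_dot))
    new_theta = theta + new_theta_dot * c["dt"]
    return new_theta, new_theta_dot
```

For example, `pendulum_step(0.0, 0.0, 1.0, {"m": 2.0})` yields a smaller angular velocity than the same call with the default mass, because the same torque accelerates a heavier pendulum less.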

CARL CartPole Environment

CartPole Environment

CartPole, like Pendulum, asks the agent to balance a pole upright. Here, however, the agent does not apply force to the pole directly; instead, it moves the cart on which the pole is placed to the left or to the right.

Defaults and Bounds

Context Feature    Default    Bounds
---------------    -------    ------------------------------
gravity            9.8        (0.1, inf, <class 'float'>)
masscart           1.0        (0.1, 10, <class 'float'>)
masspole           0.1        (0.01, 1, <class 'float'>)
pole_length        0.5        (0.05, 5, <class 'float'>)
force_magnifier    10.0       (1, 100, <class 'int'>)
update_interval    0.02       (0.002, 0.2, <class 'float'>)
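The bounds above can be used to sanity-check a context before training on it. The following is a minimal sketch with a hypothetical helper, `out_of_bounds`, that is not part of CARL's API. It assumes the bounds are inclusive on both ends and ignores the type entry of each bound.

```python
import math

# Defaults and (lower, upper) bounds copied from the table above.
CARTPOLE_DEFAULTS = {
    "gravity": 9.8,
    "masscart": 1.0,
    "masspole": 0.1,
    "pole_length": 0.5,
    "force_magnifier": 10.0,
    "update_interval": 0.02,
}
CARTPOLE_BOUNDS = {
    "gravity": (0.1, math.inf),
    "masscart": (0.1, 10),
    "masspole": (0.01, 1),
    "pole_length": (0.05, 5),
    "force_magnifier": (1, 100),
    "update_interval": (0.002, 0.2),
}


def out_of_bounds(context):
    """Return the names of all features whose value lies outside its bounds."""
    violations = []
    for name, value in context.items():
        lower, upper = CARTPOLE_BOUNDS[name]
        # Assumption: both ends of the interval are allowed values.
        if not (lower <= value <= upper):
            violations.append(name)
    return violations
```

Calling `out_of_bounds(CARTPOLE_DEFAULTS)` returns an empty list, while a context with `masscart` set to 20.0 is reported as violating the `masscart` bounds.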

CARL Acrobot Environment

Acrobot Environment

Acrobot is another swing-up task. The goal is to swing the end of the lower of two links up to a given height, which the agent accomplishes by actuating the joint that connects the two links.

Defaults and Bounds

Context Feature     Default               Bounds
----------------    ------------------    ---------------------------------------------------------
link_length_1       1.0                   (0.1, 10, <class 'float'>)
link_length_2       1.0                   (0.1, 10, <class 'float'>)
link_mass_1         1.0                   (0.1, 10, <class 'float'>)
link_mass_2         1.0                   (0.1, 10, <class 'float'>)
link_com_1          0.5                   (0, 1, <class 'float'>)
link_com_2          0.5                   (0, 1, <class 'float'>)
link_moi            1.0                   (0.1, 10, <class 'float'>)
max_velocity_1      12.566370614359172    (1.2566370614359172, 125.66370614359172, <class 'float'>)
max_velocity_2      28.274333882308138    (2.827433388230814, 282.7433388230814, <class 'float'>)
torque_noise_max    0.0                   (-1.0, 1.0, <class 'float'>)
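The long decimal values for the two velocity limits are easier to read as multiples of π. The short check below expresses each default in units of π and each bound relative to its default. It only does arithmetic on the numbers in the table above; the names `MAX_VELOCITY` and `summarize` are hypothetical.

```python
import math

# Values copied from the table above.
MAX_VELOCITY = {
    "max_velocity_1": {
        "default": 12.566370614359172,
        "bounds": (1.2566370614359172, 125.66370614359172),
    },
    "max_velocity_2": {
        "default": 28.274333882308138,
        "bounds": (2.827433388230814, 282.7433388230814),
    },
}


def summarize(name):
    """Express the default in units of pi and the bounds relative to the default."""
    entry = MAX_VELOCITY[name]
    lower, upper = entry["bounds"]
    return {
        "default_in_pi": round(entry["default"] / math.pi, 6),
        "lower_ratio": round(lower / entry["default"], 6),
        "upper_ratio": round(upper / entry["default"], 6),
    }


for name in MAX_VELOCITY:
    print(name, summarize(name))
```

Numerically, the defaults correspond to 4π and 9π, and the bounds span one tenth to ten times the default, the same ratio as for the link lengths, masses, and moment of inertia.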

CARL MountainCar Environment

MountainCar Environment

The MountainCar environment asks the agent to move a car up a steep slope. To succeed, the agent first has to build momentum on the opposite slope. There are two versions of the environment: a discrete one with only "left" and "right" as actions, and a continuous one.

Defaults and bounds for the discrete MountainCar:

Defaults and Bounds

Context Feature       Default    Bounds
------------------    -------    ------------------------------
min_position          -1.2       (-inf, inf, <class 'float'>)
max_position          0.6        (-inf, inf, <class 'float'>)
max_speed             0.07       (0, inf, <class 'float'>)
goal_position         0.5        (-inf, inf, <class 'float'>)
goal_velocity         0.0        (-inf, inf, <class 'float'>)
force                 0.001      (-inf, inf, <class 'float'>)
gravity               0.0025     (0, inf, <class 'float'>)
start_position        -0.5       (-1.5, 0.5, <class 'float'>)
start_position_std    0.1        (0.1, inf, <class 'float'>)
start_velocity        0.0        (-inf, inf, <class 'float'>)
start_velocity_std    0.0        (0.1, inf, <class 'float'>)
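As a rough illustration of how these features shape the task, the sketch below implements one update of a simplified discrete MountainCar. It is modeled on the dynamics commonly used for Gym's MountainCar and is not taken from the CARL source. Encoding the action as a direction of -1 (left) or +1 (right) is an assumption made here for readability, and `mountaincar_step` is a hypothetical name.

```python
import math

# Subset of the default context values from the table above.
MOUNTAINCAR_DEFAULTS = {
    "min_position": -1.2,
    "max_position": 0.6,
    "max_speed": 0.07,
    "goal_position": 0.5,
    "goal_velocity": 0.0,
    "force": 0.001,
    "gravity": 0.0025,
}


def mountaincar_step(position, velocity, direction, context=None):
    """Advance the car by one step; direction is -1 (left) or +1 (right)."""
    c = {**MOUNTAINCAR_DEFAULTS, **(context or {})}
    # The push acts along the chosen direction; gravity pulls toward the valley.
    velocity += direction * c["force"] - math.cos(3 * position) * c["gravity"]
    velocity = max(-c["max_speed"], min(c["max_speed"], velocity))
    position += velocity
    position = max(c["min_position"], min(c["max_position"], position))
    # Hitting the left wall stops the car.
    if position == c["min_position"] and velocity < 0:
        velocity = 0.0
    done = position >= c["goal_position"] and velocity >= c["goal_velocity"]
    return position, velocity, done
```

In this sketch, increasing `gravity` relative to `force` makes the slope harder to climb directly, so the car has to swing back and forth more often before it reaches `goal_position`.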

Defaults and bounds for the continuous MountainCar:

Defaults and Bounds

Context Feature       Default    Bounds
------------------    -------    ------------------------------
min_position          -1.2       (-inf, inf, <class 'float'>)
max_position          0.6        (-inf, inf, <class 'float'>)
max_speed             0.07       (0, inf, <class 'float'>)
goal_position         0.45       (-inf, inf, <class 'float'>)
goal_velocity         0.0        (-inf, inf, <class 'float'>)
power                 0.0015     (-inf, inf, <class 'float'>)
min_position_start    -0.6       (-inf, inf, <class 'float'>)
max_position_start    -0.4       (-inf, inf, <class 'float'>)
min_velocity_start    0.0        (-inf, inf, <class 'float'>)
max_velocity_start    0.0        (-inf, inf, <class 'float'>)
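The continuous variant differs mainly in how the action and the start state are handled. The sketch below is illustrative and makes several assumptions that are not confirmed by this page: the start state is drawn uniformly between the `*_start` limits, the action is clipped to [-1, 1] and scaled by `power`, and the slope term uses a fixed constant of 0.0025 (the discrete default for `gravity`), since the continuous table lists no gravity feature. All function and constant names are hypothetical.

```python
import math
import random

# Subset of the default context values from the table above.
CONTINUOUS_DEFAULTS = {
    "min_position": -1.2,
    "max_position": 0.6,
    "max_speed": 0.07,
    "power": 0.0015,
    "min_position_start": -0.6,
    "max_position_start": -0.4,
    "min_velocity_start": 0.0,
    "max_velocity_start": 0.0,
}
SLOPE_CONSTANT = 0.0025  # assumption: fixed, not a context feature


def sample_start(rng, context=None):
    """Draw a start state uniformly between the *_start limits (assumed)."""
    c = {**CONTINUOUS_DEFAULTS, **(context or {})}
    position = rng.uniform(c["min_position_start"], c["max_position_start"])
    velocity = rng.uniform(c["min_velocity_start"], c["max_velocity_start"])
    return position, velocity


def continuous_step(position, velocity, action, context=None):
    """Advance the car by one step with a continuous action."""
    c = {**CONTINUOUS_DEFAULTS, **(context or {})}
    push = max(-1.0, min(1.0, action))  # clip the action to [-1, 1]
    velocity += push * c["power"] - SLOPE_CONSTANT * math.cos(3 * position)
    velocity = max(-c["max_speed"], min(c["max_speed"], velocity))
    position += velocity
    position = max(c["min_position"], min(c["max_position"], position))
    if position == c["min_position"] and velocity < 0:
        velocity = 0.0
    return position, velocity
```

A typical use would be `position, velocity = sample_start(random.Random(0))` followed by repeated calls to `continuous_step`. Raising `power` in the context lets the same action produce a stronger push.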