CARL Classic Control Environments
Classic Control is a problem suite included in OpenAI’s Gym consisting of simple physics simulation tasks. The context features here are therefore also physics-based, e.g. friction, mass or gravity.
CARL Pendulum Environment
In Pendulum, the agent’s task is to swing up an inverted pendulum from a random starting position and balance it at the top. The action is the direction and magnitude of the torque the agent applies to the pendulum.
| Context Feature | Default | Bounds |
|---|---|---|
| max_speed | 8.0 | (-inf, inf, <class 'float'>) |
| dt | 0.05 | (0, inf, <class 'float'>) |
| g | 10.0 | (0, inf, <class 'float'>) |
| m | 1.0 | (1e-06, inf, <class 'float'>) |
| l | 1.0 | (1e-06, inf, <class 'float'>) |
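To illustrate how these context features could enter the simulation, here is a minimal sketch of a single pendulum update step. It assumes the update rule commonly used for this task (angle measured from the upright position, explicit Euler integration); the function name `pendulum_step` and the dictionary `PENDULUM_DEFAULTS` are hypothetical and not part of CARL.

```python
import math

# Default context values copied from the table above.
PENDULUM_DEFAULTS = {"max_speed": 8.0, "dt": 0.05, "g": 10.0, "m": 1.0, "l": 1.0}


def pendulum_step(theta, theta_dot, torque, context):
    """Advance the pendulum by one time step (hypothetical helper).

    Assumption: theta is measured from the upright position and the
    dynamics follow the usual rigid-rod pendulum update.
    """
    g, m, l = context["g"], context["m"], context["l"]
    dt, max_speed = context["dt"], context["max_speed"]

    # Angular acceleration: gravity term plus the applied torque,
    # scaled by the rod's mass and length.
    theta_acc = 3.0 * g / (2.0 * l) * math.sin(theta) + 3.0 / (m * l**2) * torque

    # Integrate the velocity, then clip it to the max_speed context feature.
    new_theta_dot = theta_dot + theta_acc * dt
    new_theta_dot = max(-max_speed, min(max_speed, new_theta_dot))

    # Integrate the angle with the updated velocity.
    new_theta = theta + new_theta_dot * dt
    return new_theta, new_theta_dot


# Same torque, two contexts: the heavier pendulum accelerates less.
heavy_context = {**PENDULUM_DEFAULTS, "m": 2.0}
_, default_speed = pendulum_step(0.0, 0.0, 1.0, PENDULUM_DEFAULTS)
_, heavy_speed = pendulum_step(0.0, 0.0, 1.0, heavy_context)
```

Under this sketch, doubling `m` halves the velocity gained from the same torque, which is the kind of variation an agent has to cope with when the context changes.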
CARL CartPole Environment
CartPole, like Pendulum, asks the agent to balance a pole upright. Here, however, the agent does not apply force to the pole directly; instead, it moves the cart on which the pole is placed to the left or to the right.
| Context Feature | Default | Bounds |
|---|---|---|
| gravity | 9.8 | (0.1, inf, <class 'float'>) |
| masscart | 1.0 | (0.1, 10, <class 'float'>) |
| masspole | 0.1 | (0.01, 1, <class 'float'>) |
| pole_length | 0.5 | (0.05, 5, <class 'float'>) |
| force_magnifier | 10.0 | (1, 100, <class 'int'>) |
| update_interval | 0.02 | (0.002, 0.2, <class 'float'>) |
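One way to use the defaults and bounds together is to vary a single context feature around its default while keeping the sampled value inside the listed bounds. The following is a minimal sketch in plain Python; `sample_context`, `rel_range` and the two dictionaries are hypothetical names chosen for illustration and do not reflect CARL's own sampling utilities.

```python
import random

# Defaults and (lower, upper) bounds copied from the table above.
CARTPOLE_DEFAULTS = {
    "gravity": 9.8,
    "masscart": 1.0,
    "masspole": 0.1,
    "pole_length": 0.5,
    "force_magnifier": 10.0,
    "update_interval": 0.02,
}
CARTPOLE_BOUNDS = {
    "gravity": (0.1, float("inf")),
    "masscart": (0.1, 10),
    "masspole": (0.01, 1),
    "pole_length": (0.05, 5),
    "force_magnifier": (1, 100),
    "update_interval": (0.002, 0.2),
}


def sample_context(feature, rel_range=0.5, rng=None):
    """Return a copy of the defaults with one feature resampled.

    Hypothetical helper: the feature is drawn uniformly from
    [default * (1 - rel_range), default * (1 + rel_range)] and then
    clipped to the bounds from the table.
    """
    rng = rng or random.Random()
    context = dict(CARTPOLE_DEFAULTS)
    default = context[feature]
    lower, upper = CARTPOLE_BOUNDS[feature]
    value = rng.uniform(default * (1 - rel_range), default * (1 + rel_range))
    context[feature] = min(upper, max(lower, value))
    return context


# Five contexts that differ only in pole_length.
rng = random.Random(0)
contexts = [sample_context("pole_length", rng=rng) for _ in range(5)]
```

Clipping is only one possible choice; rejecting and resampling out-of-bounds values would avoid piling probability mass onto the bounds.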
CARL Acrobot Environment
Acrobot is another swing-up task: the goal is to swing the end of the lower of two links up to a given height. The agent accomplishes this by actuating the joint connecting the two links.
| Context Feature | Default | Bounds |
|---|---|---|
| link_length_1 | 1.0 | (0.1, 10, <class 'float'>) |
| link_length_2 | 1.0 | (0.1, 10, <class 'float'>) |
| link_mass_1 | 1.0 | (0.1, 10, <class 'float'>) |
| link_mass_2 | 1.0 | (0.1, 10, <class 'float'>) |
| link_com_1 | 0.5 | (0, 1, <class 'float'>) |
| link_com_2 | 0.5 | (0, 1, <class 'float'>) |
| link_moi | 1.0 | (0.1, 10, <class 'float'>) |
| max_velocity_1 | 12.566370614359172 | (1.2566370614359172, 125.66370614359172, <class 'float'>) |
| max_velocity_2 | 28.274333882308138 | (2.827433388230814, 282.7433388230814, <class 'float'>) |
| torque_noise_max | 0.0 | (-1.0, 1.0, <class 'float'>) |
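The long decimal values for `max_velocity_1` and `max_velocity_2` are easier to read once you notice that they appear to be 4π and 9π, with bounds at 0.1 and 10 times the default. This is an observation about the numbers in the table, not something the table states; the short check below verifies it numerically, and `scaled_bounds` is a hypothetical helper.

```python
import math

# Values copied from the table above.
max_velocity_1 = 12.566370614359172
max_velocity_2 = 28.274333882308138
bounds_1 = (1.2566370614359172, 125.66370614359172)
bounds_2 = (2.827433388230814, 282.7433388230814)


def scaled_bounds(default, low_factor=0.1, high_factor=10.0):
    """Bounds expressed as multiples of the default (hypothetical helper)."""
    return (low_factor * default, high_factor * default)


# The defaults match 4*pi and 9*pi up to floating-point precision.
is_four_pi = math.isclose(max_velocity_1, 4 * math.pi)
is_nine_pi = math.isclose(max_velocity_2, 9 * math.pi)

# The listed bounds match 0.1x and 10x the defaults.
bounds_1_match = all(
    math.isclose(a, b) for a, b in zip(bounds_1, scaled_bounds(max_velocity_1))
)
bounds_2_match = all(
    math.isclose(a, b) for a, b in zip(bounds_2, scaled_bounds(max_velocity_2))
)
```

The same 0.1 to 10 times scaling also matches the bounds listed for the link lengths, link masses and `link_moi`, whose defaults are all 1.0.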
CARL MountainCar Environment
The MountainCar environment asks the agent to move a car up a steep slope. In order to succeed, the agent has to accelerate using the opposite slope. There are two versions of the environment, a discrete one with only “left” and “right” as actions, as well as a continuous one.
Defaults and bounds for the discrete MountainCar:
| Context Feature | Default | Bounds |
|---|---|---|
| min_position | -1.2 | (-inf, inf, <class 'float'>) |
| max_position | 0.6 | (-inf, inf, <class 'float'>) |
| max_speed | 0.07 | (0, inf, <class 'float'>) |
| goal_position | 0.5 | (-inf, inf, <class 'float'>) |
| goal_velocity | 0.0 | (-inf, inf, <class 'float'>) |
| force | 0.001 | (-inf, inf, <class 'float'>) |
| gravity | 0.0025 | (0, inf, <class 'float'>) |
| start_position | -0.5 | (-1.5, 0.5, <class 'float'>) |
| start_position_std | 0.1 | (0.1, inf, <class 'float'>) |
| start_velocity | 0.0 | (-inf, inf, <class 'float'>) |
| start_velocity_std | 0.0 | (0.1, inf, <class 'float'>) |
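When working with these tables, it can help to check that every default actually lies inside its bounds. The sketch below does this for the discrete MountainCar values, assuming the bounds are inclusive (the table does not say whether they are); `MOUNTAINCAR_DISCRETE` and `out_of_bounds` are hypothetical names.

```python
import math

INF = math.inf

# (default, (lower, upper)) pairs copied from the table above.
MOUNTAINCAR_DISCRETE = {
    "min_position": (-1.2, (-INF, INF)),
    "max_position": (0.6, (-INF, INF)),
    "max_speed": (0.07, (0, INF)),
    "goal_position": (0.5, (-INF, INF)),
    "goal_velocity": (0.0, (-INF, INF)),
    "force": (0.001, (-INF, INF)),
    "gravity": (0.0025, (0, INF)),
    "start_position": (-0.5, (-1.5, 0.5)),
    "start_position_std": (0.1, (0.1, INF)),
    "start_velocity": (0.0, (-INF, INF)),
    "start_velocity_std": (0.0, (0.1, INF)),
}


def out_of_bounds(spec):
    """List the features whose default lies outside its bounds.

    Assumption: bounds are treated as inclusive on both ends.
    """
    return [
        name
        for name, (default, (lower, upper)) in spec.items()
        if not lower <= default <= upper
    ]


violations = out_of_bounds(MOUNTAINCAR_DISCRETE)
```

With the values as listed, the only feature flagged is `start_velocity_std`, whose default of 0.0 is below its lower bound of 0.1, so a context that is validated strictly against the bounds may need a different value for this feature.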
And for the continuous case:
| Context Feature | Default | Bounds |
|---|---|---|
| min_position | -1.2 | (-inf, inf, <class 'float'>) |
| max_position | 0.6 | (-inf, inf, <class 'float'>) |
| max_speed | 0.07 | (0, inf, <class 'float'>) |
| goal_position | 0.45 | (-inf, inf, <class 'float'>) |
| goal_velocity | 0.0 | (-inf, inf, <class 'float'>) |
| power | 0.0015 | (-inf, inf, <class 'float'>) |
| min_position_start | -0.6 | (-inf, inf, <class 'float'>) |
| max_position_start | -0.4 | (-inf, inf, <class 'float'>) |
| min_velocity_start | 0.0 | (-inf, inf, <class 'float'>) |
| max_velocity_start | 0.0 | (-inf, inf, <class 'float'>) |
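The four `*_start` features suggest that the initial state is drawn from a range rather than fixed. The sketch below assumes a uniform draw between the minimum and maximum values; this is an assumption based on the feature names, not a description of CARL's implementation, and `sample_start_state` is a hypothetical helper.

```python
import random

# Start-state features and defaults copied from the table above.
CONTINUOUS_START_DEFAULTS = {
    "min_position_start": -0.6,
    "max_position_start": -0.4,
    "min_velocity_start": 0.0,
    "max_velocity_start": 0.0,
}


def sample_start_state(context, rng=None):
    """Draw an initial (position, velocity) pair.

    Assumption: both values are sampled uniformly between their
    respective min/max context features.
    """
    rng = rng or random.Random()
    position = rng.uniform(
        context["min_position_start"], context["max_position_start"]
    )
    velocity = rng.uniform(
        context["min_velocity_start"], context["max_velocity_start"]
    )
    return position, velocity


# With the defaults, the car starts at rest somewhere in [-0.6, -0.4].
position, velocity = sample_start_state(
    CONTINUOUS_START_DEFAULTS, rng=random.Random(0)
)
```

Under this reading, widening the position range or allowing a nonzero starting velocity changes how hard it is for the agent to build up momentum on the opposite slope.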