CARL Classic Control Environments
Classic Control is a problem suite included in OpenAI’s Gym that consists of simple physics simulation tasks. The context features here are therefore also physics-based, e.g. friction, mass or gravity.
CARL Pendulum Environment
![Pendulum Environment](../../../_images/pendulum.jpeg)
In Pendulum, the agent starts from a random position and has to swing the pendulum up and balance it upright at the top. The action is the direction and amount of force (torque) the agent applies to the pendulum. Influence of context settings on an agent trained on the default environment:
![Influence of context settings on an agent trained on the default environment.](../../../_images/plot_ecdf_CARLPendulumEnv.png)
| Context Feature | Default | Bounds |
|---|---|---|
| max_speed | 8.0 | (-inf, inf, <class 'float'>) |
| dt | 0.05 | (0, inf, <class 'float'>) |
| g | 10.0 | (0, inf, <class 'float'>) |
| m | 1.0 | (1e-06, inf, <class 'float'>) |
| l | 1.0 | (1e-06, inf, <class 'float'>) |
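
The defaults and bounds above can be used to check a custom context before handing it to an environment. The following is a minimal sketch in plain Python: `PENDULUM_DEFAULTS`, `PENDULUM_BOUNDS` and `make_pendulum_context` are hypothetical names introduced for illustration and are not part of CARL, and treating the bounds as inclusive is an assumption.

```python
import math

# Defaults and bounds copied from the Pendulum table.
PENDULUM_DEFAULTS = {"max_speed": 8.0, "dt": 0.05, "g": 10.0, "m": 1.0, "l": 1.0}
PENDULUM_BOUNDS = {
    "max_speed": (-math.inf, math.inf),
    "dt": (0.0, math.inf),
    "g": (0.0, math.inf),
    "m": (1e-06, math.inf),
    "l": (1e-06, math.inf),
}


def make_pendulum_context(**overrides):
    """Return the default context with `overrides` applied after a bounds check.

    Hypothetical helper, not part of CARL. Bounds are assumed to be inclusive.
    """
    context = dict(PENDULUM_DEFAULTS)
    for name, value in overrides.items():
        if name not in PENDULUM_BOUNDS:
            raise KeyError(f"unknown context feature: {name}")
        lower, upper = PENDULUM_BOUNDS[name]
        if not lower <= value <= upper:
            raise ValueError(f"{name}={value} is outside [{lower}, {upper}]")
        context[name] = float(value)
    return context


# A heavier, longer pendulum; all other features keep their defaults.
heavy_long = make_pendulum_context(m=2.0, l=1.5)
```

A mass of `0.0` would be rejected by this check, since the table lists `1e-06` as the lower bound for `m`.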
CARL CartPole Environment
![CartPole Environment](../../../_images/cartpole.jpeg)
CartPole, similarly to Pendulum, asks the agent to balance a pole upright. This time, however, the agent does not apply force to the pole directly but moves the cart on which the pole is placed either to the left or to the right. Influence of context settings on an agent trained on the default environment:
![Influence of context settings on an agent trained on the default environment.](../../../_images/plot_ecdf_CARLCartPoleEnv.png)
| Context Feature | Default | Bounds |
|---|---|---|
| gravity | 9.8 | (0.1, inf, <class 'float'>) |
| masscart | 1.0 | (0.1, 10, <class 'float'>) |
| masspole | 0.1 | (0.01, 1, <class 'float'>) |
| pole_length | 0.5 | (0.05, 5, <class 'float'>) |
| force_magnifier | 10.0 | (1, 100, <class 'int'>) |
| update_interval | 0.02 | (0.002, 0.2, <class 'float'>) |
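
To study generalization, one typically varies a few context features and keeps the rest at their defaults. The sketch below draws such contexts uniformly within the bounds listed above. `sample_cartpole_contexts` and the two dictionaries are hypothetical names, not CARL API; uniform sampling is one possible choice, and the integer-keyed layout of the result is an assumption about how a set of contexts might be organised, so check the CARL version in use for the format its environments expect.

```python
import math
import random

# Defaults and bounds copied from the CartPole table.
CARTPOLE_DEFAULTS = {
    "gravity": 9.8,
    "masscart": 1.0,
    "masspole": 0.1,
    "pole_length": 0.5,
    "force_magnifier": 10.0,
    "update_interval": 0.02,
}
CARTPOLE_BOUNDS = {
    "gravity": (0.1, math.inf),
    "masscart": (0.1, 10.0),
    "masspole": (0.01, 1.0),
    "pole_length": (0.05, 5.0),
    "force_magnifier": (1, 100),
    "update_interval": (0.002, 0.2),
}


def sample_cartpole_contexts(features, n, seed=0):
    """Sample `n` contexts, varying only `features` uniformly within their bounds.

    Hypothetical helper, not part of CARL.
    """
    rng = random.Random(seed)
    contexts = {}
    for context_id in range(n):
        context = dict(CARTPOLE_DEFAULTS)
        for name in features:
            lower, upper = CARTPOLE_BOUNDS[name]
            if math.isinf(lower) or math.isinf(upper):
                # A uniform draw needs a finite interval.
                raise ValueError(f"{name} has an unbounded range")
            context[name] = rng.uniform(lower, upper)
        contexts[context_id] = context
    return contexts


contexts = sample_cartpole_contexts(["pole_length", "masscart"], n=5)
```

Features with an infinite bound, such as `gravity`, need a manually chosen finite range before they can be sampled this way.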
CARL Acrobot Environment
![Acrobot Environment](../../../_images/acrobot.jpeg)
Acrobot is another swing-up task: the goal is to swing the end of the lower of two links up to a given height. The agent accomplishes this by actuating the joint connecting the two links. Influence of context settings on an agent trained on the default environment:
![Influence of context settings on an agent trained on the default environment.](../../../_images/plot_ecdf_CARLAcrobotEnv.png)
| Context Feature | Default | Bounds |
|---|---|---|
| link_length_1 | 1.0 | (0.1, 10, <class 'float'>) |
| link_length_2 | 1.0 | (0.1, 10, <class 'float'>) |
| link_mass_1 | 1.0 | (0.1, 10, <class 'float'>) |
| link_mass_2 | 1.0 | (0.1, 10, <class 'float'>) |
| link_com_1 | 0.5 | (0, 1, <class 'float'>) |
| link_com_2 | 0.5 | (0, 1, <class 'float'>) |
| link_moi | 1.0 | (0.1, 10, <class 'float'>) |
| max_velocity_1 | 12.566370614359172 | (1.2566370614359172, 125.66370614359172, <class 'float'>) |
| max_velocity_2 | 28.274333882308138 | (2.827433388230814, 282.7433388230814, <class 'float'>) |
| torque_noise_max | 0.0 | (-1.0, 1.0, <class 'float'>) |
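
The long decimals in the velocity rows are easier to read once written as multiples of π: the defaults equal 4π and 9π, and their bounds, like those of most other rows, run from one tenth of the default to ten times the default. The check below verifies this against the numbers in the table. It is only an observation about the listed values, not a documented CARL rule, and `ACROBOT_ROWS` and `follows_scaling_rule` are hypothetical names.

```python
import math

# Acrobot rows from the table as (default, lower bound, upper bound).
ACROBOT_ROWS = {
    "link_length_1": (1.0, 0.1, 10),
    "link_length_2": (1.0, 0.1, 10),
    "link_mass_1": (1.0, 0.1, 10),
    "link_mass_2": (1.0, 0.1, 10),
    "link_com_1": (0.5, 0, 1),
    "link_com_2": (0.5, 0, 1),
    "link_moi": (1.0, 0.1, 10),
    "max_velocity_1": (12.566370614359172, 1.2566370614359172, 125.66370614359172),
    "max_velocity_2": (28.274333882308138, 2.827433388230814, 282.7433388230814),
    "torque_noise_max": (0.0, -1.0, 1.0),
}


def follows_scaling_rule(default, lower, upper):
    """True if the bounds are (0.1 * default, 10 * default) up to rounding."""
    return math.isclose(lower, 0.1 * default) and math.isclose(upper, 10 * default)


# The two velocity defaults are 4*pi and 9*pi.
velocity_defaults_are_pi_multiples = (
    math.isclose(ACROBOT_ROWS["max_velocity_1"][0], 4 * math.pi)
    and math.isclose(ACROBOT_ROWS["max_velocity_2"][0], 9 * math.pi)
)

# Names of the rows whose bounds follow the 0.1x to 10x pattern.
scaled = sorted(
    name for name, row in ACROBOT_ROWS.items() if follows_scaling_rule(*row)
)
```

The centre-of-mass features and `torque_noise_max` are the exceptions; their bounds are fixed intervals rather than multiples of the default.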
CARL MountainCar Environment
![MountainCar Environment](../../../_images/mountaincar.jpeg)
The MountainCar environment asks the agent to move a car up a steep slope. To succeed, the agent has to gain speed by using the opposite slope. There are two versions of the environment: a discrete one with only “left” and “right” as actions, and a continuous one. Influence of context settings on an agent trained on the default environment:
![Influence of context settings on an agent trained on the default environment.](../../../_images/plot_ecdf_CARLMountainCarEnv.png)
Defaults and bounds for the discrete MountainCar:
| Context Feature | Default | Bounds |
|---|---|---|
| min_position | -1.2 | (-inf, inf, <class 'float'>) |
| max_position | 0.6 | (-inf, inf, <class 'float'>) |
| max_speed | 0.07 | (0, inf, <class 'float'>) |
| goal_position | 0.5 | (-inf, inf, <class 'float'>) |
| goal_velocity | 0.0 | (-inf, inf, <class 'float'>) |
| force | 0.001 | (-inf, inf, <class 'float'>) |
| gravity | 0.0025 | (0, inf, <class 'float'>) |
| start_position | -0.5 | (-1.5, 0.5, <class 'float'>) |
| start_position_std | 0.1 | (0.1, inf, <class 'float'>) |
| start_velocity | 0.0 | (-inf, inf, <class 'float'>) |
| start_velocity_std | 0.0 | (0.1, inf, <class 'float'>) |
And for the continuous case:
| Context Feature | Default | Bounds |
|---|---|---|
| min_position | -1.2 | (-inf, inf, <class 'float'>) |
| max_position | 0.6 | (-inf, inf, <class 'float'>) |
| max_speed | 0.07 | (0, inf, <class 'float'>) |
| goal_position | 0.45 | (-inf, inf, <class 'float'>) |
| goal_velocity | 0.0 | (-inf, inf, <class 'float'>) |
| power | 0.0015 | (-inf, inf, <class 'float'>) |
| min_position_start | -0.6 | (-inf, inf, <class 'float'>) |
| max_position_start | -0.4 | (-inf, inf, <class 'float'>) |
| min_velocity_start | 0.0 | (-inf, inf, <class 'float'>) |
| max_velocity_start | 0.0 | (-inf, inf, <class 'float'>) |
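
Because the discrete and continuous versions expose overlapping but not identical context features, a context written for one cannot be reused for the other unchanged. The sketch below compares the two sets of defaults listed above with plain set operations; the dictionary names are hypothetical and the values are copied from the tables.

```python
# Defaults copied from the discrete MountainCar table.
DISCRETE_DEFAULTS = {
    "min_position": -1.2,
    "max_position": 0.6,
    "max_speed": 0.07,
    "goal_position": 0.5,
    "goal_velocity": 0.0,
    "force": 0.001,
    "gravity": 0.0025,
    "start_position": -0.5,
    "start_position_std": 0.1,
    "start_velocity": 0.0,
    "start_velocity_std": 0.0,
}

# Defaults copied from the continuous MountainCar table.
CONTINUOUS_DEFAULTS = {
    "min_position": -1.2,
    "max_position": 0.6,
    "max_speed": 0.07,
    "goal_position": 0.45,
    "goal_velocity": 0.0,
    "power": 0.0015,
    "min_position_start": -0.6,
    "max_position_start": -0.4,
    "min_velocity_start": 0.0,
    "max_velocity_start": 0.0,
}

# Features available in both versions, and those specific to one of them.
shared = sorted(set(DISCRETE_DEFAULTS) & set(CONTINUOUS_DEFAULTS))
only_discrete = sorted(set(DISCRETE_DEFAULTS) - set(CONTINUOUS_DEFAULTS))
only_continuous = sorted(set(CONTINUOUS_DEFAULTS) - set(DISCRETE_DEFAULTS))

# Shared features whose defaults differ, as (discrete, continuous) pairs.
differing_defaults = {
    name: (DISCRETE_DEFAULTS[name], CONTINUOUS_DEFAULTS[name])
    for name in shared
    if DISCRETE_DEFAULTS[name] != CONTINUOUS_DEFAULTS[name]
}
```

Five features are shared, and of those only `goal_position` has a different default (0.5 versus 0.45). The start state is parameterised differently in each version: the discrete table uses a position and velocity with `_std` entries, while the continuous table uses minimum and maximum start values.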