arlbench.core.algorithms.sac.models
SAC models for the actor and critic networks.
Classes
AlphaCoef | Alpha coefficient for SAC. |
SACCNNActor | A CNN-based actor network for SAC. |
SACCNNCritic | A CNN-based critic network for SAC. |
SACMLPActor | An MLP-based actor network for SAC. |
SACMLPCritic | An MLP-based critic network for SAC. |
| A vectorized critic network for SAC. |
| Tanh transformation of a distrax distribution. |
- class arlbench.core.algorithms.sac.models.AlphaCoef(alpha_init=1.0, parent=<flax.linen.module._Sentinel object>, name=None)
Bases: Module
Alpha coefficient for SAC.
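The page does not describe how the coefficient is stored or updated. As a rough illustration only, the sketch below shows one common SAC convention: keep the temperature in log space so it stays positive, and nudge it toward a target entropy. The class `LogAlpha`, its `update` method, and the learning rate are hypothetical and are not part of arlbench; only the `alpha_init=1.0` default is taken from the signature above.

```python
import math


class LogAlpha:
    """Hypothetical stand-in for a learnable SAC temperature (not arlbench code)."""

    def __init__(self, alpha_init=1.0):
        # Store log(alpha) so that alpha = exp(log_alpha) is always positive.
        self.log_alpha = math.log(alpha_init)

    @property
    def alpha(self):
        return math.exp(self.log_alpha)

    def update(self, mean_log_prob, target_entropy, lr=3e-4):
        # Assumed loss: L = -alpha * (mean_log_prob + target_entropy).
        # Its derivative with respect to log_alpha is the same expression,
        # because d(alpha)/d(log_alpha) = alpha.
        grad = -self.alpha * (mean_log_prob + target_entropy)
        self.log_alpha -= lr * grad


coef = LogAlpha(alpha_init=1.0)
# Policy entropy (0.5) is above the target (-2.0), so alpha should shrink.
coef.update(mean_log_prob=-0.5, target_entropy=-2.0)
```

In this sketch, alpha drops slightly below its initial value after the update, which reduces the entropy bonus when the policy is already more random than the target.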
- class arlbench.core.algorithms.sac.models.SACCNNActor(action_dim, activation, hidden_size=64, log_std_min=-20, log_std_max=2, parent=<flax.linen.module._Sentinel object>, name=None)
Bases: Module
A CNN-based actor network for SAC. Based on the NatureCNN from Stable-Baselines3 (https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/torch_layers.py#L48).
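The `log_std_min` and `log_std_max` arguments suggest that the actor bounds the log standard deviation of its Gaussian policy. The page does not say whether the bound is a hard clip or a smooth rescaling, so the following is only a minimal sketch that assumes a hard clip. The function names are hypothetical; the defaults mirror the signature above.

```python
import math

LOG_STD_MIN = -20.0  # default from the actor signature
LOG_STD_MAX = 2.0    # default from the actor signature


def clip_log_std(log_std, lo=LOG_STD_MIN, hi=LOG_STD_MAX):
    """Clamp a raw network output into [lo, hi] (assumed behavior)."""
    return max(lo, min(hi, log_std))


def std_from_log_std(log_std):
    """Turn a (clipped) log standard deviation into a standard deviation."""
    return math.exp(clip_log_std(log_std))
```

With these bounds the standard deviation stays between `exp(-20)` and `exp(2)`, which keeps the policy from collapsing to a point or producing extremely wide action distributions.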
- class arlbench.core.algorithms.sac.models.SACCNNCritic(action_dim, activation, hidden_size=512, parent=<flax.linen.module._Sentinel object>, name=None)
Bases: Module
A CNN-based critic network for SAC. Based on the NatureCNN from Stable-Baselines3 (https://github.com/DLR-RM/stable-baselines3/blob/master/stable_baselines3/common/torch_layers.py#L48).
- class arlbench.core.algorithms.sac.models.SACMLPActor(action_dim, activation, hidden_size=64, log_std_min=-20, log_std_max=2, parent=<flax.linen.module._Sentinel object>, name=None)
Bases: Module
An MLP-based actor network for SAC.
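The module also lists a tanh transformation of a distrax distribution, which points to the usual SAC practice of squashing Gaussian samples into (-1, 1). The sketch below works through the change-of-variables correction for a single scalar action in plain Python. It is not the arlbench or distrax implementation, and all function names are hypothetical.

```python
import math
import random


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def tanh_log_det(u):
    """log|d tanh(u)/du| = log(1 - tanh(u)^2), written in a stable form."""
    return 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))


def sample_squashed(mean, log_std, rng):
    """Sample a = tanh(u) with u ~ N(mean, std^2) and return (a, log p(a))."""
    std = math.exp(log_std)
    u = mean + std * rng.gauss(0.0, 1.0)
    # Log-density of the unsquashed Gaussian sample.
    log_prob_u = (
        -0.5 * ((u - mean) / std) ** 2 - log_std - 0.5 * math.log(2.0 * math.pi)
    )
    # Change of variables: subtract the log-Jacobian of the tanh squash.
    return math.tanh(u), log_prob_u - tanh_log_det(u)


rng = random.Random(0)
action, log_prob = sample_squashed(mean=0.0, log_std=0.0, rng=rng)
```

The stable form of the log-Jacobian avoids computing `log(1 - tanh(u)**2)` directly, which loses precision when `tanh(u)` is close to 1 or -1.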
- class arlbench.core.algorithms.sac.models.SACMLPCritic(action_dim, activation, hidden_size=64, parent=<flax.linen.module._Sentinel object>, name=None)
Bases: Module
An MLP-based critic network for SAC.
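A SAC critic estimates Q(s, a), so it typically consumes the observation and the action together. The page does not document the layer layout, so the following plain-Python sketch assumes the simplest arrangement: concatenate both inputs, apply one hidden layer, and emit a scalar. The helper names, the tanh activation, and the initialization are hypothetical; only `hidden_size=64` is taken from the signature above.

```python
import math
import random


def init_linear(rng, n_in, n_out):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply a dense layer to a list of floats."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def q_value(params, obs, action):
    """Assumed critic layout: concat(obs, action) -> hidden -> scalar Q."""
    x = list(obs) + list(action)
    hidden = [math.tanh(v) for v in linear(params["hidden"], x)]
    return linear(params["out"], hidden)[0]


rng = random.Random(0)
obs_dim, action_dim, hidden_size = 3, 2, 64
params = {
    "hidden": init_linear(rng, obs_dim + action_dim, hidden_size),
    "out": init_linear(rng, hidden_size, 1),
}
q = q_value(params, obs=[0.1, -0.2, 0.3], action=[0.5, -0.5])
```

A vectorized critic, as listed in the class summary, would evaluate several such networks with independent parameters on the same input. SAC commonly takes the minimum over their outputs to reduce overestimation.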