SAC
mighty.mighty_models.sac
SACModel
SACModel(
obs_size: int,
action_size: int,
hidden_sizes: list[int] = [64, 64],
activation: str = "relu",
log_std_min: float = -20,
log_std_max: float = 2,
)
Bases: Module
SAC Model with squashed Gaussian policy and twin Q-networks.
Source code in mighty/mighty_models/sac.py
forward
Forward pass for policy sampling.
RETURNS | DESCRIPTION |
---|---|
action | torch.Tensor in [-1, 1] |
z | Raw Gaussian sample before tanh |
mean | Gaussian mean |
log_std | Gaussian log std |
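The four return values follow the usual squashed-Gaussian recipe. The sketch below illustrates that recipe on plain Python scalars so the arithmetic is easy to follow. It is not the code in mighty/mighty_models/sac.py, which operates on torch tensors. The helper name `sample_squashed_gaussian` is hypothetical, and clamping `log_std` to `[log_std_min, log_std_max]` is an assumption about how the constructor bounds are applied.

```python
import math
import random


def sample_squashed_gaussian(mean, log_std, log_std_min=-20.0, log_std_max=2.0, rng=random):
    """Scalar sketch of squashed-Gaussian sampling (hypothetical helper)."""
    # Assumption: the bounds from the constructor are applied by clamping.
    log_std = max(log_std_min, min(log_std_max, log_std))
    std = math.exp(log_std)
    # Reparameterised draw: z = mean + std * eps, with eps ~ N(0, 1).
    z = mean + std * rng.gauss(0.0, 1.0)
    # tanh squashes the unbounded sample into [-1, 1].
    action = math.tanh(z)
    return action, z, mean, log_std


# A seeded generator keeps the draw reproducible.
action, z, mean, log_std = sample_squashed_gaussian(0.3, 5.0, rng=random.Random(0))
```

Here the requested `log_std` of 5.0 exceeds the default upper bound, so the returned `log_std` is 2.0, and `action` equals `tanh(z)`.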
policy_log_prob
Compute log-prob of action a = tanh(z), correcting for tanh transform.
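For reference, the standard change-of-variables correction is log pi(a) = log N(z; mean, std) - log(1 - tanh(z)^2), summed over action dimensions. The scalar sketch below shows one numerically stable way to evaluate it. The helper names `softplus` and `squashed_log_prob` are hypothetical, and whether the library uses this stable form or a different one (for example, adding a small epsilon inside the log) is not stated here.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def squashed_log_prob(z, mean, log_std):
    """Log-density of a = tanh(z) for one action dimension (hypothetical helper)."""
    std = math.exp(log_std)
    # Log-density of z under N(mean, std^2).
    gauss = -0.5 * ((z - mean) / std) ** 2 - log_std - 0.5 * math.log(2.0 * math.pi)
    # log(1 - tanh(z)^2) rewritten as 2 * (log 2 - z - softplus(-2z)),
    # which avoids log(0) when |z| is large.
    correction = 2.0 * (math.log(2.0) - z - softplus(-2.0 * z))
    return gauss - correction


log_prob = squashed_log_prob(z=0.5, mean=0.0, log_std=0.0)
```

For a multi-dimensional action, the per-dimension values would be summed to obtain the joint log-probability.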