Functionality through Wrappers
To provide additional functionality to environments without changing their interface, we can use so-called wrappers. They execute environment resets and steps internally, but can either alter the environment's behavior (e.g. by adding noise) or record information about the environment. Wrapping an existing environment is simple:
from dacbench.wrappers import PerformanceTrackingWrapper

# `env` is an existing DACBench environment created beforehand.
wrapped_env = PerformanceTrackingWrapper(env)
The provided wrappers for tracking performance, state and action information are designed to be used with DACBench’s logging functionality.
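To make the idea concrete, here is a minimal, dependency-free sketch of the wrapper pattern described above. `ToyEnv` and `RewardLogWrapper` are hypothetical stand-ins and not DACBench classes; the sketch only assumes the `reset`/`step` return signature documented on this page.

```python
class ToyEnv:
    """Hypothetical environment: episodes last three steps, reward equals the action."""

    def __init__(self):
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return [0.0], {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= 3
        return [float(self.t)], float(action), terminated, False, {}


class RewardLogWrapper:
    """Hypothetical wrapper: same interface as the env, but records every reward."""

    def __init__(self, env):
        self.env = env
        self.rewards = []

    def reset(self, seed=None, options=None):
        return self.env.reset(seed=seed, options=options)

    def step(self, action):
        state, reward, terminated, truncated, info = self.env.step(action)
        self.rewards.append(reward)  # record without altering the result
        return state, reward, terminated, truncated, info


wrapped_env = RewardLogWrapper(ToyEnv())
state, info = wrapped_env.reset()
terminated = False
while not terminated:
    state, reward, terminated, truncated, info = wrapped_env.step(1)
```

The calling code is identical with or without the wrapper, which is what allows wrappers to be stacked or removed freely.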
- class dacbench.wrappers.ActionFrequencyWrapper(env, action_interval=None, logger=None)
Bases:
Wrapper
Wrapper to track action frequency.
Includes an interval mode that returns frequencies in lists of len(interval) instead of one long list.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
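The attribute forwarding described by `__getattribute__` and `__setattr__` can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `Env`, `ForwardingWrapper` and the `_OWN` tuple of wrapper-owned names are hypothetical.

```python
class Env:
    """Hypothetical stand-in for a wrapped environment."""

    def __init__(self):
        self.instance = "instance-0"


class ForwardingWrapper:
    # Names that live on the wrapper itself; everything else goes to the env.
    _OWN = ("env", "log", "_OWN")

    def __init__(self, env):
        self.env = env
        self.log = []

    def __getattribute__(self, name):
        # object.__getattribute__ avoids infinite recursion into this method.
        if name in object.__getattribute__(self, "_OWN"):
            return object.__getattribute__(self, name)
        return getattr(object.__getattribute__(self, "env"), name)

    def __setattr__(self, name, value):
        if name in object.__getattribute__(self, "_OWN"):
            object.__setattr__(self, name, value)
        else:
            setattr(object.__getattribute__(self, "env"), name, value)


wrapper = ForwardingWrapper(Env())
wrapper.instance = "instance-1"  # not a wrapper attribute, so it is set on the env
```

With this scheme, code that reads or writes environment attributes keeps working unchanged when it is handed a wrapped environment.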
- get_actions()
Get action progression.
- Returns:
np.array or np.array, np.array – all actions, or all actions and interval-sorted actions
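As a rough illustration of the interval mode, the helper below groups a flat history into lists of a fixed length. `group_by_interval` is a hypothetical function, and keeping a shorter final list for leftover entries is an assumption rather than documented behavior.

```python
def group_by_interval(history, interval):
    """Split a flat history into consecutive lists of length `interval`.

    The last list may be shorter if the history does not divide evenly.
    """
    return [history[i:i + interval] for i in range(0, len(history), interval)]


actions = [0, 2, 1, 1, 0, 2, 2]
grouped = group_by_interval(actions, interval=3)
# grouped == [[0, 2, 1], [1, 0, 2], [2]]
```

The same grouping idea applies to the time, performance and state tracking wrappers, which document an analogous interval mode.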
- class dacbench.wrappers.EpisodeTimeWrapper(env, time_interval=None, logger=None)
Bases:
Wrapper
Wrapper to track time spent per episode.
Includes an interval mode that returns times in lists of len(interval) instead of one long list.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
- get_times()
Get times.
- Returns:
np.array or np.array, np.array – all times, or all times and interval-sorted times
- step(action)
Execute environment step and record time.
- Parameters:
action (int) – Action to execute
- Returns:
np.array, float, bool, bool, dict – state, reward, terminated, truncated, metainfo
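A minimal sketch of per-episode timing around `step` is shown below. `ShortEnv` and `EpisodeTimer` are hypothetical; the sketch assumes that only time spent inside `env.step` counts toward an episode, which may differ from what the actual wrapper measures.

```python
import time


class ShortEnv:
    """Hypothetical environment whose episodes end after two steps."""

    def __init__(self):
        self.t = 0

    def reset(self, seed=None, options=None):
        self.t = 0
        return [0.0], {}

    def step(self, action):
        time.sleep(0.001)  # simulate work
        self.t += 1
        return [float(self.t)], 0.0, self.t >= 2, False, {}


class EpisodeTimer:
    """Hypothetical wrapper that sums step durations per episode."""

    def __init__(self, env):
        self.env = env
        self.episode_times = []
        self._current = 0.0

    def reset(self, seed=None, options=None):
        self._current = 0.0
        return self.env.reset(seed=seed, options=options)

    def step(self, action):
        start = time.perf_counter()
        state, reward, terminated, truncated, info = self.env.step(action)
        self._current += time.perf_counter() - start
        if terminated or truncated:
            # Episode finished: store its total and start over.
            self.episode_times.append(self._current)
            self._current = 0.0
        return state, reward, terminated, truncated, info


timed_env = EpisodeTimer(ShortEnv())
for _ in range(2):  # run two episodes
    timed_env.reset()
    done = False
    while not done:
        _, _, terminated, truncated, _ = timed_env.step(0)
        done = terminated or truncated
```

`time.perf_counter` is used because it is monotonic and intended for measuring short durations.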
- class dacbench.wrappers.InstanceSamplingWrapper(env, sampling_function=None, instances=None, reset_interval=0)
Bases:
Wrapper
Wrapper to sample a new instance at a given time point.
Instances can either be sampled using a given method or from a distribution inferred from a given list of instances.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
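One way to infer a distribution from a list of instances is to fit an independent normal distribution to each feature and sample from those fits. The sketch below does exactly that with the standard library; `fit_sampler` is a hypothetical helper, and both the normal fit and the feature independence are assumptions made for illustration.

```python
import random
import statistics


def fit_sampler(instances, rng):
    """Return a function that samples instances from per-feature normal fits.

    Assumption: each instance is a sequence of floats and features are independent.
    """
    columns = list(zip(*instances))  # one tuple of values per feature
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]

    def sample():
        return [rng.gauss(m, s) for m, s in zip(means, stdevs)]

    return sample


known_instances = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
sample_instance = fit_sampler(known_instances, random.Random(0))
new_instance = sample_instance()
```

A function like `sample_instance` takes no arguments and returns one instance per call, which is the shape a custom sampling function would plausibly need.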
- class dacbench.wrappers.MultiDiscreteActionWrapper(env)
Bases:
Wrapper
Wrapper to cast MultiDiscrete action spaces to Discrete. This should improve usability with standard RL libraries.
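Casting a MultiDiscrete space to a Discrete one amounts to enumerating every combination of sub-actions and addressing each combination by its index. The sketch below shows that mapping; `build_action_mapper` is a hypothetical helper, and the enumeration order is an assumption.

```python
import itertools


def build_action_mapper(nvec):
    """Enumerate every combination of a MultiDiscrete space with sizes `nvec`.

    Returns a list whose index is the flat Discrete action and whose value
    is the corresponding multi-dimensional action.
    """
    return list(itertools.product(*(range(n) for n in nvec)))


action_mapper = build_action_mapper([2, 3])
# The flat Discrete space has len(action_mapper) == 6 actions.
multi_action = action_mapper[4]
# multi_action == (1, 1)
```

An agent then picks an integer in `range(len(action_mapper))`, and the wrapper looks up the tuple to pass to the underlying environment.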
- class dacbench.wrappers.ObservationWrapper(env)
Bases:
Wrapper
Wrapper to convert observation spaces to spaces.Box for convenience.
Currently only supports Dict -> Box.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
- reset(seed=None, options=None)
Reset the environment and convert the returned observation.
- Returns:
np.array, dict – state, info
- step(action)
Execute environment step and convert the returned observation.
- Parameters:
action (int) – Action to execute
- Returns:
np.array, float, bool, bool, dict – state, reward, terminated, truncated, metainfo
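The Dict -> Box conversion can be pictured as flattening a dictionary observation into a single vector. The following sketch uses plain lists in place of arrays; `flatten_observation` and the example keys are hypothetical.

```python
def flatten_observation(obs):
    """Concatenate the values of a dict observation into one flat list.

    Keys are visited in insertion order, so the layout is stable as long
    as the environment builds its dicts consistently.
    """
    flat = []
    for value in obs.values():
        if isinstance(value, (list, tuple)):
            flat.extend(float(v) for v in value)
        else:
            flat.append(float(value))
    return flat


obs = {"remaining_budget": 7, "loss_history": [0.5, 0.25]}
flat_obs = flatten_observation(obs)
# flat_obs == [7.0, 0.5, 0.25]
```

Applying the same conversion in both `reset` and `step` keeps the observation layout consistent across an episode.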
- class dacbench.wrappers.PerformanceTrackingWrapper(env, performance_interval=None, track_instance_performance=True, logger=None)
Bases:
Wrapper
Wrapper to track episode performance.
Includes an interval mode that returns performance in lists of len(interval) instead of one long list.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
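The bookkeeping behind episode and per-instance performance tracking might look like the sketch below. `PerformanceTracker`, its attribute names and the string instance identifiers are hypothetical, and defining performance as the sum of rewards in an episode is an assumption.

```python
from collections import defaultdict


class PerformanceTracker:
    """Hypothetical bookkeeping for episode returns, overall and per instance."""

    def __init__(self):
        self.overall = []
        self.per_instance = defaultdict(list)
        self._episode_return = 0.0

    def record_step(self, reward, done, instance_id):
        self._episode_return += reward
        if done:
            # Episode finished: store its return and reset the accumulator.
            self.overall.append(self._episode_return)
            self.per_instance[instance_id].append(self._episode_return)
            self._episode_return = 0.0


tracker = PerformanceTracker()
# Two episodes of two steps each, on different (made-up) instances.
for instance_id, rewards in [("a", [1.0, 2.0]), ("b", [0.5, 0.5])]:
    for i, reward in enumerate(rewards):
        tracker.record_step(reward, done=(i == len(rewards) - 1), instance_id=instance_id)
# tracker.overall == [3.0, 1.0]
```

Keeping a separate list per instance makes it possible to compare how a policy performs on individual instances rather than only on average.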
- class dacbench.wrappers.PolicyProgressWrapper(env, compute_optimal)
Bases:
Wrapper
Wrapper to track progress towards optimal policy.
Can only be used if there is a way to compute the optimal policy for a given instance.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
- step(action)
Execute environment step and record the distance to the optimal policy.
- Parameters:
action (int) – Action to execute
- Returns:
np.array, float, bool, bool, dict – state, reward, terminated, truncated, metainfo
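To illustrate what recording a distance to the optimal policy can mean, the sketch below compares the actions taken in an episode with the optimal actions for the instance. The toy `compute_optimal` oracle is hypothetical, and using the Euclidean distance between action sequences is an assumption.

```python
import math


def compute_optimal(instance):
    """Hypothetical oracle: the optimal action at every step equals the instance value."""
    return [instance] * 3


def policy_distance(episode_actions, optimal_actions):
    """Euclidean distance between the executed and the optimal action sequence."""
    return math.dist(episode_actions, optimal_actions)


progress = []
# Three episodes on the same instance, each closer to the optimal policy.
for episode_actions in ([0, 0, 0], [1, 2, 2], [2, 2, 2]):
    progress.append(policy_distance(episode_actions, compute_optimal(2)))
```

A distance that shrinks toward zero over episodes indicates that the learned policy is approaching the optimal one.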
- class dacbench.wrappers.RewardNoiseWrapper(env, noise_function=None, noise_dist='standard_normal', dist_args=None)
Bases:
Wrapper
Wrapper to add noise to the reward signal.
Noise can be sampled from a custom distribution or any distribution in numpy’s random module.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
- add_noise(dist, args)
Make noise function from distribution name and arguments.
- Parameters:
dist (str) – Name of distribution
args (list) – List of distribution arguments
- Returns:
function – Noise sampling function
- step(action)
Execute environment step and add noise.
- Parameters:
action (int) – Action to execute
- Returns:
np.array, float, bool, bool, dict – state, reward, terminated, truncated, metainfo
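Building a noise function from a distribution name and its arguments can be sketched as a lookup by name followed by binding the arguments. To stay dependency-free, this sketch swaps in Python's standard `random` module for numpy's random module; `make_noise_function` and `noisy_step` are hypothetical helpers, and additive noise is an assumption.

```python
import random


def make_noise_function(dist, args, rng):
    """Look up a sampling method by name and bind its arguments.

    Assumption: `dist` names a method of random.Random (e.g. "gauss",
    "uniform"); the wrapper documented here refers to numpy's random module.
    """
    sampler = getattr(rng, dist)
    return lambda: sampler(*args)


def noisy_step(step_result, noise_function):
    """Add sampled noise to the reward of a five-tuple step result."""
    state, reward, terminated, truncated, info = step_result
    return state, reward + noise_function(), terminated, truncated, info


noise = make_noise_function("uniform", [-0.1, 0.1], random.Random(0))
clean_result = ([0.0], 1.0, False, False, {})
state, reward, terminated, truncated, info = noisy_step(clean_result, noise)
```

Only the reward is perturbed; the state, termination flags and info dictionary pass through unchanged.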
- class dacbench.wrappers.StateTrackingWrapper(env, state_interval=None, logger=None)
Bases:
Wrapper
Wrapper to track state changes over time.
Includes an interval mode that returns states in lists of len(interval) instead of one long list.
- __getattribute__(name)
Get attribute value of wrapper if available and of env if not.
- Parameters:
name (str) – Attribute to get
- Returns:
value – Value of given name
- __setattr__(name, value)
Set attribute in wrapper if available and in env if not.
- Parameters:
name (str) – Attribute to set
value – Value to set attribute to
- get_states()
Get state progression.
- Returns:
np.array or np.array, np.array – all states, or all states and interval-sorted states
- render_state_tracking()
Render state progression.
- Returns:
np.array – RGB data of state tracking