arlbench.core.algorithms.prioritised_item_buffer
Prioritised replay buffer.
Functions
| create_prioritised_item_buffer(...) | Creates a prioritised trajectory buffer that acts as an independent item buffer. |
| make_prioritised_item_buffer(...) | Makes a prioritised trajectory buffer act as an independent item buffer. |
- arlbench.core.algorithms.prioritised_item_buffer.create_prioritised_item_buffer(max_length, min_length, sample_batch_size, add_sequences, add_batches, priority_exponent, device)
Creates a prioritised trajectory buffer that acts as an independent item buffer.
- Parameters:
max_length (int) – The maximum length of the buffer.
min_length (int) – The minimum length of the buffer.
sample_batch_size (int) – The batch size of the samples.
add_sequences (Optional[bool], optional) – Whether data is being added in sequences to the buffer. If False, single items are being added each time add is called. Defaults to False.
add_batches (Optional[bool], optional) – Whether data is being added in batches to the buffer. If False, single items (or single sequences of items) are being added each time add is called. Defaults to False.
priority_exponent (float) – Priority exponent for sampling. Equivalent to alpha in the Prioritized Experience Replay (PER) paper.
device (str) – “tpu”, “gpu” or “cpu”. Depending on the chosen device, more efficient functions will be used to perform the buffer operations.
- Return type:
PrioritisedTrajectoryBuffer
Returns: The buffer.
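The role of priority_exponent can be illustrated with a minimal sketch in plain Python. It assumes the standard proportional rule from the PER paper, P(i) = p_i^alpha / sum_k p_k^alpha, and is not taken from the arlbench implementation; the helper names sampling_probabilities and sample_indices are hypothetical.

```python
import random


def sampling_probabilities(priorities, priority_exponent=0.6):
    """Return P(i) = p_i**alpha / sum_k p_k**alpha for each stored item.

    Hypothetical helper for illustration only; not part of arlbench.
    """
    if not priorities:
        raise ValueError("priorities must not be empty")
    if any(p <= 0 for p in priorities):
        raise ValueError("priorities must be strictly positive")
    # Raising to the exponent flattens (alpha < 1) or keeps (alpha = 1)
    # the differences between priorities; alpha = 0 gives uniform sampling.
    scaled = [p ** priority_exponent for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, batch_size, priority_exponent=0.6, seed=0):
    """Draw batch_size item indices (with replacement) according to P(i)."""
    rng = random.Random(seed)
    probs = sampling_probabilities(priorities, priority_exponent)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)


if __name__ == "__main__":
    example_priorities = [1.0, 2.0, 4.0]
    print(sampling_probabilities(example_priorities, priority_exponent=0.0))
    print(sampling_probabilities(example_priorities, priority_exponent=1.0))
    print(sample_indices(example_priorities, batch_size=5))
```

With an exponent of 0 every item is equally likely to be sampled, while an exponent of 1 samples in direct proportion to the stored priorities; values in between, such as 0.6, interpolate between the two.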
- arlbench.core.algorithms.prioritised_item_buffer.make_prioritised_item_buffer(max_length, min_length, sample_batch_size, add_sequences=False, add_batches=False, priority_exponent=0.6, device='cpu')
Makes a prioritised trajectory buffer act as an independent item buffer.
- Parameters:
max_length (int) – The maximum length of the buffer.
min_length (int) – The minimum length of the buffer.
sample_batch_size (int) – The batch size of the samples.
add_sequences (Optional[bool], optional) – Whether data is being added in sequences to the buffer. If False, single items are being added each time add is called. Defaults to False.
add_batches (Optional[bool], optional) – Whether data is being added in batches to the buffer. If False, single transitions or single sequences are being added each time add is called. Defaults to False.
priority_exponent (float) – Priority exponent for sampling. Equivalent to alpha in the Prioritized Experience Replay (PER) paper.
device (str) – “tpu”, “gpu” or “cpu”. Depending on the chosen device, more efficient functions will be used to perform the buffer operations.
- Return type:
PrioritisedTrajectoryBuffer
Returns: The buffer.
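A call to make_prioritised_item_buffer might look like the sketch below. The keyword names and the defaults for add_sequences, add_batches, priority_exponent and device are copied from the signature above; the values for max_length, min_length and sample_batch_size are illustrative assumptions, and try_make_buffer is a hypothetical wrapper that guards the import so the snippet also runs where arlbench is not installed.

```python
# Keyword arguments named after the documented parameters. The three size
# values are illustrative assumptions; the rest are the documented defaults.
buffer_kwargs = {
    "max_length": 10_000,
    "min_length": 64,
    "sample_batch_size": 32,
    "add_sequences": False,
    "add_batches": False,
    "priority_exponent": 0.6,
    "device": "cpu",
}


def try_make_buffer(kwargs):
    """Build the buffer if arlbench is importable, otherwise return None.

    Hypothetical wrapper for illustration only; not part of arlbench.
    """
    try:
        from arlbench.core.algorithms.prioritised_item_buffer import (
            make_prioritised_item_buffer,
        )
    except ImportError:
        return None
    return make_prioritised_item_buffer(**kwargs)


buffer = try_make_buffer(buffer_kwargs)
if buffer is None:
    print("arlbench is not installed; no buffer was created")
```

Passing the arguments by keyword keeps the call readable and avoids depending on parameter order.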