arlbench.core.algorithms.prioritised_item_buffer

Prioritised replay buffer.

Functions

`create_prioritised_item_buffer`(max_length, ...)	Creates a prioritised trajectory buffer that acts as an independent item buffer.
`make_prioritised_item_buffer`(max_length, ...)	Makes a prioritised trajectory buffer act as an independent item buffer.

arlbench.core.algorithms.prioritised_item_buffer.create_prioritised_item_buffer(max_length, min_length, sample_batch_size, add_sequences, add_batches, priority_exponent, device)

Creates a prioritised trajectory buffer that acts as an independent item buffer.

Parameters:

max_length (int) – The maximum length of the buffer.
min_length (int) – The minimum length of the buffer.
sample_batch_size (int) – The batch size of the samples.
add_sequences (Optional[bool], optional) – Whether data is added to the buffer in sequences. If False, a single item is added each time add is called. Defaults to False.
add_batches (bool, optional) – Whether data is added to the buffer in batches. If False, a single item (or a single sequence of items) is added each time add is called. Defaults to False.
priority_exponent (float) – Priority exponent for sampling. Equivalent to alpha in the Prioritized Experience Replay (PER) paper.
device (str) – “tpu”, “gpu” or “cpu”. The buffer operations use implementations optimised for the chosen device.

Return type:

PrioritisedTrajectoryBuffer

Returns: The buffer.

arlbench.core.algorithms.prioritised_item_buffer.make_prioritised_item_buffer(max_length, min_length, sample_batch_size, add_sequences=False, add_batches=False, priority_exponent=0.6, device='cpu')

Makes a prioritised trajectory buffer act as an independent item buffer.

Parameters:

max_length (int) – The maximum length of the buffer.
min_length (int) – The minimum length of the buffer.
sample_batch_size (int) – The batch size of the samples.
add_sequences (Optional[bool], optional) – Whether data is added to the buffer in sequences. If False, a single item is added each time add is called. Defaults to False.
add_batches (bool, optional) – Whether data is added to the buffer in batches. If False, a single transition or a single sequence is added each time add is called. Defaults to False.
priority_exponent (float) – Priority exponent for sampling. Equivalent to alpha in the Prioritized Experience Replay (PER) paper. Defaults to 0.6.
device (str) – “tpu”, “gpu” or “cpu”. The buffer operations use implementations optimised for the chosen device. Defaults to “cpu”.

Return type:

PrioritisedTrajectoryBuffer

Returns: The buffer.
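
The effect of priority_exponent can be illustrated with a small standalone sketch. It assumes the proportional form of prioritised replay, in which item i is drawn with probability p_i^alpha / sum_k p_k^alpha. The helper names sampling_probabilities and sample_indices are hypothetical and are not part of ARLBench; the buffer's actual implementation may differ (for example, by using device-specific data structures).

```python
import random


def sampling_probabilities(priorities, priority_exponent):
    """Hypothetical helper: P(i) = p_i**alpha / sum_k p_k**alpha.

    Assumes non-negative priorities with at least one positive entry.
    """
    scaled = [p ** priority_exponent for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, priority_exponent, batch_size, seed=0):
    """Hypothetical helper: draw batch_size indices (with replacement)
    according to the probabilities above."""
    probs = sampling_probabilities(priorities, priority_exponent)
    rng = random.Random(seed)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)


priorities = [1.0, 2.0, 4.0, 8.0]

# alpha = 0 ignores the priorities entirely (uniform sampling).
uniform = sampling_probabilities(priorities, 0.0)

# alpha = 1 samples in direct proportion to the priorities.
proportional = sampling_probabilities(priorities, 1.0)

# alpha = 0.6 (the default of make_prioritised_item_buffer) lies in between.
default = sampling_probabilities(priorities, 0.6)

batch = sample_indices(priorities, 0.6, batch_size=5)
```

With priority_exponent set to 0 every stored item is equally likely to be sampled; larger values concentrate sampling on high-priority items.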
