Queue Monitor
A QueueMonitor is a monitor for the scheduler queue. It tracks the queue state at every event emitted by the scheduler. The data can be converted to a pandas DataFrame or plotted as a stacked bar chart.
Monitoring Frequency
To avoid repeated polling, the monitor samples the scheduler queue at every scheduler event.
The queue is only modified when one of these events occurs, so there is no need
to poll it at a fixed interval. However, if you need finer-grained updates, you can
add extra events or timings at which the monitor should update().
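The idea of sampling on events rather than on a timer can be sketched without amltk at all. The `ToyScheduler` and `ToyMonitor` classes below are hypothetical stand-ins invented for illustration, not the library's API; they only show that one sample per event captures every queue change, and that calling `update()` by hand adds an extra sample.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyScheduler:
    """Hypothetical stand-in for a scheduler: a queue plus event listeners."""

    queue: list = field(default_factory=list)
    listeners: list = field(default_factory=list)

    def _emit(self) -> None:
        # Notify every registered listener that the queue has changed.
        for listener in self.listeners:
            listener()

    def submit(self, name: str) -> None:
        self.queue.append(name)
        self._emit()

    def finish(self) -> None:
        self.queue.pop(0)
        self._emit()


@dataclass
class ToyMonitor:
    """Hypothetical monitor that records the queue size on every event."""

    scheduler: ToyScheduler
    samples: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Subscribe once; no polling loop is needed.
        self.scheduler.listeners.append(self.update)

    def update(self) -> None:
        self.samples.append(len(self.scheduler.queue))


scheduler = ToyScheduler()
monitor = ToyMonitor(scheduler)

scheduler.submit("a")  # event: queue size becomes 1
scheduler.submit("b")  # event: queue size becomes 2
scheduler.finish()     # event: queue size becomes 1
monitor.update()       # extra, manually triggered sample

print(monitor.samples)  # [1, 2, 1, 1]
```

Because the queue can only change inside `submit` and `finish`, the event-driven samples miss nothing; the manual call is only useful when you want a reading at a moment that is not tied to an event.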
Performance impact
If your tasks and callbacks are very fast (under roughly 10 ms), the monitor has a non-negligible impact; for most use cases, however, this should not be a problem. As with anything, you should profile how much work the scheduler gets done with and without the monitor to see whether it is a problem for your use case.
In the example below, a very fast function runs on repeat, sometimes too fast for the scheduler to keep up, so some futures build up while waiting to be processed.
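A with-and-without comparison of this kind can be sketched in plain Python. The `run_workload` and `record_sample` names below are hypothetical and the loop is only a stand-in for a scheduler running very fast tasks; the point is the shape of the measurement, not the absolute numbers.

```python
import time
from typing import Callable, Optional


def run_workload(n_events: int, on_event: Optional[Callable[[], None]] = None) -> float:
    """Run n_events tiny stand-in tasks and return the elapsed seconds."""
    start = time.perf_counter()
    total = 0
    for i in range(n_events):
        total += i + 1  # stand-in for a very fast task
        if on_event is not None:
            on_event()  # stand-in for the monitor sampling on each event
    return time.perf_counter() - start


samples = []


def record_sample() -> None:
    # Hypothetical monitor callback: store one timestamp per event.
    samples.append(time.perf_counter())


N_EVENTS = 10_000
baseline = run_workload(N_EVENTS)
monitored = run_workload(N_EVENTS, on_event=record_sample)

print(f"baseline:  {baseline:.6f}s")
print(f"monitored: {monitored:.6f}s")
print(f"samples recorded: {len(samples)}")
```

With a real scheduler, the analogous check would be to run the same workload for the same timeout with and without the monitor attached and compare how many tasks finished.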
import time
import matplotlib.pyplot as plt
from amltk.scheduling import Scheduler
from amltk.scheduling.queue_monitor import QueueMonitor

def fast_function(x: int) -> int:
    return x + 1

N_WORKERS = 2
scheduler = Scheduler.with_processes(N_WORKERS)
monitor = QueueMonitor(scheduler)
task = scheduler.task(fast_function)

# Submit one task per worker when the scheduler starts.
@scheduler.on_start(repeat=N_WORKERS)
def start():
    task.submit(1)

# Resubmit on every result for as long as the scheduler is running.
@task.on_result
def result(_, x: int):
    if scheduler.running():
        task.submit(x)

scheduler.run(timeout=1)
df = monitor.df()
print(df)
                               queue_size  queued  finished  cancelled  idle
time
2024-08-13 07:34:55.853111673           0       0         0          0     2
2024-08-13 07:34:55.872440325           1       1         0          0     1
2024-08-13 07:34:55.872990541           2       2         0          0     0
2024-08-13 07:34:55.877353082           1       0         1          0     1
2024-08-13 07:34:55.877504454           2       1         1          0     0
...                                   ...     ...       ...        ...   ...
2024-08-13 07:34:56.873615563           2       2         0          0     0
2024-08-13 07:34:56.873897389           2       2         0          0     0
2024-08-13 07:34:56.885472047           2       2         0          0     0
2024-08-13 07:34:56.885893213           1       0         1          0     1
2024-08-13 07:34:56.885936895           0       0         0          0     2

[4859 rows x 5 columns]
We can also plot() the data as a stacked bar chart with a set interval.
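To give a feel for what aggregating event-level samples over a set interval might involve, here is a minimal standard-library sketch. It is not how plot() is implemented; the `bucket_means` helper and the sample values are hypothetical, and averaging is just one possible way to summarise each interval.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bucket_means(samples, interval):
    """Group (timestamp, value) pairs into fixed-width intervals and average each."""
    origin = samples[0][0]
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Dividing two timedeltas gives a float; truncating gives the bucket index.
        index = int((timestamp - origin) / interval)
        buckets[index].append(value)
    return {
        origin + index * interval: sum(values) / len(values)
        for index, values in sorted(buckets.items())
    }


# Hypothetical queue-size samples taken at irregular event times.
t0 = datetime(2024, 8, 13, 7, 34, 55)
queue_sizes = [
    (t0 + timedelta(milliseconds=0), 0),
    (t0 + timedelta(milliseconds=20), 1),
    (t0 + timedelta(milliseconds=40), 2),
    (t0 + timedelta(milliseconds=120), 1),
    (t0 + timedelta(milliseconds=150), 2),
    (t0 + timedelta(milliseconds=210), 0),
]

means = bucket_means(queue_sizes, timedelta(milliseconds=100))
for bucket_start, mean in means.items():
    print(bucket_start.time(), mean)
```

Each interval then corresponds to one bar, which keeps a plot readable even when the monitor has recorded thousands of event-level rows.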