Profiler
Whether for debugging, building an AutoML system or for optimization purposes, we provide a powerful Profiler, which can generate a Profile of different sections of code. This is particularly useful with Trials, so much so that we attach one to every Trial made, as trial.profiler.
When done profiling, you can export all generated profiles as a dataframe using profiler.df().
from amltk.profiling import Profiler
import numpy as np

profiler = Profiler()

# Each named block is recorded as its own profile
with profiler("loading-data"):
    X = np.random.rand(1000, 1000)

with profiler("training-model"):
    model = np.linalg.inv(X)

with profiler("predicting"):
    y = model @ X

# Export all recorded profiles as a dataframe
print(profiler.df())
You'll find these profiles as keys in the Profiler, e.g. profiler["loading-data"].
This measures both the time spent within the block and the memory in use before and after the block finishes, allowing you to estimate the memory the block consumed.
Memory, vms vs rss
While not entirely accurate, this should be enough information for most use cases.
Suppose the main process uses 2GB of memory and then spawns a new process in which you are profiling, as you might do from a Task. If you use another 2GB on top of that in the new process, then:
- The virtual memory size (vms) will show 4GB, as the new process shares the 2GB with the main process and has its own 2GB.
- The resident set size (rss) will show 2GB, as the new process only has 2GB of its own memory.
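To make the before/after idea concrete, here is a minimal sketch of a context manager that records the elapsed time and the change in memory across a block. It is not amltk's implementation: `measure_block` and `Measurement` are hypothetical names, and it relies on the standard library's `tracemalloc`, which tracks Python-level allocations rather than the process-level vms and rss figures discussed above.

```python
import time
import tracemalloc
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class Measurement:
    """Hypothetical result container; values are only valid after the block exits."""

    name: str
    seconds: float = 0.0
    memory_diff_bytes: int = 0


@contextmanager
def measure_block(name):
    """Hypothetical helper (not part of amltk): time a block and diff traced memory."""
    started_tracing = not tracemalloc.is_tracing()
    if started_tracing:
        tracemalloc.start()
    result = Measurement(name)
    mem_before, _ = tracemalloc.get_traced_memory()  # snapshot before the block
    t_before = time.perf_counter()
    try:
        yield result
    finally:
        result.seconds = time.perf_counter() - t_before
        mem_after, _ = tracemalloc.get_traced_memory()  # snapshot after the block
        result.memory_diff_bytes = mem_after - mem_before
        if started_tracing:
            tracemalloc.stop()


with measure_block("allocate") as m:
    data = bytearray(10_000_000)  # roughly 10 MB that stays alive after the block

print(m.name, m.seconds, m.memory_diff_bytes)
```

Because only two snapshots are taken, memory that is allocated and freed again inside the block does not show up in the difference, which is one reason such figures are an estimate.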
If you need to profile some iterator, like a for loop, you can use Profiler.each(), which will measure the entire loop but also each individual iteration. This can be useful for iterating batches of a deep-learning model, splits of a cross-validator or really any loop with work you want to profile.
from amltk.profiling import Profiler
import numpy as np

profiler = Profiler()

# Profiles the whole loop as well as each individual iteration
for i in profiler.each(range(3), name="for-loop"):
    X = np.random.rand(1000, 1000)

print(profiler.df())
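The idea of measuring a whole loop and each iteration can be sketched with a plain generator. This is an illustration only, not how amltk implements Profiler.each(): the function `each`, the `records` dictionary and the `name:index` key format are all assumptions made for this example, and only time is measured.

```python
import time


def each(iterable, *, name, records):
    """Hypothetical stand-in: time the whole loop and every iteration."""
    loop_start = time.perf_counter()
    iterator = iter(iterable)
    index = 0
    while True:
        item_start = time.perf_counter()
        try:
            item = next(iterator)
        except StopIteration:
            break
        # Control returns here only when the caller asks for the next item,
        # so the interval below includes the caller's loop body.
        yield item
        records[f"{name}:{index}"] = time.perf_counter() - item_start
        index += 1
    # Recorded once the iterable is exhausted
    records[name] = time.perf_counter() - loop_start


records = {}
for batch in each(range(3), name="for-loop", records=records):
    sum(range(10_000))  # stand-in for real work

print(list(records))
```

In this sketch, breaking out of the loop early closes the generator before the pending records are written, so a more careful version would need a `try`/`finally` around the `yield`.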
Lastly, to disable profiling without editing much code, you can always use Profiler.disable() and Profiler.enable() to toggle profiling on and off.
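A toggle like this lets the `with` blocks stay in place while the measurement is skipped. The sketch below shows one way such a switch could work. `ToggleableProfiler` is a hypothetical, simplified class written for this example. How amltk treats a disabled profiler internally is not described here, so the behavior shown (nothing is recorded while disabled) is an assumption.

```python
import time
from contextlib import contextmanager


class ToggleableProfiler:
    """Hypothetical, simplified profiler used only to illustrate toggling."""

    def __init__(self):
        self.enabled = True
        self.durations = {}

    def disable(self):
        self.enabled = False

    def enable(self):
        self.enabled = True

    @contextmanager
    def __call__(self, name):
        if not self.enabled:
            # Disabled: run the block but record nothing
            yield
            return
        start = time.perf_counter()
        try:
            yield
        finally:
            self.durations[name] = time.perf_counter() - start


profiler = ToggleableProfiler()

with profiler("recorded"):
    pass

profiler.disable()
with profiler("skipped"):  # same code, no measurement taken
    pass

profiler.enable()
with profiler("recorded-again"):
    pass

print(sorted(profiler.durations))
```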
class Profile
dataclass
A profiler for measuring statistics between two events.
class Interval
dataclass
A class for representing a profiled interval.
def to_dict(*, prefix='')
Convert the profile interval to a dictionary.
Source code in src/amltk/profiling/profiler.py
def from_dict(d)
classmethod
Create a profile interval from a dictionary.
PARAMETER | DESCRIPTION |
---|---|
d | The dictionary to create from. |

RETURNS | DESCRIPTION |
---|---|
Interval | The profile interval. |
Source code in src/amltk/profiling/profiler.py
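As a rough illustration of the to_dict / from_dict pair, the sketch below round-trips a small dataclass through a dictionary. `SimpleInterval` and its two fields are hypothetical and do not mirror the real Interval. The sketch also assumes that `prefix` is prepended to every key, which the reference above does not spell out.

```python
from dataclasses import asdict, dataclass


@dataclass
class SimpleInterval:
    """Hypothetical stand-in for a profiled interval."""

    duration: float
    memory_diff: float

    def to_dict(self, *, prefix=""):
        # Assumption: the prefix is prepended to each key
        return {f"{prefix}{key}": value for key, value in asdict(self).items()}

    @classmethod
    def from_dict(cls, d):
        # Expects the unprefixed keys produced by to_dict()
        return cls(duration=d["duration"], memory_diff=d["memory_diff"])


interval = SimpleInterval(duration=0.5, memory_diff=1024.0)

flat = interval.to_dict(prefix="profile:")
restored = SimpleInterval.from_dict(interval.to_dict())

print(flat)
print(restored == interval)
```

A key prefix of this kind is handy when several intervals are flattened into the same row of a table and their keys must not collide.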
def measure(*, memory_unit='B', time_kind='wall')
classmethod
Profile a block of code.
PARAMETER | DESCRIPTION |
---|---|
memory_unit | The unit of memory to use. |
time_kind | The type of timer to use. |

YIELDS | DESCRIPTION |
---|---|
Interval | The Profiler Interval. Memory and Timings will not be valid until the context manager is exited. |
Source code in src/amltk/profiling/profiler.py
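The signature above defaults to time_kind='wall'. Which other timer kinds are accepted is not listed here. The sketch below only illustrates, using standard-library timers rather than amltk, why the choice of timer can matter: a block that mostly waits takes wall-clock time but uses very little CPU time.

```python
import time

wall_start = time.perf_counter()  # wall-clock style timer
cpu_start = time.process_time()   # CPU time consumed by this process

time.sleep(0.05)  # waiting costs wall time but almost no CPU time

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall={wall_elapsed:.3f}s cpu={cpu_elapsed:.3f}s")
```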
def start(memory_unit='B', time_kind='wall')
classmethod
Start a memory tracker.
PARAMETER | DESCRIPTION |
---|---|
memory_unit | The unit of memory to use. |
time_kind | The type of timer to use. |

RETURNS | DESCRIPTION |
---|---|
Profile | The Memory tracker. |
Source code in src/amltk/profiling/profiler.py
class Profiler
dataclass
Profile and record various events.
PARAMETER | DESCRIPTION |
---|---|
memory_unit | The default unit of memory to use. |
time_kind | The default type of timer to use. |
def __getitem__(key)
def __iter__()
def __len__()
def disable()
def enable()
def each(itr, *, name, itr_name=None)
Profile each item in an iterable.
PARAMETER | DESCRIPTION |
---|---|
itr | The iterable to profile. |
name | The name of the profile that lasts until iteration is complete. |
itr_name | The name of the profile for each iteration. If a function is provided, it will be called with each item's index and the item. It should return a string. If |

YIELDS | DESCRIPTION |
---|---|
T | The items. |
Source code in src/amltk/profiling/profiler.py
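Per the parameter description above, a function passed as itr_name receives each item's index and the item, and should return a string. A small sketch of such a naming function follows. `batch_name` and the sample `batches` are made up for this example, and the call to `profiler.each` is shown only as a comment so the snippet runs without amltk installed.

```python
def batch_name(index, item):
    """Example naming function: receives the item's index and the item itself."""
    return f"batch-{index}-size-{len(item)}"


batches = [[1, 2, 3], [4, 5], [6]]

# With a Profiler instance, this would be passed along the lines of:
#   for batch in profiler.each(batches, name="epoch", itr_name=batch_name): ...
names = [batch_name(i, batch) for i, batch in enumerate(batches)]

print(names)
```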
def __call__(name, *, memory_unit=None, time_kind=None)
Profile a block of code. Store the result on this object.
PARAMETER | DESCRIPTION |
---|---|
name | The name of the profile. |
memory_unit | The unit of memory to use. Overwrites the default. |
time_kind | The type of timer to use. Overwrites the default. |
Source code in src/amltk/profiling/profiler.py
def measure(name, *, memory_unit=None, time_kind=None)
Profile a block of code. Store the result on this object.
PARAMETER | DESCRIPTION |
---|---|
name | The name of the profile. |
memory_unit | The unit of memory to use. Overwrites the default. |
time_kind | The type of timer to use. Overwrites the default. |