Trial and Report#
- Trial - typically the output of Optimizer.ask(), indicating what the optimizer would like to evaluate next. We provide a host of convenience methods attached to the Trial to make it easy to save results, store artifacts, and more.
- Trial.Report - the output of a trial.success(cost=...) or trial.fail(cost=...) call, providing an easy way to report back to the optimizer's tell().
Trial#
A Trial encapsulates some configuration that needs to be evaluated. Typically, this is what is generated by an Optimizer.ask() call. Use:

- trial.success() to generate a success Report, typically passing what your chosen optimizer expects, e.g., "loss" or "cost".
- trial.fail() to generate a failure Report. If an exception is passed to fail(), it will be attached to the report along with any traceback it can deduce. Each Optimizer will take care of what to do from here.
```python
from amltk.optimization import Trial, Metric
from amltk.store import PathBucket

cost = Metric("cost", minimize=True)

def target_function(trial: Trial) -> Trial.Report:
    x = trial.config["x"]
    y = trial.config["y"]
    with trial.profile("expensive-calculation"):
        cost = x**2 - y
    return trial.success(cost=cost)

# ... usually obtained from an optimizer
trial = Trial.create(
    name="some-unique-name",
    config={"x": 1, "y": 2},
    metrics=[cost],
)

report = target_function(trial)
print(report.df())
```
What you can return with trial.success() or trial.fail() depends on the metrics of the trial. Typically, an optimizer will provide the trial with the list of metrics it expects to be reported.

Some other important properties of a Trial are a .name unique within the optimization run, a candidate .config to evaluate, a possible .seed to use, and an .info object, which holds optimizer-specific information, should you require it.
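Continuing directly from the example above, here is a minimal sketch of inspecting these properties; the comments about which values end up as None on a hand-created trial are assumptions for illustration.

```python
# Continuing with the `trial` created above via Trial.create(...).
print(trial.name)    # "some-unique-name"
print(trial.config)  # {"x": 1, "y": 2}
print(trial.seed)    # no seed was provided here, so presumably None
print(trial.info)    # optimizer-specific information; presumably None for a hand-made trial
```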
Reporting success (or failure)

When using the success() method, make sure to provide values for all metrics specified in the .metrics attribute. Usually these are set by the optimizer generating the Trial. If you instead report using fail(), any metric not specified will be set to the .worst value of the metric.

Each metric has a unique name, and it's crucial to use the correct names when reporting success, otherwise an error will occur.
Reporting success for metrics
For example:
```python
from amltk.optimization import Trial, Metric

# Gotten from some optimizer usually, i.e. via `optimizer.ask()`
trial = Trial.create(
    name="example_trial",
    config={"param": 42},
    metrics=[Metric(name="accuracy", minimize=False)],
)

# Incorrect usage (will raise an error)
try:
    report = trial.success(invalid_metric=0.95)
except ValueError as error:
    print(error)

# Correct usage
report = trial.success(accuracy=0.95)
```
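As a counterpart, here is a sketch of reporting a failure, continuing from the example above; it assumes, as described earlier, that any metric not passed to fail() is recorded at its .worst value.

```python
# A new trial of the same shape, for which we report a failure instead.
failed_trial = Trial.create(
    name="example_trial_failed",
    config={"param": 42},
    metrics=[Metric(name="accuracy", minimize=False)],
)

# No accuracy is passed, so it should be filled with the metric's .worst value.
failed_report = failed_trial.fail()
print(failed_report.df())
```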
If using Plugins, they may insert some extra objects in the .extra dict.

To profile your trial, you can wrap the logic you'd like to check with trial.profile(), which will automatically profile the block of code for memory before and after, as well as the time taken. If you've profile()'ed any intervals, you can access them by name through trial.profiles. Please see the Profiler for more.
Profiling with a trial.
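A minimal sketch of profiling a block of work; the dict-style lookup on trial.profiles and the standalone Metric used to create the trial are assumptions made purely for illustration.

```python
from amltk.optimization import Trial, Metric

# A standalone trial purely for illustration; normally it comes from optimizer.ask().
trial = Trial.create(
    name="profiling-example",
    config={"n": 10_000},
    metrics=[Metric("cost", minimize=True)],
)

# Time and memory are recorded for the wrapped block.
with trial.profile("expensive-loop"):
    cost = sum(i * i for i in range(trial.config["n"]))

# Profiled intervals are then accessible by name (assuming dict-style access).
print(trial.profiles["expensive-loop"])

report = trial.success(cost=cost)
```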
You can also record anything you'd like into the .summary, a plain dict, or use trial.store() to store artifacts related to the trial.

What to put in .summary?

For large items, e.g. predictions or models, it is highly advised to .store() them to disk, especially if using a Task for multiprocessing. Further, if serializing the report with report.df() (which returns a single row), or a History with history.df() (which returns a dataframe consisting of many reports), then you'd likely only want to put things in the summary that are scalar and can be serialized to disk by a pandas DataFrame.
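A short sketch of both mechanisms; it assumes that Trial.create() accepts a bucket to store artifacts into and that trial.store() takes a mapping of file names to objects, both of which are assumptions about the API made for illustration.

```python
import numpy as np

from amltk.optimization import Trial, Metric
from amltk.store import PathBucket

cost = Metric("cost", minimize=True)

# Assumption: a bucket can be attached at creation time for trial.store() to use.
trial = Trial.create(
    name="summary-example",
    config={"x": 3, "y": 2},
    metrics=[cost],
    bucket=PathBucket("trial-artifacts"),
)

trial.summary["n_features"] = 10  # small, scalar values belong in .summary

# Assumption: store() takes a mapping of file names to objects; large artifacts go to disk.
trial.store({"predictions.npy": np.array([0.1, 0.9, 0.5])})

report = trial.success(cost=trial.config["x"] ** 2 - trial.config["y"])
print(report.df())
```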
Report#
The Trial.Report encapsulates a Trial, its status and any metrics/exceptions that may have occurred.

Typically you will not create these yourself, but instead use trial.success() or trial.fail() to generate them.
```python
from amltk.optimization import Trial, Metric

loss = Metric("loss", minimize=True)

trial = Trial.create(name="trial", config={"x": 1}, metrics=[loss])

with trial.profile("fitting"):
    # Do some work
    # ...
    pass

report = trial.success(loss=1)
print(report.df())
```
These reports are used to report back metrics to an Optimizer with Optimizer.tell(), but they can also be stored for your own uses. You can access the original trial with the .trial attribute, and the Status of the trial with the .status attribute.
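Continuing from the report generated above, a quick sketch of inspecting these attributes; the exact value printed for the status is not shown here, as it depends on the Status enum.

```python
# Continuing from the `report` and `trial` above.
print(report.status)          # the trial's Status, a "success" in this case
print(report.trial is trial)  # True: the original Trial is attached
```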
You may also want to check out the History class for storing a collection of Reports, which makes it easier to convert them to a dataframe or perform some of the common hyperparameter-optimization post-processing of metrics.
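A small sketch of that, assuming History can be constructed directly and reports added with an add() method; the import path and add() call are assumptions, while history.df() is the documented way to get a dataframe of many reports.

```python
from amltk.optimization import History

# Assumption: reports can be accumulated manually with History.add().
history = History()
history.add(report)

# A dataframe with one row per report, as with report.df().
print(history.df())
```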