History#
Basic Usage#
The History class is used to store
Reports from Trials.
In its simplest usage, you can add()
a Report as you receive it and then use the df()
method to get a pandas.DataFrame of the history.
Reference History
from amltk.optimization import Trial, History, Metric

loss = Metric("loss", minimize=True)

def quadratic(x):
    return x**2

history = History()
trials = [
    Trial.create(name=f"trial_{count}", config={"x": i}, metrics=[loss])
    for count, i in enumerate(range(-5, 5))
]

reports = []
for trial in trials:
    x = trial.config["x"]
    report = trial.success(loss=quadratic(x))
    history.add(report)

print(history.df())
          status  trial_seed  ...  metric:loss (minimize)  config:x
name                          ...
trial_0  success        <NA>  ...                      25        -5
trial_1  success        <NA>  ...                      16        -4
trial_2  success        <NA>  ...                       9        -3
trial_3  success        <NA>  ...                       4        -2
trial_4  success        <NA>  ...                       1        -1
trial_5  success        <NA>  ...                       0         0
trial_6  success        <NA>  ...                       1         1
trial_7  success        <NA>  ...                       4         2
trial_8  success        <NA>  ...                       9         3
trial_9  success        <NA>  ...                      16         4

[10 rows x 9 columns]
Typically, to use this inside an optimization run, you would add the reports
in a callback attached to your Tasks. Please
see the optimization guide for more details.
With an Optimizer and Scheduler
from amltk.optimization import Trial, History, Metric
from amltk.optimization.optimizers.smac import SMACOptimizer
from amltk.scheduling import Scheduler
from amltk.pipeline import Searchable

searchable = Searchable("quad", space={"x": (-5, 5)})
n_workers = 2

def quadratic(x):
    return x**2

def target_function(trial: Trial) -> Trial.Report:
    x = trial.config["x"]
    cost = quadratic(x)
    return trial.success(cost=cost)

optimizer = SMACOptimizer.create(space=searchable, metrics=Metric("cost", minimize=True), seed=42)

# The history that the callbacks below add reports to
history = History()

scheduler = Scheduler.with_processes(n_workers)

# The task must wrap the function that takes a Trial and returns a Report
task = scheduler.task(target_function)

@scheduler.on_start(repeat=n_workers)
def launch_trial():
    trial = optimizer.ask()
    task(trial)

@task.on_result
def add_to_history(report):
    history.add(report)

@task.on_done
def launch_another(_):
    trial = optimizer.ask()
    task(trial)

scheduler.run(timeout=3)
Querying#
The History can be queried either
by index or by trial name.
Trial.Report(trial=Trial(name='trial_9', config={'x': 4}, bucket=PathBucket(PosixPath('trial-trial_9-2024-08-13T07:34:54.644221')), metrics=MetricCollection(metrics={'loss': Metric(name='loss', minimize=True, bounds=None, fn=None)}), created_at=datetime.datetime(2024, 8, 13, 7, 34, 54, 644220), seed=None, fidelities={}, summary={}, storage=set(), extras={}), status=<Status.SUCCESS: 'success'>, reported_at=datetime.datetime(2024, 8, 13, 7, 34, 54, 644401), exception=None, values={'loss': 16})
Trial.Report(trial=Trial(name='trial_9', config={'x': 4}, bucket=PathBucket(PosixPath('trial-trial_9-2024-08-13T07:34:54.644221')), metrics=MetricCollection(metrics={'loss': Metric(name='loss', minimize=True, bounds=None, fn=None)}), created_at=datetime.datetime(2024, 8, 13, 7, 34, 54, 644220), seed=None, fidelities={}, summary={}, storage=set(), extras={}), status=<Status.SUCCESS: 'success'>, reported_at=datetime.datetime(2024, 8, 13, 7, 34, 54, 644401), exception=None, values={'loss': 16})
Trial.Report(trial=Trial(name='trial_5', config={'x': 0}, bucket=PathBucket(PosixPath('trial-trial_5-2024-08-13T07:34:54.644019')), metrics=MetricCollection(metrics={'loss': Metric(name='loss', minimize=True, bounds=None, fn=None)}), created_at=datetime.datetime(2024, 8, 13, 7, 34, 54, 644018), seed=None, fidelities={}, summary={}, storage=set(), extras={}), status=<Status.SUCCESS: 'success'>, reported_at=datetime.datetime(2024, 8, 13, 7, 34, 54, 644370), exception=None, values={'loss': 0})
Filtering#
You can filter the history by using the filter()
method. This method takes a Callable[[Trial.Report], bool]
and returns a new History containing only the
Reports for which the given function
returns True.