
Trials


amltk.optimization.trial #

A Trial is typically the output of Optimizer.ask(), indicating what the optimizer would like to evaluate next. We provide a host of convenience methods attached to the Trial to make it easy to save results, store artifacts, and more.

Paired with the Trial is the Trial.Report class, providing an easy way to report back to the optimizer's tell() with a simple trial.success(cost=...) or trial.fail(cost=...) call.
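Concretely, the whole loop usually looks like the sketch below. This is only an outline under the assumption that you already have an optimizer exposing the ask() and tell() methods described above; how that optimizer is constructed depends on the integration you use, and the "x" config key and "cost" metric are made-up examples.

from amltk.optimization import Trial

def run_loop(optimizer, n_trials: int = 10) -> None:
    # Sketch of the ask-evaluate-tell loop; `optimizer` is assumed to be any
    # amltk optimizer exposing ask() and tell() as described above.
    for _ in range(n_trials):
        trial: Trial = optimizer.ask()      # what to evaluate next
        x = trial.config["x"]               # assumes an "x" key in the config
        report = trial.success(cost=x**2)   # assumes a metric named "cost"
        optimizer.tell(report)              # report the result back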

Trial#

amltk.optimization.trial.Trial dataclass #

Trial(
    *,
    name: str,
    config: Mapping[str, Any],
    bucket: PathBucket,
    info: I | None,
    metrics: MetricCollection,
    created_at: datetime,
    seed: int | None = None,
    fidelities: Mapping[str, Any],
    profiler: Profiler,
    summary: MutableMapping[str, Any],
    storage: set[Any],
    extras: MutableMapping[str, Any]
)

Bases: RichRenderable, Generic[I]

A Trial encapsulates some configuration that needs to be evaluated. Typically, this is what is generated by an Optimizer.ask() call.

Usage

If all went smoothly, your trial was successful and you can use trial.success() to generate a success Report, typically passing what your chosen optimizer expects, e.g., "loss" or "cost".

If your trial failed, you can instead use trial.fail() to generate a failure Report. If you pass an exception to fail(), it will be attached to the report along with any traceback it can deduce. Each Optimizer will take care of what to do from here.

from amltk.optimization import Trial, Metric
from amltk.store import PathBucket

cost = Metric("cost", minimize=True)

def target_function(trial: Trial) -> Trial.Report:
    x = trial.config["x"]
    y = trial.config["y"]

    with trial.profile("expensive-calculation"):
        cost = x**2 - y

    return trial.success(cost=cost)

# ... usually obtained from an optimizer
trial = Trial.create(
    name="some-unique-name",
    config={"x": 1, "y": 2},
    metrics=[cost]
)

report = target_function(trial)
print(report.df())

                   status  ... profile:expensive-calculation:time:unit
name                       ...
some-unique-name  success  ...                                  seconds

[1 rows x 22 columns]

What you can return with trial.success() or trial.fail() depends on the metrics of the trial. Typically, an optimizer will provide the trial with the list of metrics.
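For example, if a trial was created with several metrics, a successful report has to include a value for each of them. A minimal sketch (the metric names and values here are arbitrary examples):

from amltk.optimization import Trial, Metric

accuracy = Metric("accuracy", minimize=False, bounds=(0, 1))
inference_time = Metric("inference_time", minimize=True)

# Usually the optimizer creates this for you
trial = Trial.create(
    name="multi-metric-trial",
    config={"n_estimators": 100},
    metrics=[accuracy, inference_time],
)

# Every metric of the trial must be reported on success
report = trial.success(accuracy=0.92, inference_time=0.35)
print(report.values)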

Metrics

amltk.optimization.metric.Metric dataclass #

Metric(
    name: str,
    *,
    minimize: bool = True,
    bounds: tuple[float, float] | None = None,
    fn: Callable[P, float] | None = None
)

Bases: Generic[P]

A metric with a given name, optimal direction, and possible bounds.

Back to the Trial itself: some important properties are that it has a unique .name for the optimization run, a candidate .config to evaluate, a possible .seed to use, and an .info object, which holds optimizer-specific information, if required by you.
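To make the Metric fields concrete, here is a small sketch constructing metrics with and without bounds, using only the documented fields (name, minimize, bounds):

from amltk.optimization import Metric

# A score to maximize, bounded between 0 and 1
accuracy = Metric("accuracy", minimize=False, bounds=(0.0, 1.0))

# A loss to minimize, with no known bounds
loss = Metric("loss", minimize=True)

print(accuracy.name, accuracy.minimize, accuracy.bounds)
print(loss.name, loss.minimize, loss.bounds)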

Reporting success (or failure)

When using the success() method, make sure to provide values for all metrics specified in the .metrics attribute. Usually these are set by the optimizer generating the Trial.

If you instead report using fail(), any metric not specified will be set to the .worst value of the metric.

Each metric has a unique name, and it's crucial to use the correct names when reporting success; otherwise an error will occur.

Reporting success for metrics

For example:

from amltk.optimization import Trial, Metric

# Gotten from some optimizer usually, i.e. via `optimizer.ask()`
trial = Trial.create(
    name="example_trial",
    config={"param": 42},
    metrics=[Metric(name="accuracy", minimize=False)]
)

# Incorrect usage (will raise an error)
try:
    report = trial.success(invalid_metric=0.95)
except ValueError as error:
    print(error)

# Correct usage
report = trial.success(accuracy=0.95)
 Please provide a value for the metric 'accuracy' as  this is one of the metrics of the trial. 
 Try `trial.success(accuracy=value, ...)`.

If using Plugins, they may insert some extra objects in the .extras dict.
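You can also attach items to that dict yourself with attach_extra(), documented further below. A minimal sketch (the name and item here are arbitrary examples):

from amltk.optimization import Trial

trial = Trial.create(name="extras-example", config={})

# attach_extra() stores the item under the given name in trial.extras
trial.attach_extra("my_plugin_state", {"seed": 42})
print(trial.extras)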

To profile your trial, you can wrap the logic you'd like to check with trial.profile(), which will automatically profile the block of code, recording memory before and after as well as the time taken.

If you've profile()'ed any intervals, you can access them by name through trial.profiles. Please see the Profiler for more.

Profiling with a trial.

profile
from amltk.optimization import Trial

trial = Trial.create(name="some-unique-name", config={})

# ... somewhere where you've begun your trial.
with trial.profile("some_interval"):
    for work in range(100):
        pass

print(trial.profiler.df())
               memory:start_vms  memory:end_vms  ...  time:kind  time:unit
some_interval      2.078061e+09      2078060544  ...       wall    seconds

[1 rows x 12 columns]

You can also record anything you'd like into the .summary, a plain dict, or use trial.store() to store artifacts related to the trial.

What to put in .summary?

For large items, e.g. predictions or models, it is highly advised to .store() them to disk, especially if using a Task for multiprocessing.

Further, if serializing the report with report.df() (which returns a single row), or a History of many reports with history.df(), you'd likely only want to keep things that are scalar and can be serialised to disk by a pandas DataFrame.
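As a rough sketch of that split between small summary values and larger stored artifacts (the file names, keys and values below are arbitrary examples):

from amltk.optimization import Trial, Metric
from amltk.store import PathBucket

accuracy = Metric("accuracy", minimize=False)
trial = Trial.create(
    name="summary-vs-store",
    config={"alpha": 0.1},
    metrics=[accuracy],
    bucket=PathBucket("example-results"),
)

# Small scalar values go in the summary; they become columns of report.df()
trial.summary["n_features"] = 20

# Larger artifacts go to disk via store(); only the key is kept in trial.storage
trial.store({"config.json": trial.config})

report = trial.success(accuracy=0.9)
print(report.df().columns.tolist())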

_repr_html_ #

_repr_html_() -> str

Return an HTML representation of the object.

Source code in src/amltk/_richutil/renderable.py
def _repr_html_(self) -> str:
    """Return an HTML representation of the object."""
    return self._repr_pretty_()

_repr_pretty_ #

_repr_pretty_(*_: Any, **__: Any) -> str

Representation for rich printing.

Source code in src/amltk/_richutil/renderable.py
def _repr_pretty_(self, *_: Any, **__: Any) -> str:
    """Representation for rich printing."""
    from io import StringIO

    import rich

    with closing(StringIO()) as buffer:
        rich.print(self.__rich__(), file=buffer)
        return buffer.getvalue()
Report#

amltk.optimization.trial.Trial.Report dataclass #

Report(
    trial: Trial[I2],
    status: Status,
    reported_at: datetime = datetime.now(),
    exception: BaseException | None = None,
    traceback: str | None = None,
    values: Mapping[str, float] = dict(),
)

Bases: RichRenderable, Generic[I2]

The Trial.Report encapsulates a Trial, its status and any metrics/exceptions that may have occurred.

Typically you will not create these yourself, but instead use trial.success() or trial.fail() to generate them.

from amltk.optimization import Trial, Metric

loss = Metric("loss", minimize=True)

trial = Trial.create(name="trial", config={"x": 1}, metrics=[loss])

with trial.profile("fitting"):
    # Do some work
    # ...
    report = trial.success(loss=1)

print(report.df())
        status  trial_seed  ... profile:fitting:time:kind profile:fitting:time:unit
name                        ...                                                    
trial  success        <NA>  ...                      wall                   seconds

[1 rows x 21 columns]

These reports are used to report back metrics to an Optimizer with Optimizer.tell() but can also be stored for your own uses.

You can access the original trial with the .trial attribute, and the Status of the trial with the .status attribute.
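Continuing in the same vein as the example above, a short sketch of inspecting a report:

from amltk.optimization import Trial, Metric

loss = Metric("loss", minimize=True)
trial = Trial.create(name="inspect-report", config={"x": 1}, metrics=[loss])
report = trial.success(loss=1)

print(report.status)      # the Trial.Status of the report, e.g. SUCCESS
print(report.trial.name)  # the original Trial is available as .trial
print(report.values)      # the reported metric values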

You may also want to check out the History class for storing a collection of Reports, which makes it easier to convert them to a dataframe or perform common hyperparameter-optimization analysis of the metrics.

_repr_html_ #

_repr_html_() -> str

Return an HTML representation of the object.

Source code in src/amltk/_richutil/renderable.py
def _repr_html_(self) -> str:
    """Return an HTML representation of the object."""
    return self._repr_pretty_()

_repr_pretty_ #

_repr_pretty_(*_: Any, **__: Any) -> str

Representation for rich printing.

Source code in src/amltk/_richutil/renderable.py
def _repr_pretty_(self, *_: Any, **__: Any) -> str:
    """Representation for rich printing."""
    from io import StringIO

    import rich

    with closing(StringIO()) as buffer:
        rich.print(self.__rich__(), file=buffer)
        return buffer.getvalue()

Trial dataclass #

Trial(
    *,
    name: str,
    config: Mapping[str, Any],
    bucket: PathBucket,
    info: I | None,
    metrics: MetricCollection,
    created_at: datetime,
    seed: int | None = None,
    fidelities: Mapping[str, Any],
    profiler: Profiler,
    summary: MutableMapping[str, Any],
    storage: set[Any],
    extras: MutableMapping[str, Any]
)

Bases: RichRenderable, Generic[I]

A Trial encapsulates some configuration that needs to be evaluated. Typically, this is what is generated by an Optimizer.ask() call; see the usage notes at the top of this page for examples.

bucket instance-attribute #

bucket: PathBucket

The bucket to store trial related output to.

config instance-attribute #

config: Mapping[str, Any]

The config of the trial provided by the optimizer.

created_at instance-attribute #

created_at: datetime

When the trial was created.

extras instance-attribute #

extras: MutableMapping[str, Any]

Any extras attached to the trial.

fidelities instance-attribute #

fidelities: Mapping[str, Any]

The fidelities at which to evaluate the trial, if any.

info class-attribute instance-attribute #

info: I | None = field(repr=False)

The info of the trial provided by the optimizer.

metrics instance-attribute #

metrics: MetricCollection

The metrics associated with the trial.

You can access the metrics by name, e.g. trial.metrics["loss"].

name instance-attribute #

name: str

The unique name of the trial.

profiler class-attribute instance-attribute #

profiler: Profiler = field(repr=False)

A profiler for this trial.

profiles property #

profiles: Mapping[str, Interval]

The profiles of the trial.

These are indexed by the name of the profile indicated by:

with trial.profile("key_to_index"):
    # ...

profile = trial.profiles["key_to_index"]

The values are a Profile.Interval, which contain a Memory.Interval and a Timer.Interval. Please see the respective documentation for more.
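A small sketch of reading an interval back after profiling; the .time sub-interval appears in the examples on this page, while the exact attribute name of the memory sub-interval is assumed here:

from amltk.optimization import Trial

trial = Trial.create(name="profile-access", config={})

with trial.profile("work"):
    total = sum(i * i for i in range(10_000))

interval = trial.profiles["work"]
print(interval.time)    # a Timer.Interval with start/end/kind/unit
print(interval.memory)  # a Memory.Interval (attribute name assumed)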

seed class-attribute instance-attribute #

seed: int | None = None

The seed to use if suggested by the optimizer.

storage instance-attribute #

storage: set[Any]

Anything stored in the trial. The elements of this set are keys that can be used to retrieve the stored items later, such as a Path.

summary instance-attribute #

summary: MutableMapping[str, Any]

The summary of the trial. These are for summary statistics of a trial and are single values.

Report dataclass #

Report(
    trial: Trial[I2],
    status: Status,
    reported_at: datetime = datetime.now(),
    exception: BaseException | None = None,
    traceback: str | None = None,
    values: Mapping[str, float] = dict(),
)

Bases: RichRenderable, Generic[I2]

The Trial.Report encapsulates a Trial, its status and any metrics/exceptions that may have occurred; see the description under Report# above for examples.

bucket property #
bucket: PathBucket

The bucket attached to the trial.

config property #
config: Mapping[str, Any]

The config of the trial.

exception class-attribute instance-attribute #
exception: BaseException | None = None

The exception reported if any.

info property #
info: I2 | None

The info of the trial, specific to the optimizer that issued it.

metrics property #
metrics: MetricCollection

The metrics of the trial.

name property #
name: str

The name of the trial.

profiles property #
profiles: Mapping[str, Interval]

The profiles of the trial.

reported_at class-attribute instance-attribute #
reported_at: datetime = field(default_factory=now)

When this Report was generated.

This will primarily be None if there was no corresponding key when loading this report from a serialized form, such as with from_df() or from_dict().

status instance-attribute #
status: Status

The status of the trial.

storage property #
storage: set[str]

The storage of the trial.

summary property #
summary: MutableMapping[str, Any]

The summary of the trial.

traceback class-attribute instance-attribute #
traceback: str | None = field(repr=False, default=None)

The traceback reported if any.

trial instance-attribute #
trial: Trial[I2]

The trial that was run.

values class-attribute instance-attribute #
values: Mapping[str, float] = field(default_factory=dict)

The reported metric values of the trial.

df #
df(
    *,
    profiles: bool = True,
    configs: bool = True,
    summary: bool = True,
    metrics: bool = True
) -> DataFrame

Get a dataframe of the trial.

Prefixes

  • summary: Entries will be prefixed with "summary:"
  • config: Entries will be prefixed with "config:"
  • storage: Entries will be prefixed with "storage:"
  • metrics: Entries will be prefixed with "metrics:"
  • profile:<name>: Entries will be prefixed with "profile:<name>:"
Parameters:

  • profiles (bool, default: True): Whether to include the profiles.
  • configs (bool, default: True): Whether to include the configs.
  • summary (bool, default: True): Whether to include the summary.
  • metrics (bool, default: True): Whether to include the metrics.

Source code in src/amltk/optimization/trial.py
def df(
    self,
    *,
    profiles: bool = True,
    configs: bool = True,
    summary: bool = True,
    metrics: bool = True,
) -> pd.DataFrame:
    """Get a dataframe of the trial.

    !!! note "Prefixes"

        * `summary`: Entries will be prefixed with `#!python "summary:"`
        * `config`: Entries will be prefixed with `#!python "config:"`
        * `storage`: Entries will be prefixed with `#!python "storage:"`
        * `metrics`: Entries will be prefixed with `#!python "metrics:"`
        * `profile:<name>`: Entries will be prefixed with
            `#!python "profile:<name>:"`

    Args:
        profiles: Whether to include the profiles.
        configs: Whether to include the configs.
        summary: Whether to include the summary.
        metrics: Whether to include the metrics.
    """
    items = {
        "name": self.name,
        "status": str(self.status),
        "trial_seed": self.trial.seed if self.trial.seed else np.nan,
        "exception": str(self.exception) if self.exception else "NA",
        "traceback": str(self.traceback) if self.traceback else "NA",
        "bucket": str(self.bucket.path),
        "created_at": self.trial.created_at,
        "reported_at": self.reported_at,
    }
    if metrics:
        for metric_name, value in self.values.items():
            metric_def = self.metrics[metric_name]
            items[f"metric:{metric_def}"] = value
    if summary:
        items.update(**prefix_keys(self.trial.summary, "summary:"))
    if configs:
        items.update(**prefix_keys(self.trial.config, "config:"))
    if profiles:
        for name, profile in sorted(self.profiles.items(), key=lambda x: x[0]):
            items.update(profile.to_dict(prefix=f"profile:{name}"))

    return pd.DataFrame(items, index=[0]).convert_dtypes().set_index("name")
from_df classmethod #
from_df(df: DataFrame | Series) -> Report

Create a report from a dataframe.

See Also
Source code in src/amltk/optimization/trial.py
@classmethod
def from_df(cls, df: pd.DataFrame | pd.Series) -> Trial.Report:
    """Create a report from a dataframe.

    See Also:
        * [`.from_dict()`][amltk.optimization.Trial.Report.from_dict]
    """
    if isinstance(df, pd.DataFrame):
        if len(df) != 1:
            raise ValueError(
                f"Expected a dataframe with one row, got {len(df)} rows.",
            )
        series = df.iloc[0]
    else:
        series = df

    data_dict = {"name": series.name, **series.to_dict()}
    return cls.from_dict(data_dict)
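A quick round-trip sketch, serializing a report to a dataframe and rebuilding it; the metric name and values are arbitrary examples, and the class is accessed as Trial.Report, as it is named throughout this page:

from amltk.optimization import Trial, Metric

loss = Metric("loss", minimize=True)
trial = Trial.create(name="roundtrip", config={"x": 2}, metrics=[loss])
report = trial.success(loss=4.0)

df = report.df()                     # a single row, indexed by the trial name
restored = Trial.Report.from_df(df)  # rebuild a Report from that row
print(restored.status, restored.values)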
from_dict classmethod #
from_dict(d: Mapping[str, Any]) -> Report

Create a report from a dictionary.

Prefixes

Please see .df() for information on what the prefixes should be for certain fields.

Parameters:

  • d (Mapping[str, Any]): The dictionary to create the report from.

Returns:

  • Report: The created report.

Source code in src/amltk/optimization/trial.py
@classmethod
def from_dict(cls, d: Mapping[str, Any]) -> Trial.Report:
    """Create a report from a dictionary.

    !!! note "Prefixes"

        Please see [`.df()`][amltk.optimization.Trial.Report.df]
        for information on what the prefixes should be for certain fields.

    Args:
        d: The dictionary to create the report from.

    Returns:
        The created report.
    """
    prof_dict = mapping_select(d, "profile:")
    if any(prof_dict):
        profile_names = sorted(
            {name.rsplit(":", maxsplit=2)[0] for name in prof_dict},
        )
        profiles = {
            name: Profile.from_dict(mapping_select(prof_dict, f"{name}:"))
            for name in profile_names
        }
    else:
        profiles = {}

    # NOTE: We assume the order of the objectives are in the right
    # order in the dict. If we attempt to force a sort-order, we may
    # deserialize incorrectly. By not having a sort order, we rely
    # on serialization to keep the order, which is not ideal either.
    # May revisit this if we need to
    raw_metrics: dict[str, float] = mapping_select(d, "metric:")
    metrics: dict[Metric, float | None] = {
        Metric.from_str(name): value for name, value in raw_metrics.items()
    }

    exception = d.get("exception")
    traceback = d.get("traceback")
    trial_seed = d.get("trial_seed")
    if pd.isna(exception) or exception == "NA":  # type: ignore
        exception = None
    if pd.isna(traceback) or traceback == "NA":  # type: ignore
        traceback = None
    if pd.isna(trial_seed):  # type: ignore
        trial_seed = None

    if (_bucket := d.get("bucket")) is not None:
        bucket = PathBucket(_bucket)
    else:
        bucket = PathBucket(f"uknown_trial_bucket-{datetime.now().isoformat()}")

    created_at_timestamp = d.get("created_at")
    if created_at_timestamp is None:
        raise ValueError(
            "Cannot load report from dict without a 'created_at' field.",
        )
    created_at = parse_timestamp_object(created_at_timestamp)

    trial: Trial = Trial.create(
        name=d["name"],
        config=mapping_select(d, "config:"),
        info=None,  # We don't save this to disk so we load it back as None
        bucket=bucket,
        seed=trial_seed,
        fidelities=mapping_select(d, "fidelities:"),
        profiler=Profiler(profiles=profiles),
        metrics=metrics.keys(),
        created_at=created_at,
        summary=mapping_select(d, "summary:"),
        storage=set(mapping_select(d, "storage:").values()),
        extras=mapping_select(d, "extras:"),
    )
    _values: dict[str, float] = {
        m.name: v
        for m, v in metrics.items()
        if (v is not None and not pd.isna(v))
    }

    status = Trial.Status(dict_get_not_none(d, "status", "unknown"))
    match status:
        case Trial.Status.SUCCESS:
            report = trial.success(**_values)
        case Trial.Status.FAIL:
            exc = Exception(exception) if exception else None
            tb = str(traceback) if traceback else None
            report = trial.fail(exc, tb, **_values)
        case Trial.Status.CRASHED:
            exc = Exception(exception) if exception else Exception("Unknown")
            tb = str(traceback) if traceback else None
            report = trial.crashed(exc, tb)
        case Trial.Status.UNKNOWN | _:
            report = trial.crashed(exception=Exception("Unknown status."))

    timestamp = d.get("reported_at")
    if timestamp is None:
        raise ValueError(
            "Cannot load report from dict without a 'reported_at' field.",
        )
    report.reported_at = parse_timestamp_object(timestamp)

    return report
retrieve #
retrieve(
    key: str, *, check: type[R] | None = None
) -> R | Any

Retrieve items related to the trial.

retrieve-bucket
from amltk.optimization import Trial
from amltk.store import PathBucket

bucket = PathBucket("results")

trial = Trial.create(name="trial", config={"x": 1}, bucket=bucket)

trial.store({"config.json": trial.config})
report = trial.success()

config = report.retrieve("config.json")
print(config)
{'x': 1}
Parameters:

  • key (str): The key of the item to retrieve as said in .storage.
  • check (type[R] | None, default: None): If provided, will check that the retrieved item is of the provided type. If not, will raise a TypeError.

Returns:

  • R | Any: The retrieved item.

Raises:

  • TypeError: If check= is provided and the retrieved item is not of the provided type.

Source code in src/amltk/optimization/trial.py
def retrieve(self, key: str, *, check: type[R] | None = None) -> R | Any:
    """Retrieve items related to the trial.

    ```python exec="true" source="material-block" result="python" title="retrieve-bucket" hl_lines="11"

    from amltk.optimization import Trial
    from amltk.store import PathBucket

    bucket = PathBucket("results")

    trial = Trial.create(name="trial", config={"x": 1}, bucket=bucket)

    trial.store({"config.json": trial.config})
    report = trial.success()

    config = report.retrieve("config.json")
    print(config)
    trial.bucket.rmdir()  # markdown-exec: hide
    ```

    Args:
        key: The key of the item to retrieve as said in `.storage`.
        check: If provided, will check that the retrieved item is of the
            provided type. If not, will raise a `TypeError`.

    Returns:
        The retrieved item.

    Raises:
        TypeError: If `check=` is provided and  the retrieved item is not of the provided
            type.
    """  # noqa: E501
    return self.trial.retrieve(key, check=check)
rich_renderables #
rich_renderables() -> Iterable[RenderableType]

The renderables for rich for this report.

Source code in src/amltk/optimization/trial.py
def rich_renderables(self) -> Iterable[RenderableType]:
    """The renderables for rich for this report."""
    from rich.pretty import Pretty
    from rich.text import Text

    yield Text.assemble(
        ("Status", "bold"),
        ("(", "default"),
        self.status.__rich__(),
        (")", "default"),
    )
    yield Pretty(self.metrics)
    yield from self.trial.rich_renderables()
store #
store(items: Mapping[str, T]) -> None

Store items related to the trial.

See Also
Source code in src/amltk/optimization/trial.py
def store(self, items: Mapping[str, T]) -> None:
    """Store items related to the trial.

    See Also:
        * [`Trial.store()`][amltk.optimization.trial.Trial.store]
    """
    self.trial.store(items)

Status #

Bases: str, Enum

The status of a trial.

CRASHED class-attribute instance-attribute #
CRASHED = 'crashed'

The trial crashed.

FAIL class-attribute instance-attribute #
FAIL = 'fail'

The trial failed.

SUCCESS class-attribute instance-attribute #
SUCCESS = 'success'

The trial was successful.

UNKNOWN class-attribute instance-attribute #
UNKNOWN = 'unknown'

The status of the trial is unknown.
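Since a Report carries its Status, the outcome of a trial can be checked directly against these members; a minimal sketch:

from amltk.optimization import Trial, Metric

loss = Metric("loss", minimize=True)
trial = Trial.create(name="status-check", config={"x": 1}, metrics=[loss])
report = trial.fail(ValueError("something went wrong"))

if report.status is Trial.Status.SUCCESS:
    print("succeeded with", report.values)
elif report.status is Trial.Status.FAIL:
    print("failed with", report.exception)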

attach_extra #

attach_extra(name: str, plugin_item: Any) -> None

Attach a plugin item to the trial.

Parameters:

  • name (str): The name of the plugin item.
  • plugin_item (Any): The plugin item.

Source code in src/amltk/optimization/trial.py
def attach_extra(self, name: str, plugin_item: Any) -> None:
    """Attach a plugin item to the trial.

    Args:
        name: The name of the plugin item.
        plugin_item: The plugin item.
    """
    self.extras[name] = plugin_item

copy #

copy() -> Self

Create a copy of the trial.

Returns:

  • Self: The copy of the trial.

Source code in src/amltk/optimization/trial.py
def copy(self) -> Self:
    """Create a copy of the trial.

    Returns:
        The copy of the trial.
    """
    return copy.deepcopy(self)

crashed #

crashed(
    exception: Exception, traceback: str | None = None
) -> Report[I]

Generate a crash report.

Note

You will typically not create these manually; instead, if we don't receive a report from a target function evaluation, but only an error, we assume something crashed and generate a crash report for you.

Non-specified metrics

We will use the .metrics to determine the .worst value of each metric, using those as the reported metric values.

Parameters:

  • exception (Exception): The exception that caused the crash. If not provided, the exception will be taken from the trial. If this is still None, a RuntimeError will be raised.
  • traceback (str | None, default: None): The traceback of the exception. If not provided, the traceback will be taken from the trial if there is one there.

Returns:

  • Report[I]: The report of the trial.

Source code in src/amltk/optimization/trial.py
def crashed(
    self,
    exception: Exception,
    traceback: str | None = None,
) -> Trial.Report[I]:
    """Generate a crash report.

    !!! note

        You will typically not create these manually, but instead if we don't
        recieve a report from a target function evaluation, but only an error,
        we assume something crashed and generate a crash report for you.

    !!! note "Non specifed metrics"

        We will use the [`.metrics`][amltk.optimization.Trial.metrics] to determine
        the [`.worst`][amltk.optimization.Metric.worst] value of the metric,
        using that as the reported metrics

    Args:
        exception: The exception that caused the crash. If not provided, the
            exception will be taken from the trial. If this is still `None`,
            a `RuntimeError` will be raised.
        traceback: The traceback of the exception. If not provided, the
            traceback will be taken from the trial if there is one there.

    Returns:
        The report of the trial.
    """
    if traceback is None:
        traceback = "".join(traceback_module.format_tb(exception.__traceback__))

    return Trial.Report(
        trial=self,
        status=Trial.Status.CRASHED,
        exception=exception,
        traceback=traceback,
    )

create classmethod #

create(
    name: str,
    config: Mapping[str, Any] | None = None,
    *,
    metrics: (
        Metric
        | Iterable[Metric]
        | Mapping[str, Metric]
        | None
    ) = None,
    info: I | None = None,
    seed: int | None = None,
    fidelities: Mapping[str, Any] | None = None,
    created_at: datetime | None = None,
    profiler: Profiler | None = None,
    bucket: str | Path | PathBucket | None = None,
    summary: MutableMapping[str, Any] | None = None,
    storage: set[Hashable] | None = None,
    extras: MutableMapping[str, Any] | None = None
) -> Trial[I]

Create a trial.

Parameters:

  • name (str): The name of the trial.
  • metrics (Metric | Iterable[Metric] | Mapping[str, Metric] | None, default: None): The metrics of the trial.
  • config (Mapping[str, Any] | None, default: None): The config of the trial.
  • info (I | None, default: None): The info of the trial.
  • seed (int | None, default: None): The seed of the trial.
  • fidelities (Mapping[str, Any] | None, default: None): The fidelities of the trial.
  • bucket (str | Path | PathBucket | None, default: None): The bucket of the trial.
  • created_at (datetime | None, default: None): When the trial was created.
  • profiler (Profiler | None, default: None): The profiler of the trial.
  • summary (MutableMapping[str, Any] | None, default: None): The summary of the trial.
  • storage (set[Hashable] | None, default: None): The storage of the trial.
  • extras (MutableMapping[str, Any] | None, default: None): The extras of the trial.

Returns:

  • Trial[I]: The trial.

Source code in src/amltk/optimization/trial.py
@classmethod
def create(  # noqa: PLR0913
    cls,
    name: str,
    config: Mapping[str, Any] | None = None,
    *,
    metrics: Metric | Iterable[Metric] | Mapping[str, Metric] | None = None,
    info: I | None = None,
    seed: int | None = None,
    fidelities: Mapping[str, Any] | None = None,
    created_at: datetime | None = None,
    profiler: Profiler | None = None,
    bucket: str | Path | PathBucket | None = None,
    summary: MutableMapping[str, Any] | None = None,
    storage: set[Hashable] | None = None,
    extras: MutableMapping[str, Any] | None = None,
) -> Trial[I]:
    """Create a trial.

    Args:
        name: The name of the trial.
        metrics: The metrics of the trial.
        config: The config of the trial.
        info: The info of the trial.
        seed: The seed of the trial.
        fidelities: The fidelities of the trial.
        bucket: The bucket of the trial.
        created_at: When the trial was created.
        profiler: The profiler of the trial.
        summary: The summary of the trial.
        storage: The storage of the trial.
        extras: The extras of the trial.

    Returns:
        The trial.
    """
    return Trial(
        name=name,
        metrics=(
            MetricCollection.from_collection(metrics)
            if metrics is not None
            else MetricCollection()
        ),
        profiler=(
            profiler
            if profiler is not None
            else Profiler(memory_unit="B", time_kind="wall")
        ),
        config=config if config is not None else {},
        info=info,
        seed=seed,
        created_at=created_at if created_at is not None else datetime.now(),
        fidelities=fidelities if fidelities is not None else {},
        bucket=(
            bucket
            if isinstance(bucket, PathBucket)
            else (
                PathBucket(bucket)
                if bucket is not None
                else PathBucket(f"trial-{name}-{datetime.now().isoformat()}")
            )
        ),
        summary=summary if summary is not None else {},
        storage=storage if storage is not None else set(),
        extras=extras if extras is not None else {},
    )

delete_from_storage #

delete_from_storage(
    items: Iterable[str],
) -> dict[str, bool]

Delete items related to the trial.

delete-storage
from amltk.optimization import Trial
from amltk.store import PathBucket

bucket = PathBucket("results")
trial = Trial.create(name="trial", config={"x": 1}, info={}, bucket=bucket)

trial.store({"config.json": trial.config})
trial.delete_from_storage(items=["config.json"])

print(trial.storage)
set()
Parameters:

  • items (Iterable[str]): The items to delete, an iterable of keys.

Returns:

  • dict[str, bool]: A dict from the key to whether it was deleted or not.

Source code in src/amltk/optimization/trial.py
def delete_from_storage(self, items: Iterable[str]) -> dict[str, bool]:
    """Delete items related to the trial.

    ```python exec="true" source="material-block" result="python" title="delete-storage" hl_lines="6"
    from amltk.optimization import Trial
    from amltk.store import PathBucket

    bucket = PathBucket("results")
    trial = Trial.create(name="trial", config={"x": 1}, info={}, bucket=bucket)

    trial.store({"config.json": trial.config})
    trial.delete_from_storage(items=["config.json"])

    print(trial.storage)
    trial.bucket.rmdir()  # markdown-exec: hide
    ```

    Args:
        items: The items to delete, an iterable of keys

    Returns:
        A dict from the key to whether it was deleted or not.
    """  # noqa: E501
    # If not a Callable, we convert to a path bucket
    removed = self.bucket.remove(items)
    self.storage.difference_update(items)
    return removed

dump_exception #

dump_exception(
    exception: BaseException, *, name: str | None = None
) -> None

Dump an exception to the trial.

Parameters:

  • exception (BaseException): The exception to dump.
  • name (str | None, default: None): The name of the file to dump to. If None, will be "exception".

Source code in src/amltk/optimization/trial.py
def dump_exception(
    self,
    exception: BaseException,
    *,
    name: str | None = None,
) -> None:
    """Dump an exception to the trial.

    Args:
        exception: The exception to dump.
        name: The name of the file to dump to. If `None`, will be `"exception"`.
    """
    fname = name if name is not None else "exception"
    traceback = "".join(traceback_module.format_tb(exception.__traceback__))
    msg = f"{traceback}\n{exception.__class__.__name__}: {exception}"
    self.store({f"{fname}.txt": msg})

fail #

fail(
    exception: Exception | None = None,
    traceback: str | None = None,
    /,
    **metrics: float | int,
) -> Report[I]

Generate a failure report.

Non-specified metrics

If you do not specify metrics, this will use the .metrics to determine the .worst value of each metric, using that as the reported result.

fail
from amltk.optimization import Trial, Metric

loss = Metric("loss", minimize=True, bounds=(0, 1_000))
trial = Trial.create(name="trial", config={"x": 1}, metrics=[loss])

try:
    raise ValueError("This is an error")  # Something went wrong
except Exception as error:
    report = trial.fail(error)

print(report.values)
print(report)
{}
Trial.Report(trial=Trial(name='trial', config={'x': 1}, bucket=PathBucket(PosixPath('trial-trial-2024-04-24T14:12:08.904852')), metrics=MetricCollection(metrics={'loss': Metric(name='loss', minimize=True, bounds=(0.0, 1000.0), fn=None)}), created_at=datetime.datetime(2024, 4, 24, 14, 12, 8, 904849), seed=None, fidelities={}, summary={}, storage=set(), extras={}), status=<Status.FAIL: 'fail'>, reported_at=datetime.datetime(2024, 4, 24, 14, 12, 8, 905024), exception=ValueError('This is an error'), values={})
Returns:

  • Report[I]: The result of the trial.

Source code in src/amltk/optimization/trial.py
def fail(
    self,
    exception: Exception | None = None,
    traceback: str | None = None,
    /,
    **metrics: float | int,
) -> Trial.Report[I]:
    """Generate a failure report.

    !!! note "Non specifed metrics"

        If you do not specify metrics, this will use
        the [`.metrics`][amltk.optimization.Trial.metrics] to determine
        the [`.worst`][amltk.optimization.Metric.worst] value of the metric,
        using that as the reported result

    ```python exec="true" source="material-block" result="python" title="fail"
    from amltk.optimization import Trial, Metric

    loss = Metric("loss", minimize=True, bounds=(0, 1_000))
    trial = Trial.create(name="trial", config={"x": 1}, metrics=[loss])

    try:
        raise ValueError("This is an error")  # Something went wrong
    except Exception as error:
        report = trial.fail(error)

    print(report.values)
    print(report)
    trial.bucket.rmdir()  # markdown-exec: hide
    ```

    Returns:
        The result of the trial.
    """
    if exception is not None and traceback is None:
        traceback = traceback_module.format_exc()

    # Need to check if anything extra was reported!
    extra = set(metrics.keys()) - self.metrics.keys()
    if extra:
        raise ValueError(
            f"Cannot report `fail()` with extra metrics: {extra=}."
            f"\nOnly metrics {list(self.metrics)} as these are the metrics"
            " provided for this trial."
            "\nTo record other numerics, use `trial.summary` instead.",
        )

    return Trial.Report(
        trial=self,
        status=Trial.Status.FAIL,
        exception=exception,
        traceback=traceback,
        values=metrics,
    )

profile #

profile(
    name: str,
    *,
    time: (
        Kind | Literal["wall", "cpu", "process"] | None
    ) = None,
    memory_unit: (
        Unit | Literal["B", "KB", "MB", "GB"] | None
    ) = None,
    summary: bool = False
) -> Iterator[None]

Measure some interval in the trial.

The results of the profiling will be available in the .summary attribute with the name of the interval as the key.

profile
from amltk.optimization import Trial
import time

trial = Trial.create(name="trial", config={"x": 1})

with trial.profile("some_interval"):
    # Do some work
    time.sleep(1)

print(trial.profiler["some_interval"].time)
Timer.Interval(start=1713967928.9224126, end=1713967929.923518, kind=wall, unit=seconds)
Parameters:

  • name (str): The name of the interval.
  • time (Kind | Literal['wall', 'cpu', 'process'] | None, default: None): The timer kind to use for the trial. Defaults to the default timer kind of the profiler.
  • memory_unit (Unit | Literal['B', 'KB', 'MB', 'GB'] | None, default: None): The memory unit to use for the trial. Defaults to the default memory unit of the profiler.
  • summary (bool, default: False): Whether to add the interval to the summary.

Yields:

  • Iterator[None]: The interval measured. Values will be nan until the with block is finished.

Source code in src/amltk/optimization/trial.py
@contextmanager
def profile(
    self,
    name: str,
    *,
    time: Timer.Kind | Literal["wall", "cpu", "process"] | None = None,
    memory_unit: Memory.Unit | Literal["B", "KB", "MB", "GB"] | None = None,
    summary: bool = False,
) -> Iterator[None]:
    """Measure some interval in the trial.

    The results of the profiling will be available in the `.summary` attribute
    with the name of the interval as the key.

    ```python exec="true" source="material-block" result="python" title="profile"
    from amltk.optimization import Trial
    import time

    trial = Trial.create(name="trial", config={"x": 1})

    with trial.profile("some_interval"):
        # Do some work
        time.sleep(1)

    print(trial.profiler["some_interval"].time)
    trial.bucket.rmdir()  # markdown-exec: hide
    ```

    Args:
        name: The name of the interval.
        time: The timer kind to use for the trial. Defaults to the default
            timer kind of the profiler.
        memory_unit: The memory unit to use for the trial. Defaults to the
            default memory unit of the profiler.
        summary: Whether to add the interval to the summary.

    Yields:
        The interval measured. Values will be nan until the with block is finished.
    """
    with self.profiler(name=name, memory_unit=memory_unit, time_kind=time):
        yield

    if summary:
        profile = self.profiler[name]
        self.summary.update(profile.to_dict(prefix=name))

retrieve #

retrieve(
    key: str, *, check: type[R] | None = None
) -> R | Any

Retrieve items related to the trial.

retrieve
from amltk.optimization import Trial
from amltk.store import PathBucket

bucket = PathBucket("results")

# Create a trial, normally done by an optimizer
trial = Trial.create(name="trial", config={"x": 1}, bucket=bucket)

trial.store({"config.json": trial.config})
config = trial.retrieve("config.json")

print(config)
{'x': 1}
Parameters:

  • key (str): The key of the item to retrieve as said in .storage.
  • check (type[R] | None, default: None): If provided, will check that the retrieved item is of the provided type. If not, will raise a TypeError.

Returns:

  • R | Any: The retrieved item.

Raises:

  • TypeError: If check= is provided and the retrieved item is not of the provided type.

Source code in src/amltk/optimization/trial.py
def retrieve(self, key: str, *, check: type[R] | None = None) -> R | Any:
    """Retrieve items related to the trial.

    ```python exec="true" source="material-block" result="python" title="retrieve" hl_lines="7"
    from amltk.optimization import Trial
    from amltk.store import PathBucket

    bucket = PathBucket("results")

    # Create a trial, normally done by an optimizer
    trial = Trial.create(name="trial", config={"x": 1}, bucket=bucket)

    trial.store({"config.json": trial.config})
    config = trial.retrieve("config.json")

    print(config)
    trial.bucket.rmdir()  # markdown-exec: hide
    ```

    Args:
        key: The key of the item to retrieve as said in `.storage`.
        check: If provided, will check that the retrieved item is of the
            provided type. If not, will raise a `TypeError`.

    Returns:
        The retrieved item.

    Raises:
        TypeError: If `check=` is provided and  the retrieved item is not of the provided
            type.
    """  # noqa: E501
    return self.bucket[key].load(check=check)

rich_renderables #

rich_renderables() -> Iterable[RenderableType]

The renderables for rich for this report.

Source code in src/amltk/optimization/trial.py
def rich_renderables(self) -> Iterable[RenderableType]:
    """The renderables for rich for this report."""
    from rich.panel import Panel
    from rich.pretty import Pretty
    from rich.table import Table

    items: list[RenderableType] = []
    table = Table.grid(padding=(0, 1), expand=False)

    # Predfined things
    table.add_row("config", Pretty(self.config))

    if self.fidelities:
        table.add_row("fidelities", Pretty(self.fidelities))

    if any(self.extras):
        table.add_row("extras", Pretty(self.extras))

    if self.seed:
        table.add_row("seed", Pretty(self.seed))

    if self.bucket:
        table.add_row("bucket", Pretty(self.bucket))

    if self.metrics:
        items.append(
            Panel(Pretty(self.metrics), title="Metrics", title_align="left"),
        )

    # Dynamic things
    if self.summary:
        table.add_row("summary", Pretty(self.summary))

    if any(self.storage):
        table.add_row("storage", Pretty(self.storage))

    for name, profile in self.profiles.items():
        table.add_row("profile:" + name, Pretty(profile))

    items.append(table)

    yield from items

store #

store(items: Mapping[str, T]) -> None

Store items related to the trial.

store
from amltk.optimization import Trial
from amltk.store import PathBucket

trial = Trial.create(name="trial", config={"x": 1}, bucket=PathBucket("my-trial"))
trial.store({"config.json": trial.config})
print(trial.storage)
{'config.json'}
Parameters:

  • items (Mapping[str, T]): The items to store, a dict from the key to store it under to the item itself. If using a str, Path or PathBucket, the keys of the items should be a valid filename, including the correct extension, e.g. {"config.json": trial.config}

Source code in src/amltk/optimization/trial.py
def store(self, items: Mapping[str, T]) -> None:
    """Store items related to the trial.

    ```python exec="true" source="material-block" result="python" title="store" hl_lines="5"
    from amltk.optimization import Trial
    from amltk.store import PathBucket

    trial = Trial.create(name="trial", config={"x": 1}, bucket=PathBucket("my-trial"))
    trial.store({"config.json": trial.config})
    print(trial.storage)
    trial.bucket.rmdir()  # markdown-exec: hide
    ```

    Args:
        items: The items to store, a dict from the key to store it under
            to the item itself.If using a `str`, `Path` or `PathBucket`,
            the keys of the items should be a valid filename, including
            the correct extension. e.g. `#!python {"config.json": trial.config}`
    """  # noqa: E501
    self.bucket.store(items)
    # Add the keys to storage
    self.storage.update(items)

success #

success(**metrics: float | int) -> Report[I]

Generate a success report.

success
from amltk.optimization import Trial, Metric

loss_metric = Metric("loss", minimize=True)

trial = Trial.create(name="trial", config={"x": 1}, metrics=[loss_metric])
report = trial.success(loss=1)

print(report)
Trial.Report(trial=Trial(name='trial', config={'x': 1}, bucket=PathBucket(PosixPath('trial-trial-2024-04-24T14:12:09.965253')), metrics=MetricCollection(metrics={'loss': Metric(name='loss', minimize=True, bounds=None, fn=None)}), created_at=datetime.datetime(2024, 4, 24, 14, 12, 9, 965250), seed=None, fidelities={}, summary={}, storage=set(), extras={}), status=<Status.SUCCESS: 'success'>, reported_at=datetime.datetime(2024, 4, 24, 14, 12, 9, 965346), exception=None, values={'loss': 1})
Parameters:

  • **metrics (float | int): The metrics of the trial, where the key is the name of the metric and the value is its reported value.

Returns:

  • Report[I]: The report of the trial.

Source code in src/amltk/optimization/trial.py
def success(self, **metrics: float | int) -> Trial.Report[I]:
    """Generate a success report.

    ```python exec="true" source="material-block" result="python" title="success" hl_lines="7"
    from amltk.optimization import Trial, Metric

    loss_metric = Metric("loss", minimize=True)

    trial = Trial.create(name="trial", config={"x": 1}, metrics=[loss_metric])
    report = trial.success(loss=1)

    print(report)
    trial.bucket.rmdir()  # markdown-exec: hide
    ```

    Args:
        **metrics: The metrics of the trial, where the key is the name of the
            metrics and the value is the metric.

    Returns:
        The report of the trial.
    """  # noqa: E501
    values: dict[str, float] = {}

    for metric_def in self.metrics.values():
        if (reported_value := metrics.get(metric_def.name)) is not None:
            values[metric_def.name] = reported_value
        else:
            raise ValueError(
                f" Please provide a value for the metric '{metric_def.name}' as "
                " this is one of the metrics of the trial. "
                f"\n Try `trial.success({metric_def.name}=value, ...)`.",
            )

    # Need to check if anything extra was reported!
    extra = set(metrics.keys()) - self.metrics.keys()
    if extra:
        raise ValueError(
            f"Cannot report `success()` with extra metrics: {extra=}."
            f"\nOnly metrics {list(self.metrics)} as these are the metrics"
            " provided for this trial."
            "\nTo record other numerics, use `trial.summary` instead.",
        )

    return Trial.Report(trial=self, status=Trial.Status.SUCCESS, values=values)


History#

amltk.optimization.history #

The History is used to keep a structured record of what occured with Trials and their associated Reports.

Usage

from amltk.optimization import Trial, History, Metric
from amltk.store import PathBucket

loss = Metric("loss", minimize=True)

def target_function(trial: Trial) -> Trial.Report:
    x = trial.config["x"]
    y = trial.config["y"]
    trial.store({"config.json": trial.config})

    loss = x**2 - y
    return trial.success(loss=loss)

# ... usually obtained from an optimizer
bucket = PathBucket("all-trial-results")
history = History()

for x, y in zip([1, 2, 3], [4, 5, 6]):
    name = f"trial_{x}_{y}"
    trial = Trial.create(name=name, config={"x": x, "y": y}, bucket=bucket / name, metrics=[loss])
    report = target_function(trial)
    history.add(report)

print(history.df())

            status trial_seed  ... config:x config:y
name                           ...
trial_1_4  success       <NA>  ...        1        4
trial_2_5  success       <NA>  ...        2        5
trial_3_6  success       <NA>  ...        3        6

[3 rows x 10 columns]

You'll often need to perform some operations on a History, so we provide some utility functions here:

  • filter(key=...) - Filters the history by some predicate, e.g. history.filter(lambda report: report.status == "success")
  • groupby(key=...) - Groups the history by some key, e.g. history.groupby(lambda report: report.config["x"] < 5)
  • sortby(key=...) - Sorts the history by some key, e.g. history.sortby(lambda report: report.profiles["trial"].time.end)

There are also some serialization capabilities built in, to allow you to store your reports and load them back in later:

  • df(...) - Output a pd.DataFrame of all the information available.
  • from_df(...) - Create a History from a pd.DataFrame.

You can also retrieve individual reports from the history by using their name, e.g. history.reports["some-unique-name"] or iterate through the history with for report in history: ....
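As a small sketch of these operations, building a tiny history by hand; the lambdas below only use fields documented on Report above:

from amltk.optimization import Trial, History, Metric

loss = Metric("loss", minimize=True)
history = History()

for x in (3, 1, 2):
    trial = Trial.create(name=f"trial_{x}", config={"x": x}, metrics=[loss])
    history.add(trial.success(loss=float(x)))

# Keep only the successful reports
successes = history.filter(lambda report: report.status == Trial.Status.SUCCESS)

# Sort reports by their reported loss value
by_loss = history.sortby(lambda report: report.values["loss"])
print([report.name for report in by_loss])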