Node

A pipeline consists of Nodes, which hold the various attributes required to build a pipeline, such as the .item, its .space, its .config and so on.

The Nodes are connected to each other in a parent-child relationship, where the children are simply the .nodes that the parent leads to.

To give these attributes and relations meaning, there are various subclasses of Node which give different syntactic meanings when you want to construct something like a search_space() or build() some concrete object out of the pipeline.

For example, a Sequential node gives the meaning that each of its children in .nodes should follow one another while something like a Choice gives the meaning that only one of its children should be chosen.

You will likely never have to create a Node directly, but instead use the various components to create the pipeline.

Hashing

When hashing a node, e.g. to put it in a set or use it as a key in a dict, only the name of the node and the hashes of its children are used. This means that two nodes with the same name and the same connectivity will hash to the same value.

Equality

Equality is determined by comparing all the fields of the node, including the parent and branches fields. Two nodes are therefore considered equal only if they look the same and are connected to nodes that also look the same.
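The difference between hashing and equality described above can be illustrated with a small sketch. `ToyNode` is a hypothetical stand-in written for this example, not the amltk class, and it only models a name, children and a config.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in for a node; not the amltk implementation."""

    name: str
    nodes: tuple[ToyNode, ...] = ()
    config: dict | None = None

    def __hash__(self) -> int:
        # Only the name and the children take part in the hash.
        return hash((self.name, self.nodes))


# Same name and connectivity, different config.
a = ToyNode("scaler", config={"with_mean": True})
b = ToyNode("scaler", config={"with_mean": False})

same_hash = hash(a) == hash(b)  # hashes collide by design
equal = a == b                  # equality compares every field
distinct_in_set = len({a, b})   # unequal nodes both stay in the set
```

In this sketch the two nodes share a hash but are not equal, so a set or dict still keeps them apart.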

class RichOptions

Bases: NamedTuple

Options for rich printing.

class ParamRequest
dataclass

Bases: Generic[T]

A parameter request for a node. This is most useful for things like seeds.

key: str
attr

The key to request under.

default: T | object
classvar attr

The default value to use if the key is not found.

If left as _NotSet (default) then an error will be raised if the parameter is not found during configuration with configure().

has_default: bool
prop

Whether this request has a default value.
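As a rough illustration of how a request might be resolved during configuration, here is a minimal sketch. `ToyRequest`, `_NOT_SET` and `fulfil` are hypothetical names made up for this example. It follows the description above in raising an error when no value and no default is available, and the use of `KeyError` is an arbitrary choice rather than the library's actual exception.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping

_NOT_SET = object()  # hypothetical sentinel standing in for `_NotSet`


@dataclass(frozen=True)
class ToyRequest:
    """Hypothetical stand-in for ParamRequest."""

    key: str
    default: Any = _NOT_SET

    @property
    def has_default(self) -> bool:
        return self.default is not _NOT_SET


def fulfil(config: Mapping[str, Any], params: Mapping[str, Any]) -> dict[str, Any]:
    """Replace every request in `config` with a concrete value."""
    resolved: dict[str, Any] = {}
    for name, value in config.items():
        if not isinstance(value, ToyRequest):
            resolved[name] = value  # ordinary value, keep as-is
        elif value.key in params:
            resolved[name] = params[value.key]  # supplied by the caller
        elif value.has_default:
            resolved[name] = value.default  # fall back to the default
        else:
            raise KeyError(f"No value supplied for request {value.key!r}")
    return resolved


config = {
    "random_state": ToyRequest("seed"),
    "n_jobs": ToyRequest("cores", default=1),
}
resolved = fulfil(config, params={"seed": 42})
```

Here `random_state` is filled from the supplied params while `n_jobs` falls back to its default.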

class Node(*nodes, name, item=None, config=None, space=None, fidelities=None, config_transform=None, meta=None)
dataclass

Bases: RichRenderable, Generic[Item, Space]

The core node class for the pipeline.

These are simple objects that are named and linked together to form a chain. They are then wrapped in a Pipeline object to provide a convenient interface for interacting with the chain.

Source code in src/amltk/pipeline/node.py
def __init__(
    self,
    *nodes: Node,
    name: str,
    item: Item | Callable[[Item], Item] | None = None,
    config: Config | None = None,
    space: Space | None = None,
    fidelities: Mapping[str, Any] | None = None,
    config_transform: Callable[[Config, Any], Config] | None = None,
    meta: Mapping[str, Any] | None = None,
):
    """Initialize a node."""
    super().__init__()
    object.__setattr__(self, "name", name)
    object.__setattr__(self, "item", item)
    object.__setattr__(self, "config", config)
    object.__setattr__(self, "space", space)
    object.__setattr__(self, "fidelities", fidelities)
    object.__setattr__(self, "config_transform", config_transform)
    object.__setattr__(self, "meta", meta)
    object.__setattr__(self, "nodes", nodes)

name: str
classvar attr

Name of the node

item: Callable[..., Item] | Item | None
classvar attr

The item attached to this node

nodes: tuple[Node, ...]
classvar attr

The nodes that this node leads to.

config: Config | None
classvar attr

The configuration for this node

space: Space | None
classvar attr

The search space for this node

fidelities: Mapping[str, Any] | None
classvar attr

The fidelities for this node

config_transform: Callable[[Config, Any], Config] | None
classvar attr

A function that transforms the configuration of this node

meta: Mapping[str, Any] | None
classvar attr

Any meta information about this node

RICH_OPTIONS: RichOptions
classvar

Options for rich printing

def __getitem__(key)

Get the node with the given name.

Source code in src/amltk/pipeline/node.py
def __getitem__(self, key: str) -> Node:
    """Get the node with the given name."""
    found = first_true(
        self.nodes,
        None,
        lambda node: node.name == key,
    )
    if found is None:
        raise KeyError(
            f"Could not find node with name {key} in '{self.name}'."
            f" Available nodes are: {', '.join(node.name for node in self.nodes)}",
        )

    return found

def configure(config, *, prefixed_name=None, transform_context=None, params=None)

Configure this node and anything following it with the given config.

PARAMETER DESCRIPTION
config

The configuration to apply

TYPE: Config

prefixed_name

Whether items in the config are prefixed by the names of the nodes.

* If None, the default, then prefixed_name will be assumed to be True if this node has a next node or if the config has keys that begin with this node's name.
* If True, then the config will be searched for items prefixed by the name of the node (and subsequent chained nodes).
* If False, then the config will be searched for items without the prefix, i.e. the config keys are exactly those matching this node's search space.

TYPE: bool | None DEFAULT: None

transform_context

Any context to give to config_transform= of individual nodes.

TYPE: Any | None DEFAULT: None

params

The params to match any requests when configuring this node. These will match against any ParamRequests in the config and will be used to fill in any missing values.

TYPE: Mapping[str, Any] | None DEFAULT: None

RETURNS DESCRIPTION
Self

The configured node

Source code in src/amltk/pipeline/node.py
def configure(
    self,
    config: Config,
    *,
    prefixed_name: bool | None = None,
    transform_context: Any | None = None,
    params: Mapping[str, Any] | None = None,
) -> Self:
    """Configure this node and anything following it with the given config.

    Args:
        config: The configuration to apply
        prefixed_name: Whether items in the config are prefixed by the names
            of the nodes.
            * If `None`, the default, then `prefixed_name` will be assumed to
                be `True` if this node has a next node or if the config has
                keys that begin with this nodes name.
            * If `True`, then the config will be searched for items prefixed
                by the name of the node (and subsequent chained nodes).
            * If `False`, then the config will be searched for items without
                the prefix, i.e. the config keys are exactly those matching
                this nodes search space.
        transform_context: Any context to give to `config_transform=` of individual
            nodes.
        params: The params to match any requests when configuring this node.
            These will match against any ParamRequests in the config and will
            be used to fill in any missing values.

    Returns:
        The configured node
    """
    # Get the config for this node
    match prefixed_name:
        case True:
            config = mapping_select(config, f"{self.name}:")
        case False:
            pass
        case None if any(k.startswith(f"{self.name}:") for k in config):
            config = mapping_select(config, f"{self.name}:")
        case None:
            pass

    _kwargs: dict[str, Any] = {}

    # Configure all the branches if exists
    if len(self.nodes) > 0:
        nodes = tuple(
            node.configure(
                config,
                prefixed_name=True,
                transform_context=transform_context,
                params=params,
            )
            for node in self.nodes
        )
        _kwargs["nodes"] = nodes

    this_config = {
        hp: v
        for hp, v in config.items()
        if (
            ":" not in hp
            and not any(hp.startswith(f"{node.name}") for node in self.nodes)
        )
    }
    if self.config is not None:
        this_config = {**self.config, **this_config}

    this_config = dict(self._fufill_param_requests(this_config, params=params))

    if self.config_transform is not None:
        this_config = dict(self.config_transform(this_config, transform_context))

    if len(this_config) > 0:
        _kwargs["config"] = dict(this_config)

    return self.mutate(**_kwargs)
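The prefix handling in the listing above can be sketched in isolation. `select_prefixed` is a hypothetical helper that assumes `mapping_select` keeps the keys starting with the prefix and strips that prefix. `own_keys` mirrors the filter used to build `this_config`. Neither is the library's code.

```python
from typing import Any, Iterable, Mapping


def select_prefixed(config: Mapping[str, Any], prefix: str) -> dict[str, Any]:
    """Keep keys starting with `prefix` and strip it (assumed behaviour)."""
    return {k[len(prefix):]: v for k, v in config.items() if k.startswith(prefix)}


def own_keys(config: Mapping[str, Any], child_names: Iterable[str]) -> dict[str, Any]:
    """Keep keys that belong to the node itself rather than to a child."""
    names = list(child_names)
    return {
        k: v
        for k, v in config.items()
        if ":" not in k and not any(k.startswith(name) for name in names)
    }


full = {
    "pipeline:C": 1.0,
    "pipeline:svm:kernel": "rbf",
    "pipeline:scaler:with_mean": True,
}

# Step 1: the top-level node strips its own prefix.
for_pipeline = select_prefixed(full, "pipeline:")
# Step 2: keys without a further prefix belong to the node itself.
pipeline_own = own_keys(for_pipeline, ["svm", "scaler"])
# Step 3: each child repeats the selection with its own name.
for_svm = select_prefixed(for_pipeline, "svm:")
```

Each level strips one prefix, so a key such as `pipeline:svm:kernel` reaches the `svm` node as plain `kernel`.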

def fidelity_space()

Get the fidelities for this node and any connected nodes.

Source code in src/amltk/pipeline/node.py
def fidelity_space(self) -> dict[str, Any]:
    """Get the fidelities for this node and any connected nodes."""
    fids = {}
    for node in self.nodes:
        fids.update(prefix_keys(node.fidelity_space(), f"{self.name}:"))

    return fids

def linearized_fidelity(value)

Get the linearized fidelities for this node and any connected nodes.

PARAMETER DESCRIPTION
value

The value to linearize. Must be in [1.0, 100.0], which is the range the implementation asserts.

TYPE: float

RETURNS DESCRIPTION
dict[str, int | float | Any]

A dictionary mapping each key to its linearized fidelity.

Source code in src/amltk/pipeline/node.py
def linearized_fidelity(self, value: float) -> dict[str, int | float | Any]:
    """Get the linearized fidelities for this node and any connected nodes.

    Args:
        value: The value to linearize. Must be in [1.0, 100.0]

    Returns:
        A dictionary mapping each key to its linearized fidelity.
    """
    assert 1.0 <= value <= 100.0, f"{value=} not in [1.0, 100.0]"  # noqa: PLR2004
    d = {}
    for node in self.nodes:
        node_fids = prefix_keys(
            node.linearized_fidelity(value),
            f"{self.name}:",
        )
        d.update(node_fids)

    if self.fidelities is None:
        return d

    for f_name, f_range in self.fidelities.items():
        match f_range:
            case (int() | float(), int() | float()):
                low, high = f_range
                fid = low + (high - low) * (value - 1) / 100
                fid = fid if isinstance(low, float) else round(fid)
                d[f_name] = fid
            case _:
                raise ValueError(
                    f"Invalid fidelities to linearize {f_range} for {f_name}"
                    f" in {self}. Only supports ranges of the form (low, high)",
                )

    return prefix_keys(d, f"{self.name}:")
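To make the arithmetic concrete, the sketch below applies the formula that takes effect in the listing, `low + (high - low) * (value - 1) / 100`, to a single range. `linearize` is a hypothetical helper for illustration only and leaves out the key prefixing.

```python
def linearize(low, high, value):
    """Map `value` in [1.0, 100.0] onto the range (low, high)."""
    assert 1.0 <= value <= 100.0, f"{value=} not in [1.0, 100.0]"
    fid = low + (high - low) * (value - 1) / 100
    # Integer ranges are rounded, float ranges are returned unchanged.
    return fid if isinstance(low, float) else round(fid)


at_minimum = linearize(0.0, 1.0, 1.0)     # lowest allowed value maps to `low`
at_midpoint = linearize(0.0, 1.0, 51.0)   # halfway along the float range
int_range = linearize(10, 110, 51.0)      # integer range, so the result is rounded
at_maximum = linearize(0.0, 1.0, 100.0)   # close to, but not exactly, `high`
```

Note that with this formula a value of 100.0 maps to `low + 0.99 * (high - low)` rather than to `high` itself.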

def iter()

Iterate the nodes, including this node.

YIELDS DESCRIPTION
Node

The nodes connected to this node

Source code in src/amltk/pipeline/node.py
def iter(self) -> Iterator[Node]:
    """Iterate the nodes, including this node.

    Yields:
        The nodes connected to this node
    """
    yield self
    for node in self.nodes:
        yield from node.iter()
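The traversal order this produces is depth-first, with each node yielded before its children. A minimal sketch with a hypothetical `ToyNode` (not the amltk class) shows the order on a small tree:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in with only a name and children."""

    name: str
    nodes: tuple[ToyNode, ...] = ()

    def iter(self) -> Iterator[ToyNode]:
        yield self  # the node itself comes first
        for node in self.nodes:
            yield from node.iter()  # then each child's subtree in order


tree = ToyNode(
    "pipeline",
    (
        ToyNode("preprocess", (ToyNode("scaler"),)),
        ToyNode("svm"),
    ),
)
order = [node.name for node in tree.iter()]
```

The whole `preprocess` subtree is exhausted before `svm` is visited.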

def mutate(**kwargs)

Mutate the node with the given keyword arguments.

PARAMETER DESCRIPTION
**kwargs

The keyword arguments to mutate

TYPE: Any DEFAULT: {}

RETURNS DESCRIPTION
Self

The mutated node

Source code in src/amltk/pipeline/node.py
def mutate(self, **kwargs: Any) -> Self:
    """Mutate the node with the given keyword arguments.

    Args:
        **kwargs: The keyword arguments to mutate

    Returns:
        Self
            The mutated node
    """
    _args = ()
    _kwargs = {**self.__dict__, **kwargs}

    # If there's nodes in kwargs, we have to check if it's
    # a positional or keyword argument and handle accordingly.
    if (nodes := _kwargs.pop("nodes", None)) is not None:
        match self._NODES_INIT:
            case "args":
                _args = nodes
            case "kwargs":
                _kwargs["nodes"] = nodes
            case None if len(nodes) == 0:
                pass  # Just ignore it, it's popped out
            case None:
                raise ValueError(
                    "Cannot mutate nodes when __init__ does not accept nodes",
                )

    # If there's a config in kwargs, we have to check if it's actually got values
    config = _kwargs.pop("config", None)
    if config is not None and len(config) > 0:
        _kwargs["config"] = config

    # Lastly, we remove anything that can't be passed to kwargs of the
    # subclasses __init__
    _available_kwargs = inspect.signature(self.__init__).parameters.keys()  # type: ignore
    for k in list(_kwargs.keys()):
        if k not in _available_kwargs:
            _kwargs.pop(k)

    return self.__class__(*_args, **_kwargs)
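The final step of the listing, dropping anything that `__init__` does not accept, can be shown on its own. The `Base` class below is a hypothetical example with no nodes, so it only demonstrates merging the current attributes with the overrides and filtering them through `inspect.signature`.

```python
import inspect


class Base:
    """Hypothetical class used only to illustrate the filtering step."""

    def __init__(self, name, config=None):
        self.name = name
        self.config = config
        self.cache = {}  # instance state that __init__ does not accept

    def mutate(self, **kwargs):
        # Current attributes first, then the requested overrides.
        merged = {**self.__dict__, **kwargs}
        # Keep only what __init__ can take ("cache" is dropped here).
        accepted = inspect.signature(self.__init__).parameters.keys()
        init_kwargs = {k: v for k, v in merged.items() if k in accepted}
        return self.__class__(**init_kwargs)


original = Base("svm", config={"C": 1.0})
changed = original.mutate(config={"C": 10.0})
```

Despite the name, the sketch builds a new object through the constructor and leaves `original` untouched, which matches how the listing ends with a call to `self.__class__`.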

def copy()

Copy this node, removing all links in the process.

Source code in src/amltk/pipeline/node.py
def copy(self) -> Self:
    """Copy this node, removing all links in the process."""
    return self.mutate()

def path_to(key)

Find a path to the given node.

PARAMETER DESCRIPTION
key

The key to search for or a function that returns True if the node is the desired node

TYPE: str | Node | Callable[[Node], bool]

RETURNS DESCRIPTION
list[Node] | None

The path to the node if found, else None

Source code in src/amltk/pipeline/node.py
def path_to(self, key: str | Node | Callable[[Node], bool]) -> list[Node] | None:
    """Find a path to the given node.

    Args:
        key: The key to search for or a function that returns True if the node
            is the desired node

    Returns:
        The path to the node if found, else None
    """
    # We found our target, just return now

    match key:
        case Node():
            pred = lambda node: node == key
        case str():
            pred = lambda node: node.name == key
        case _:
            pred = key

    for path, node in self.walk():
        if pred(node):
            return path

    return None

def walk(path=None)

Walk the nodes in this chain.

PARAMETER DESCRIPTION
path

The current path to this node

TYPE: Sequence[Node] | None DEFAULT: None

YIELDS DESCRIPTION
list[Node]

The parents of the node and the node itself

Source code in src/amltk/pipeline/node.py
def walk(
    self,
    path: Sequence[Node] | None = None,
) -> Iterator[tuple[list[Node], Node]]:
    """Walk the nodes in this chain.

    Args:
        path: The current path to this node

    Yields:
        The parents of the node and the node itself
    """
    path = list(path) if path is not None else []
    yield path, self

    for node in self.nodes:
        yield from node.walk(path=[*path, self])
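A small sketch may help to show what each yielded pair contains. `ToyNode` is again a hypothetical stand-in. Its `walk` copies the logic of the listing, so the path holds the ancestors of a node but not the node itself.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, Sequence


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in with only a name and children."""

    name: str
    nodes: tuple[ToyNode, ...] = ()

    def walk(
        self,
        path: Sequence[ToyNode] | None = None,
    ) -> Iterator[tuple[list[ToyNode], ToyNode]]:
        path = list(path) if path is not None else []
        yield path, self  # ancestors so far, then this node
        for node in self.nodes:
            yield from node.walk(path=[*path, self])


tree = ToyNode(
    "pipeline",
    (
        ToyNode("preprocess", (ToyNode("scaler"),)),
        ToyNode("svm"),
    ),
)
paths = {node.name: [p.name for p in path] for path, node in tree.walk()}
```

Because `path_to` returns the path of the first matching pair, a lookup of `scaler` on this tree would give the ancestors `pipeline` and `preprocess`, and a lookup of the root would give an empty list.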

def find(key, default=None)

Find a node nested beneath this node. The search includes this node itself.

PARAMETER DESCRIPTION
key

The key to search for or a function that returns True if the node is the desired node

TYPE: str | Node | Callable[[Node], bool]

default

The value to return if the node is not found. Defaults to None

TYPE: T | None DEFAULT: None

RETURNS DESCRIPTION
Node | T | None

The node if found, otherwise the default value. Defaults to None

Source code in src/amltk/pipeline/node.py
def find(
    self,
    key: str | Node | Callable[[Node], bool],
    default: T | None = None,
) -> Node | T | None:
    """Find a node nested beneath this node.

    Args:
        key: The key to search for or a function that returns True if the node
            is the desired node
        default: The value to return if the node is not found. Defaults to None

    Returns:
        The node if found, otherwise the default value. Defaults to None
    """
    itr = self.iter()
    match key:
        case Node():
            return first_true(itr, default, lambda node: node == key)
        case str():
            return first_true(itr, default, lambda node: node.name == key)
        case _:
            return first_true(itr, default, key)  # type: ignore

def search_space(parser, *parser_args, **parser_kwargs)

Get the search space for this node.

PARAMETER DESCRIPTION
parser

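The parser to use. This can be a function that takes in the node and returns the search space, or a string that is one of:

* "configspace": Build a ConfigSpace.ConfigurationSpace out of this node.
* "optuna": Build a dict of hyperparameters that Optuna can use in its ask and tell methods.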

TYPE: Callable[Concatenate[Node, P], ParserOutput] | Literal['configspace', 'optuna']

parser_args

The positional arguments to pass to the parser

TYPE: args DEFAULT: ()

parser_kwargs

The keyword arguments to pass to the parser

TYPE: kwargs DEFAULT: {}

RETURNS DESCRIPTION
ParserOutput | ConfigurationSpace | OptunaSearchSpace

The search space

Source code in src/amltk/pipeline/node.py
def search_space(
    self,
    parser: (
        Callable[Concatenate[Node, P], ParserOutput]
        | Literal["configspace", "optuna"]
    ),
    *parser_args: P.args,
    **parser_kwargs: P.kwargs,
) -> ParserOutput | ConfigurationSpace | OptunaSearchSpace:
    """Get the search space for this node.

    Args:
        parser: The parser to use. This can be a function that takes in
            the node and returns the search space or a string that is one of:

            * `#!python "configspace"`: Build a
                [`ConfigSpace.ConfigurationSpace`](https://automl.github.io/ConfigSpace/master/)
                out of this node.
            * `#!python "optuna"`: Build a dict of hyperparameters that Optuna can
                use in its [ask and tell methods](https://optuna.readthedocs.io/en/stable/tutorial/20_recipes/009_ask_and_tell.html#define-and-run)

        parser_args: The positional arguments to pass to the parser
        parser_kwargs: The keyword arguments to pass to the parser

    Returns:
        The search space
    """
    match parser:
        case "configspace":
            from amltk.pipeline.parsers.configspace import parser as cs_parser

            return cs_parser(self, *parser_args, **parser_kwargs)  # type: ignore
        case "optuna":
            from amltk.pipeline.parsers.optuna import parser as optuna_parser

            return optuna_parser(self, *parser_args, **parser_kwargs)  # type: ignore
        case str():  # type: ignore
            raise ValueError(
                f"Invalid str for parser {parser}. "
                "Please use 'configspace' or 'optuna' or pass in your own"
                " parser function",
            )
        case _:
            return parser(self, *parser_args, **parser_kwargs)
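When a callable is passed, the listing simply forwards the node and any extra arguments to it. The sketch below shows that calling convention with a hypothetical `ToyNode` and a made-up `flat_parser` that collects every space into one dict of prefixed keys. The `delimiter` keyword and the output format are assumptions for illustration, not something the library prescribes.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ToyNode:
    """Hypothetical stand-in with a name, a space and children."""

    name: str
    space: dict[str, Any] | None = None
    nodes: tuple[ToyNode, ...] = ()

    def search_space(self, parser: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Only the callable branch is sketched; string shortcuts are omitted.
        return parser(self, *args, **kwargs)


def flat_parser(node: ToyNode, *, delimiter: str = ":") -> dict[str, Any]:
    """Collect the spaces of a node and its children under prefixed keys."""
    out = {f"{node.name}{delimiter}{k}": v for k, v in (node.space or {}).items()}
    for child in node.nodes:
        child_space = flat_parser(child, delimiter=delimiter)
        out.update({f"{node.name}{delimiter}{k}": v for k, v in child_space.items()})
    return out


tree = ToyNode("pipeline", nodes=(ToyNode("svm", space={"C": (0.1, 10.0)}),))
space = tree.search_space(flat_parser, delimiter=":")
```

The node is always the first positional argument to the parser, and everything else is passed through unchanged.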

def build(builder, *builder_args, **builder_kwargs)

Build a concrete object out of this node.

PARAMETER DESCRIPTION
builder

The builder to use. This can be a function that takes in the node and returns the object, or a string that is one of:

* "sklearn": Build a sklearn.pipeline.Pipeline out of this node.

TYPE: Callable[Concatenate[Node, P], BuilderOutput] | Literal['sklearn']

builder_args

The positional arguments to pass to the builder

TYPE: args DEFAULT: ()

builder_kwargs

The keyword arguments to pass to the builder

TYPE: kwargs DEFAULT: {}

RETURNS DESCRIPTION
BuilderOutput | Pipeline

The built object

Source code in src/amltk/pipeline/node.py
def build(
    self,
    builder: Callable[Concatenate[Node, P], BuilderOutput] | Literal["sklearn"],
    *builder_args: P.args,
    **builder_kwargs: P.kwargs,
) -> BuilderOutput | SklearnPipeline:
    """Build a concrete object out of this node.

    Args:
        builder: The builder to use. This can be a function that takes in
            the node and returns the object or a string that is one of:

            * `#!python "sklearn"`: Build a
                [`sklearn.pipeline.Pipeline`][sklearn.pipeline.Pipeline]
                out of this node.

        builder_args: The positional arguments to pass to the builder
        builder_kwargs: The keyword arguments to pass to the builder

    Returns:
        The built object
    """
    match builder:
        case "sklearn":
            from amltk.pipeline.builders.sklearn import build as _build

            return _build(self, *builder_args, **builder_kwargs)  # type: ignore
        case _:
            return builder(self, *builder_args, **builder_kwargs)

def display(*, full=False)

Display this node.

PARAMETER DESCRIPTION
full

Whether to display the full node or just a summary

TYPE: bool DEFAULT: False

Source code in src/amltk/pipeline/node.py
def display(self, *, full: bool = False) -> RenderableType:
    """Display this node.

    Args:
        full: Whether to display the full node or just a summary
    """
    if not full:
        return self.__rich__()

    from rich.console import Group as RichGroup

    return RichGroup(*self._rich_iter())

def request(key, default=_NotSet)

Create a new parameter request.

PARAMETER DESCRIPTION
key

The key to request under.

TYPE: str

default

The default value to use if the key is not found. If left as _NotSet (default) then the key will be removed from the config once configure is called and nothing has been provided.

TYPE: T | object DEFAULT: _NotSet

Source code in src/amltk/pipeline/node.py
def request(key: str, default: T | object = _NotSet) -> ParamRequest[T]:
    """Create a new parameter request.

    Args:
        key: The key to request under.
        default: The default value to use if the key is not found.
            If left as `_NotSet` (default) then the key will be removed from the
            config once [`configure`][amltk.pipeline.Node.configure] is called and
            nothing has been provided.
    """
    return ParamRequest(key=key, default=default)