BaseProfile

class confopt.profile.BaseProfile(sampler_type: str | SamplerType, searchspace_type: str | SearchSpaceType, epochs: int = 50, *, sampler_sample_frequency: str = 'step', is_partial_connection: bool = False, partial_connector_config: dict | None = None, dropout: float | None = None, perturbation: str | None = None, perturbator_sample_frequency: str = 'epoch', perturbator_config: dict | None = None, entangle_op_weights: bool = False, lora_rank: int = 0, lora_warm_epochs: int = 0, lora_toggle_epochs: list[int] | None = None, lora_toggle_probability: float | None = None, seed: int = 100, oles: bool = False, calc_gm_score: bool = False, prune_epochs: list[int] | None = None, prune_fractions: list[float] | None = None, is_arch_attention_enabled: bool = False, regularization_config: dict | None = None, sampler_arch_combine_fn: str = 'default', pt_select_architecture: bool = False, searchspace_domain: str | None = None, use_auxiliary_skip_connection: bool = False, searchspace_subspace: str | None = None, early_stopper: str | None = None, early_stopper_config: dict | None = None, synthetic_dataset_config: dict | None = None, extra_config: dict | None = None, use_dynamic_exploration: bool = False, dynamic_exploration_config: dict | None = None)

Bases: object

Base class for configuring the supernet and the experiment’s Profile.

This class serves as a foundational component for setting up various aspects of training the supernet, including search spaces, epochs, and more advanced configurations such as dropout rates, perturbation settings, regularization, pruning, and LoRA. It offers flexibility in modifying the supernet's training experiment through multiple setup methods. For further details on the specific configurations, please refer to the individual methods. The examples provided with each component show how it can be used.

Parameters:
  • sampler_type (SamplerType or str) – Type of sampler to use.

  • searchspace_type (SearchSpaceType or str) – Type of search space.

  • epochs (int) – Number of training epochs.

  • sampler_sample_frequency (str) – Frequency of sampling. Valid values are ‘step’ or ‘epoch’. Defaults to ‘step’.

  • is_partial_connection (bool) – Flag to enable partial connections in the supernet. Defaults to False.

  • partial_connector_config (dict, optional) – Configuration for partial connector if is_partial_connection is True.

  • dropout (float, optional) – Dropout operation rate for architectural parameters. Defaults to None.

  • perturbation (str, optional) – Type of perturbation to apply. Valid values are ‘adversarial’ and ‘random’. Defaults to None.

  • perturbator_sample_frequency (str) – Sampling frequency for perturbator. Defaults to ‘epoch’.

  • perturbator_config (dict, optional) – Configuration for the perturbator.

  • entangle_op_weights (bool) – Flag to enable weight entanglement between candidate operations. Defaults to False.

  • lora_rank (int) – Rank for LoRA configuration. Defaults to 0.

  • lora_warm_epochs (int) – Number of warm-up epochs for LoRA. Defaults to 0.

  • lora_toggle_epochs (list[int], optional) – Specific epochs to toggle LoRA configuration. Defaults to None.

  • lora_toggle_probability (float, optional) – Probability to toggle LoRA configuration. Defaults to None.

  • seed (int) – Seed for random number generators to ensure reproducibility. Defaults to 100.

  • oles (bool) – Flag to enable OLES (Operation-Level Early Stopping). Defaults to False.

  • calc_gm_score (bool) – Flag to calculate the Gradient Matching (GM) score for OLES. Required if oles is True.

  • prune_epochs (list[int], optional) – List of epochs to apply pruning at. Defaults to None.

  • prune_fractions (list[float], optional) – Fractions to prune in specified epochs. Defaults to None.

  • is_arch_attention_enabled (bool) – Flag to enable multi-head attention for architectural parameters. Defaults to False.

  • regularization_config (dict, optional) – Configuration for the regularization terms, used when regularization is active. Defaults to None.

  • sampler_arch_combine_fn (str) – Function to combine architecture samples. Used in FairDARTS. Defaults to ‘default’.

  • pt_select_architecture (bool) – Flag to enable the supernet's projection-based architecture selection. Defaults to False.

  • searchspace_domain (str, optional) – Domain/task of the TransNasBench101 search space. Defaults to None.

  • use_auxiliary_skip_connection (bool) – Flag to use auxiliary skip connections in the supernet's edges (OperationBlock). Defaults to False.

  • searchspace_subspace (str, optional) – Subspace of the NB1Shot1 search space. Defaults to None.

  • early_stopper (str, optional) – Strategy for early stopping. Defaults to None.

  • early_stopper_config (dict, optional) – Configuration for early stopping if early_stopper is not None.

  • synthetic_dataset_config (dict, optional) – Configuration for using a synthetic dataset. Defaults to None.

  • extra_config (dict, optional) – Any additional configuration that may be needed; for example, Weights & Biases metadata. Defaults to None.

  • use_dynamic_exploration (bool) – Flag to enable dynamic exploration in the supernet's edges (OperationBlock). Defaults to False.

  • dynamic_exploration_config (dict, optional) – Configuration for dynamic exploration if dynamic exploration is enabled.
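
Example (a minimal sketch; BaseProfile is typically used through a subclass that implements _initialize_sampler_config, and the string values "darts" and "nb201" are assumed sampler and search-space names, not verified ones). The later method examples reuse this profile instance.

>>> from confopt.profile import BaseProfile
>>> profile = BaseProfile(
...     sampler_type="darts",
...     searchspace_type="nb201",
...     epochs=100,
...     is_partial_connection=True,
...     seed=42,
... )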

_initialize_dropout_config() → None

Initialize the configuration for the dropout module.

Parameters:

None

Returns:

None

_initialize_partial_connector_config() → None

Initialize the configuration for the partial connector. If the is_partial_connection flag is disabled, the configuration is set to None, otherwise it is set to a default configuration.

Note

The default configuration for the partial connector includes:
  • k: 4 (1/k of the number of channels is used during training)

  • num_warm_epoch: 15 (the number of warm-up epochs)

Parameters:

None

Returns:

None

_initialize_perturbation_config() → None

Initialize the configuration for the perturbation based on the perturb_type.

Parameters:

None

Returns:

None

_initialize_regularization_config() → None

Initialize the configuration for the regularization module.

Parameters:

None

Returns:

None

_initialize_sampler_config() → None

Initializes the configuration for the sampler. Subclasses override this method with their own implementations.

Parameters:

None

Returns:

None

_initialize_trainer_config() → None

Initialize the configuration for the trainer based on the searchspace_type.

Parameters:

None

Returns:

None

_initialize_trainer_config_1shot1() → None

Initialize the configuration for the trainer based on the NasBench1Shot1 search space.

Parameters:

None

Returns:

None

_initialize_trainer_config_darts() → None

Initialize the configuration for the trainer based on the DARTS search space.

Parameters:

None

Returns:

None

_initialize_trainer_config_nb201() → None

Initialize the configuration for the trainer based on the NB201 search space.

Parameters:

None

Returns:

None

_initialize_trainer_config_tnb101() → None

Initialize the configuration for the trainer based on the TransNasBench101 search space.

Parameters:

None

Returns:

None

_set_dropout(dropout: float | None = None) → None

Initializes the default configurations for dropout.

Parameters:

dropout (float | None) – Dropout operation rate for architectural parameters. Must be in the range [0, 1). Defaults to None.

Returns:

None

_set_dynamic_exploration_configs(attention_weight: float = 1, min_attention_weight: float = 0.0001) → None

Set the configuration for the dynamic exploration oneshot module (DynamicAttentionExplorer).

Parameters:
  • attention_weight (float) – Initial attention weight for the DAN module. Defaults to 1.

  • min_attention_weight (float) – Minimum attention weight possible at the last schedule step. Defaults to 1e-4.

Returns:

None

_set_lora_configs(lora_rank: int = 0, lora_warm_epochs: int = 0, lora_dropout: float = 0, lora_alpha: int = 1, lora_toggle_probability: float | None = None, merge_weights: bool = True, toggle_epochs: list[int] | None = None) → None

Set the configuration for LoRA (Low-Rank Adaptation) layers.

Parameters:
  • lora_rank (int) – Rank for LoRA configuration. Defaults to 0.

  • lora_warm_epochs (int) – Number of warm-up epochs before initializing the LoRA A and B matrices. Defaults to 0.

  • lora_dropout (float) – Dropout rate for LoRA layers. Defaults to 0.

  • lora_alpha (int) – Scaling factor for LoRA layers. Defaults to 1.

  • lora_toggle_probability (float | None) – Probability to toggle LoRA and deactivate it. Defaults to None.

  • merge_weights (bool) – Flag to merge LoRA weights. Defaults to True.

  • toggle_epochs (list[int] | None) – Specific epochs to toggle LoRA configuration. Defaults to None.

Returns:

None

_set_oles_configs(oles: bool = False, calc_gm_score: bool = False) → None

Set the configuration for OLES (Operation-Level Early Stopping).

Parameters:
  • oles (bool) – Flag to enable OLES. Defaults to False.

  • calc_gm_score (bool) – Flag to calculate Gradient Matching score for OLES. Defaults to False.

Raises:

UserWarning – If OLES is enabled but calc_gm_score is not set to True.

Returns:

None

_set_partial_connector(is_partial_connection: bool = False) → None

Initializes the default configuration for the partial connector.

Parameters:

is_partial_connection (bool) – Flag to enable partial connections in the supernet. Defaults to False.

Returns:

None

_set_perturb(perturb_type: str | None = None, perturbator_sample_frequency: str = 'epoch') → None

Set the configuration for perturbation of the supernet.

Parameters:
  • perturb_type (str | None) – Type of perturbation to apply. Valid values are ‘adversarial’ and ‘random’.

  • perturbator_sample_frequency (str) – Sampling frequency for perturbator. Defaults to ‘epoch’.

Raises:
  • AssertionError – If perturbator_sample_frequency is not ‘epoch’ or ‘step’.

  • AssertionError – If perturb_type is not one of the string values ‘adversarial’, ‘random’, ‘none’, or None.

Returns:

None

_set_pruner_configs(prune_epochs: list[int] | None = None, prune_fractions: list[float] | None = None) → None

Set the configuration for the pruning of the supernet.

Parameters:
  • prune_epochs (list[int] | None) – List of epochs to apply pruning at.

  • prune_fractions (list[float] | None) – List of fractions to prune in the specified epochs.

Raises:

AssertionError – If prune_epochs and prune_fractions are not of the same length.

Returns:

None

_set_pt_select_configs(pt_select_architecture: bool = False, pt_projection_criteria: Literal['acc', 'loss'] = 'acc', pt_projection_interval: int = 10) → None

Set the configuration for the projection-based selection of the supernet.

Parameters:
  • pt_select_architecture (bool) – Flag to enable projection-based selection of the supernet.

  • pt_projection_criteria (str) – Criteria for projection. Can be ‘acc’ or ‘loss’.

  • pt_projection_interval (int) – Interval for applying the projection while training.

Returns:

None

configure_dropout(**kwargs) → None

Configure the dropout module for the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

p (float): Dropout probability of the architecture parameters.

p_min (float): Minimum dropout probability.

anneal_frequency (str): Frequency of annealing. Can be ‘epoch’ or ‘step’.

anneal_type (str): Type of annealing. Can be ‘linear’ or ‘cosine’.

max_iter (int): Maximum iterations for annealing.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
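
Example (illustrative values, reusing the profile instance from the class-level example; all keys are those documented above):

>>> profile.configure_dropout(
...     p=0.8,
...     p_min=0.1,
...     anneal_frequency="epoch",
...     anneal_type="linear",
...     max_iter=50,
... )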

configure_dynamic_explorer(**kwargs) → None

Configure the dynamic exploration module (DynamicAttentionExplorer) for the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

attention_weight (float): Initial attention weight for DAN.

minimum_attention_weight (float): Minimum attention weight possible at the last schedule step.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
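
Example (illustrative values; assumes the profile was created with use_dynamic_exploration=True):

>>> profile.configure_dynamic_explorer(
...     attention_weight=1.0,
...     minimum_attention_weight=1e-4,
... )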

configure_early_stopper(**config: Any) → None

Configure the early stopping mechanism for the supernet.

Parameters:

**config

Arbitrary keyword arguments. Possible keys include:

max_skip_normal (int): Maximum number of skip connections in normal cells

max_skip_reduce (int): Maximum number of skip connections in reduction cells

min_epochs (int): Minimum number of epochs to wait before stopping

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
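
Example (illustrative values; assumes the profile was created with an early_stopper strategy that understands these keys):

>>> profile.configure_early_stopper(
...     max_skip_normal=2,
...     max_skip_reduce=2,
...     min_epochs=20,
... )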

configure_extra(**config) → None

Configure any extra settings for the supernet. Could be useful for tracking Weights & Biases metadata.

Parameters:

**config – Arbitrary keyword arguments.

Returns:

None
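
Example (the keys are free-form; the ones below are made up to illustrate Weights & Biases metadata):

>>> profile.configure_extra(
...     project_name="confopt-demo",
...     run_notes="baseline supernet training",
... )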

configure_lora(**kwargs) → None

Configure the LoRA (Low-Rank Adaptation) module for the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

r (int): Rank for LoRA configuration.

lora_dropout (float): Dropout rate for LoRA layers.

lora_alpha (int): Scaling factor for LoRA layers.

merge_weights (bool): Flag to merge LoRA weights.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
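
Example (illustrative values; assumes the profile was created with a non-zero lora_rank and lora_warm_epochs):

>>> profile.configure_lora(
...     r=4,
...     lora_alpha=8,
...     lora_dropout=0.1,
...     merge_weights=True,
... )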

configure_oles(**kwargs) → None

Configure the OLES (Operation-Level Early Stopping) module for the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

oles (bool): Flag to enable OLES. Defaults to False.

calc_gm_score (bool): Flag to calculate GM score for OLES. Defaults to False.

frequency (int): How often the accumulated GM score is checked against the threshold.

threshold (float): Threshold of GM score to stop the training. Defaults to 0.4.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
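
Example (illustrative values; the threshold shown matches the documented default):

>>> profile.configure_oles(
...     oles=True,
...     calc_gm_score=True,
...     frequency=20,
...     threshold=0.4,
... )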

configure_partial_connector(**kwargs) → None

Configure the partial connector component for the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

k (int): Reciprocal of the fraction of channels to keep; 1/k of the channels are used during training.

num_warm_epoch (int): Number of warm-up epochs.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
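
Example (the values mirror the documented defaults; assumes the profile was created with is_partial_connection=True):

>>> profile.configure_partial_connector(k=4, num_warm_epoch=15)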

configure_perturbator(**kwargs) → None

Configures the perturbator used for training the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

For perturbation type ‘adversarial’:

epsilon (float): Perturbation strength.

data (tuple): Tuple of input data and target labels.

loss_criterion (torch.nn.Module): Loss function to use.

steps (int): Number of steps for perturbation.

random_start (bool): Flag to start with a random perturbation.

sample_frequency (str): Frequency of sampling. Can be ‘epoch’ or ‘step’.

for perturbation type ‘random’:

epsilon (float): Perturbation strength.

sample_frequency (str): Frequency of sampling. Can be ‘epoch’ or ‘step’.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
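
Example (illustrative values for the ‘adversarial’ case; assumes the profile was created with perturbation='adversarial'):

>>> profile.configure_perturbator(
...     epsilon=0.03,
...     steps=7,
...     random_start=True,
...     sample_frequency="epoch",
... )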

configure_pt_selection(**kwargs) → None

Configure the projection-based selection for the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

projection_interval (int): Interval for applying the projection while training.

projection_criteria (str): Criteria for projection. Can be ‘acc’ or ‘loss’.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
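
Example (illustrative values; assumes the profile was created with pt_select_architecture=True):

>>> profile.configure_pt_selection(
...     projection_interval=10,
...     projection_criteria="acc",
... )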

configure_regularization(**kwargs) → None

Configure the regularization module for the supernet.

There are three different types of regularization in Configurable Optimizer:
  • FairDARTS: FairDARTSRegularizationTerm

  • FLOPS: FLOPSRegularizationTerm

  • DrNAS: DrNASRegularizationTerm

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

reg_weights (list[float]): List of weights for each regularization term.

loss_weight (float): Weight for the loss term.

active_reg_terms (list[str]): List of types of regularization terms.

drnas_config (dict): Configuration for DrNAS regularization. This dictionary can contain the following keys:

reg_scale (float): Scale for the regularization term.

reg_type (str): Type of regularization. Can be ‘l1’ or ‘kl’.

flops_config (dict): Configuration for FLOPS regularization.

fairdarts_config (dict): Configuration for FairDARTS regularization.

Returns:

None
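
Example (illustrative values; the term name "drnas" in active_reg_terms is an assumption about how the DrNAS term is identified):

>>> profile.configure_regularization(
...     active_reg_terms=["drnas"],
...     reg_weights=[0.5],
...     loss_weight=1.0,
...     drnas_config={"reg_scale": 1e-3, "reg_type": "l1"},
... )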

configure_sampler(**kwargs) → None

Configures the sampler used for training the supernet, based on the sampler type.

Parameters:

**kwargs – Keyword arguments for configuring the sampler. The keys should match the expected configuration parameters.

Raises:

AssertionError – If any of the provided configuration keys are not valid for the sampler type.

Returns:

None
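
Example (a sketch only; valid keys depend on the concrete sampler type, and sample_frequency is an assumed key here):

>>> profile.configure_sampler(sample_frequency="step")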

configure_searchspace(**config: Any) → None

Configure the search space for the supernet.

Parameters:

**config – Arbitrary keyword arguments. Possible keys depend on the search space type. For more information, please check the parameters of each search space's supernet.

Returns:

None
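
Example (a sketch only; num_classes and layers are hypothetical keys whose availability depends on the chosen search space's supernet):

>>> profile.configure_searchspace(num_classes=10, layers=8)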

configure_synthetic_dataset(**config: Any) → None

Configure the synthetic dataset for the supernet.

Parameters:

**config

Arbitrary keyword arguments. Possible keys include:

signal_width (int): Width of the signal patch.

shortcut_width (int): Width of the shortcut patch.

shortcut_strength (int): Probability of the shortcut signal being the valid signal.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
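
Example (illustrative values only):

>>> profile.configure_synthetic_dataset(
...     signal_width=5,
...     shortcut_width=3,
...     shortcut_strength=1,
... )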

configure_trainer(**kwargs) → None

Configure the trainer component of the supernet.

Parameters:

**kwargs

Arbitrary keyword arguments. Possible keys include:

lr (float): Learning rate of the optimizer.

arch_lr (float): Learning rate of architecture parameters.

epochs (int): Number of training epochs.

optim (str): Optimizer type. Can be ‘sgd’, ‘adam’, etc.

arch_optim (str): Optimizer type of architecture parameters.

optim_config (dict): Additional configuration of the model's optimizer. Possible sub-keys include:

  momentum (float): Momentum for the model's optimizer.

  nesterov (bool): Flag to use Nesterov momentum for the model's optimizer.

  weight_decay (float): Weight decay for the model's optimizer.

arch_optim_config (dict): Additional configuration of the architecture optimizer. Possible sub-keys include:

  weight_decay (float): Weight decay for the architecture optimizer.

  beta (float): Beta for the architecture optimizer.

scheduler (str): Scheduler’s type.

scheduler_config (dict): Additional configuration of the scheduler.

criterion (str): Loss function type.

batch_size (int): Batch size for training.

learning_rate_min (float): Minimum learning rate.

cutout (int): Enables cutout. If cutout > 0, cutout is applied.

cutout_length (int): Cutout length.

train_portion (float): Portion of the training data to use for training the model's parameters.

use_data_parallel (bool): Flag to use data parallelism.

checkpointing_freq (int): Frequency of checkpointing.

seed (int): Seed for random number generators.

Raises:

AssertionError – If any of the provided configuration keys are not valid.

Returns:

None
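
Example (illustrative values loosely following common DARTS-style settings; none of these are library defaults):

>>> profile.configure_trainer(
...     lr=0.025,
...     arch_lr=3e-4,
...     optim="sgd",
...     arch_optim="adam",
...     batch_size=64,
...     train_portion=0.5,
... )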

get_config() → dict

This method returns a dictionary representation of the Profile class. The configurations are used for training the supernet.

Parameters:

None

Returns:

dict – A dictionary representation of the Profile class.
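
Example (reusing the profile instance from the class-level example):

>>> config = profile.get_config()
>>> isinstance(config, dict)
True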

get_run_description() → str

This method returns a string description of the run configuration. The description is used for tracking purposes in Weights & Biases.

Parameters:

None

Returns:

str – A string describing the run configuration.
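
Example (the exact string format is determined by the library):

>>> description = profile.get_run_description()
>>> isinstance(description, str)
True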