GDAS

Bases: DARTSOptimizer

Implementation of the GDAS optimizer introduced in "Searching for a Robust Neural Architecture in Four GPU Hours" by Dong and Yang (2019). Inherits functionalities from DARTSOptimizer and includes additional functionalities specific to GDAS.

`init(learning_rate=0.025, momentum=0.9, weight_decay=0.0003, grad_clip=5, unrolled=False, arch_learning_rate=0.0003, arch_weight_decay=0.001, op_optimizer='SGD', arch_optimizer='Adam', loss_criteria='CrossEntropyLoss', epochs=50, tau_min=0.1, tau_max=10.0, **kwargs)`

Initialize a new instance of the GDASOptimizer class.

Parameters:

Name	Type	Description	Default
`epochs`	`int`	Total number of training epochs.	`50`
`tau_max`	`float`	Initial value of tau.	`10.0`
`tau_min`	`float`	The minimum value to which tau is decayed.	`0.1`
`op_optimizer`	`str`	The optimizer type for operation weights. E.g., 'SGD'	`'SGD'`
`arch_optimizer`	`str`	The optimizer type for architecture weights. E.g., 'Adam'	`'Adam'`
`loss_criteria`	`str`	Loss criteria. E.g., 'CrossEntropyLoss'	`'CrossEntropyLoss'`
`grad_clip`	`float`	Clipping of the gradients. Default None.	`5`
`**kwargs`		Additional keyword arguments.	`{}`

`adapt_search_space(search_space, dataset, scope=None)`

Adapt the search space for GDAS architecture search.

Parameters:

Name	Description	Default
`search_space`	The initial search space.	required
`dataset`	The dataset for training/validation.	required
`scope`	Scope to update in the search space. Default is None.	`None`

`new_epoch(epoch)`

Update the tau parameter at the edges at the beginning of each new epoch.

Parameters:

Name	Type	Description	Default
`epoch`	`int`	Current epoch number.	required

`remove_sampled_alphas(edge)` `staticmethod`

Remove sampled architecture weights (alphas) from the edge's data.

Parameters:

Name	Type	Description	Default
`edge`		The edge in the computation graph where the sampled architecture weights are to be removed.	required

`sample_alphas(edge, tau)` `staticmethod`

Sample architecture weights (alphas) using the Gumbel-Softmax distribution parameterized by tau.

Parameters:

Name	Type	Description	Default
`edge`		The edge in the computation graph where the sampled architecture weights are set.	required
`tau`	`torch.Tensor`	The tau parameter controlling the temperature of the Gumbel-Softmax distribution.	required

`step(data_train, data_val)`

Perform a single optimization step for both architecture and operation weights.

Parameters:

Name	Type	Description	Default
`data_train`	`tuple`	Training data as a tuple of inputs and labels.	required
`data_val`	`tuple`	Validation data as a tuple of inputs and labels.	required

Returns:

Name	Type	Description
`tuple`		Logits for training data, logits for validation data, loss for training data, loss for validation data.

`update_ops(edge)` `staticmethod`

Replace the primitive operations at the edge with the GDAS-specific GDASMixedOp.

Parameters:

Name	Type	Description	Default
`edge`		The edge in the computation graph where the operations are to be replaced.	required

GDAS

__init__(learning_rate=0.025, momentum=0.9, weight_decay=0.0003, grad_clip=5, unrolled=False, arch_learning_rate=0.0003, arch_weight_decay=0.001, op_optimizer='SGD', arch_optimizer='Adam', loss_criteria='CrossEntropyLoss', epochs=50, tau_min=0.1, tau_max=10.0, **kwargs)

adapt_search_space(search_space, dataset, scope=None)

new_epoch(epoch)

remove_sampled_alphas(edge) staticmethod

sample_alphas(edge, tau) staticmethod

step(data_train, data_val)

update_ops(edge) staticmethod

`init(learning_rate=0.025, momentum=0.9, weight_decay=0.0003, grad_clip=5, unrolled=False, arch_learning_rate=0.0003, arch_weight_decay=0.001, op_optimizer='SGD', arch_optimizer='Adam', loss_criteria='CrossEntropyLoss', epochs=50, tau_min=0.1, tau_max=10.0, **kwargs)`

`adapt_search_space(search_space, dataset, scope=None)`

`new_epoch(epoch)`

`remove_sampled_alphas(edge)` `staticmethod`

`sample_alphas(edge, tau)` `staticmethod`

`step(data_train, data_val)`

`update_ops(edge)` `staticmethod`