Multi-Agent DAC

As Xue et al. have shown, letting multiple controllers collaborate, with each configuring a single hyperparameter of the same algorithm, is a promising approach to solving DAC. To support further innovation in this direction, all of our environments with multiple configurable hyperparameters can be used in a Multi-Agent version. This lets users specify hyperparameters one by one instead of all in a single step, which is especially useful when interfacing with existing libraries.

In order to create a Multi-Agent DACBench environment, select one of the following benchmarks:

  • FunctionApproximation (Artificial Benchmark): Function approximation in multiple dimensions.

  • ToySGD (Artificial Benchmark): Controlling the learning rate in gradient descent.

  • CMA-ES: Step-size & algorithm component control for EAs backed by IOHProfiler.

To instantiate a benchmark environment, first set the ‘multi_agent’ key in the configuration to True and then create the environment as usual:

from dacbench.benchmarks import FunctionApproximationBenchmark
bench = FunctionApproximationBenchmark()
bench.config["multi_agent"] = True
env = bench.get_environment()

Running the benchmark is similar to, but not quite the same as, running a normal DACBench environment. First, you need to register the agents. Note that for this application it makes sense to use one agent per hyperparameter, even though it is technically possible to register fewer agents. Any remaining hyperparameters will then be sampled randomly, however, which can lead to adversarial effects. To register an agent, use the ID of the hyperparameter you want to control. If using ConfigSpace, you can also use the hyperparameter’s name:

from dacbench.agents import StaticAgent

# One static agent per controllable hyperparameter
agent_zero = StaticAgent(env, env.action_spaces[0].sample())
agent_one = StaticAgent(env, env.action_spaces[1].sample())
agents = [agent_zero, agent_one]

env.register_agent(0)
env.register_agent(1)

The episode loop is slightly different as well:

env.reset()
for agent in agents:
    observation, reward, terminated, truncated, info = env.last()
    action = agent.act(observation, reward)
    env.step(action)

For more information on this interface, consult the PettingZoo Documentation on which our interface is based.

Abstract Environment

class dacbench.abstract_env.AbstractEnv(config)[source]

Bases: ABC, Env

Abstract template for environments.

get_inst_id()[source]

Return instance ID.

Returns:

int: ID of current instance

get_instance()[source]

Return current instance.

Returns:

type flexible: Currently used instance

get_instance_set()[source]

Return instance set.

Returns:

list: List of instances

abstract reset(seed: int | None = None)[source]

Reset environment.

Parameters:
  • seed – Seed for the environment

Returns:
  • state – Environment state

  • info (dict) – Additional metainfo

reset_(seed=0, options=None, instance=None, instance_id=None, scheme=None)[source]

Pre-reset function for progressing through the instance set. Uses either a round-robin, random, or no progression scheme.
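The three progression schemes can be sketched as follows. This is an illustrative stand-alone snippet, not DACBench’s implementation; the helper name `next_instance_id` is hypothetical.

```python
import random

def next_instance_id(inst_id, instance_set, scheme, rng=random):
    """Hypothetical sketch of instance progression:
    pick the next instance index according to the chosen scheme."""
    if scheme == "round_robin":
        # Cycle through the instance set in order
        return (inst_id + 1) % len(instance_set)
    if scheme == "random":
        # Draw a fresh instance uniformly at random
        return rng.randrange(len(instance_set))
    # No progression: stay on the current instance
    return inst_id

instance_set = ["inst_a", "inst_b", "inst_c"]
ids = [0]
for _ in range(4):
    ids.append(next_instance_id(ids[-1], instance_set, "round_robin"))
print(ids)  # round robin cycles through all instances: [0, 1, 2, 0, 1]
```

With round robin, every instance in the set is visited before any repeats, which makes training coverage predictable; the random scheme trades that for decorrelated episodes.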

seed(seed=None, seed_action_space=False)[source]

Set rng seed.

Parameters:
  • seed – seed for rng

  • seed_action_space (bool, default False) – whether to seed the action space as well

set_inst_id(inst_id)[source]

Change current instance ID.

Parameters:

inst_id (int) – New instance index

set_instance(instance)[source]

Change currently used instance.

Parameters:

instance – New instance

set_instance_set(inst_set)[source]

Change instance set.

Parameters:

inst_set (list) – New instance set

abstract step(action)[source]

Execute environment step.

Parameters:
  • action – Action to take

Returns:
  • state – Environment state

  • reward – Environment reward

  • terminated (bool) – Run finished flag

  • truncated (bool) – Run timed out flag

  • info (dict) – Additional metainfo

step_()[source]

Pre-step function for step count and cutoff.

Returns:

bool: End of episode
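The bookkeeping this performs can be sketched as follows. This is a hypothetical stand-in, not DACBench’s code: it assumes the environment tracks a step counter against a cutoff and reports when the episode has timed out.

```python
def pre_step(step_count, cutoff):
    """Hypothetical sketch of pre-step bookkeeping: advance the step
    counter and report whether the cutoff ends the episode."""
    step_count += 1
    truncated = step_count >= cutoff
    return step_count, truncated

steps, done = 0, False
while not done:
    steps, done = pre_step(steps, cutoff=3)
print(steps)  # the episode is cut off after 3 steps
```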

use_next_instance(instance=None, instance_id=None, scheme=None)[source]

Changes the instance according to the chosen instance progression.

Parameters:
  • instance – Instance specification for potential new instances

  • instance_id – ID of the instance to switch to

  • scheme – Update scheme for this progression step (either round robin, random or no progression)

use_test_set()[source]

Change to test instance set.

use_training_set()[source]

Change to training instance set.

class dacbench.abstract_env.AbstractMADACEnv(config)[source]

Bases: AbstractEnv

Multi-Agent version of DAC environment.

property agent_selection

Current agent.

property infos

Current infos per agent.

last()[source]

Get current step data.

Returns:

np.array, float, bool, bool, dict: observation, reward, terminated, truncated and info for the current agent
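The turn-based flow around `agent_selection`, `last()` and `step()` can be illustrated with a minimal mock. This is a sketch of the PettingZoo-style interface only, not the real `AbstractMADACEnv`; the class name and internals are invented for illustration.

```python
class TurnBasedEnvSketch:
    """Hypothetical mock of the PettingZoo-style turn-based interface."""

    def __init__(self, agent_ids):
        self.possible_agents = list(agent_ids)
        self._index = 0
        self._rewards = {a: 0.0 for a in agent_ids}

    @property
    def agent_selection(self):
        # The agent whose turn it is to act
        return self.possible_agents[self._index]

    def last(self):
        # Observation, reward, terminated, truncated, info for the current agent
        agent = self.agent_selection
        return None, self._rewards[agent], False, False, {}

    def step(self, action):
        # Each step configures one hyperparameter, then control passes on
        self._index = (self._index + 1) % len(self.possible_agents)

env = TurnBasedEnvSketch([0, 1])
order = []
for _ in range(4):
    order.append(env.agent_selection)
    env.step(action=None)
print(order)  # agents act in turn: [0, 1, 0, 1]
```

Each registered agent acts exactly once per round, mirroring the episode loop shown in the usage section above.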

multi_agent_reset(seed: int | None = None)[source]

Reset env, but don’t return observations.

Parameters:

seed (int) – seed to use

multi_agent_step(action)[source]

Step for a single hyperparameter.

Parameters:

action – the action in the current agent’s dimension

property num_agents

Current number of agents.

register_agent(agent_id)[source]

Add agent.

Parameters:

agent_id (int) – id of the agent to add

remove_agent(agent_id)[source]

Remove agent.

Parameters:

agent_id (int) – id of the agent to remove

property rewards

Current reward values per agent.

property terminations

Current termination values per agent.

property truncations

Current truncation values per agent.