Multi-Agent DAC¶
As Xue et al. have shown, having multiple controllers collaborate, each configuring a single hyperparameter of the same algorithm, is a promising approach for solving DAC. To support further innovation in this direction, all of our environments with multiple configurable hyperparameters can be used in a Multi-Agent version. This lets users specify hyperparameters one by one instead of in a single step and is therefore especially useful when interfacing existing libraries.
In order to create a Multi-Agent DACBench environment, select any of the following benchmarks:
FunctionApproximation (Artificial Benchmark): Function approximation in multiple dimensions.
ToySGD (Artificial Benchmark): Controlling the learning rate in gradient descent.
CMA-ES: Step-size & algorithm component control for EAs backed by IOHProfiler.
To instantiate a benchmark environment, first set the ‘multi_agent’ key in the configuration to True and then create the environment as usual:
from dacbench.benchmarks import FunctionApproximationBenchmark
bench = FunctionApproximationBenchmark()
bench.config["multi_agent"] = True
env = bench.get_environment()
Running the benchmark is similar to, but not quite the same as, running a normal DACBench environment. First, you need to register the agents. Note that for this application, it makes sense to use one agent per hyperparameter, even though it is technically possible to register fewer agents. The remaining hyperparameters will then be sampled randomly, however, which could lead to adversarial effects. To register an agent, use the ID of the hyperparameter you want to control. If using ConfigSpace, you can also use the hyperparameter's name:
from dacbench.agents import StaticAgent
agent_zero = StaticAgent(env, env.action_spaces[0].sample())
agent_one = StaticAgent(env, env.action_spaces[1].sample())
agents = [agent_zero, agent_one]
env.register_agent(0)
env.register_agent(1)
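If the benchmark defines its hyperparameters via ConfigSpace, registration by name is also possible, as mentioned above. A minimal sketch; note that "hyperparameter_0" is a placeholder name, not an actual hyperparameter of FunctionApproximation, and must be replaced with a name from the benchmark's configuration space:
# Hypothetical alternative: register by ConfigSpace hyperparameter name.
# "hyperparameter_0" is a placeholder and must match an actual
# hyperparameter in the benchmark's configuration space.
env.register_agent("hyperparameter_0")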
The episode loop is slightly different as well:
env.reset()
for agent in agents:
    observation, reward, terminated, truncated, info = env.last()
    action = agent.act(observation, reward)
    env.step(action)
For more information on this interface, consult the PettingZoo documentation, on which our interface is based.
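Putting these pieces together, a complete episode could look like the sketch below. It assumes the PettingZoo-style convention that last() reports the observation, reward, and termination flags for the agent whose turn it is; the cutoff handling is an illustrative assumption:
env.reset()
terminated, truncated = False, False
while not (terminated or truncated):
    for agent in agents:
        observation, reward, terminated, truncated, info = env.last()
        if terminated or truncated:
            break  # episode is over, stop acting
        action = agent.act(observation, reward)
        env.step(action)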
Abstract Environment¶
- class dacbench.abstract_env.AbstractEnv(config)[source]¶
Bases: ABC, Env
Abstract template for environments.
- abstract reset(seed: int | None = None)[source]¶
Reset environment.
- Parameters:
seed – Seed for the environment
- Returns:
state – Environment state
info (dict) – Additional metainfo
- reset_(seed=0, options=None, instance=None, instance_id=None, scheme=None)[source]¶
Pre-reset function for progressing through the instance set. Uses either a round-robin, random, or no progression scheme.
- seed(seed=None, seed_action_space=False)[source]¶
Set RNG seed.
- Parameters:
seed – Seed for the RNG
seed_action_space (bool, default False) – Whether to seed the action space as well
- set_inst_id(inst_id)[source]¶
Change current instance ID.
- Parameters:
inst_id (int) – New instance index
- set_instance_set(inst_set)[source]¶
Change instance set.
- Parameters:
inst_set (list) – New instance set
- abstract step(action)[source]¶
Execute environment step.
- Parameters:
action – Action to take
- Returns:
state – Environment state
reward – Environment reward
terminated (bool) – Run finished flag
truncated (bool) – Run timed out flag
info (dict) – Additional metainfo
- use_next_instance(instance=None, instance_id=None, scheme=None)[source]¶
Changes the instance according to the chosen instance progression scheme.
- Parameters:
instance – Instance specification for potential new instances
instance_id – ID of the instance to switch to
scheme – Update scheme for this progression step (either round robin, random, or no progression)
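To make the reset/step contract concrete, here is a minimal, hypothetical subclass. The call to reset_ uses the pre-reset hook documented above; the state, reward, and cutoff logic are purely illustrative assumptions, not taken from any real benchmark:
import numpy as np
from dacbench.abstract_env import AbstractEnv

class MinimalEnv(AbstractEnv):
    """Illustrative sketch only, not a real DACBench environment."""

    def reset(self, seed=None):
        super().reset_(seed)  # progress through the instance set (see reset_ above)
        self.steps = 0  # illustrative step counter
        return np.zeros(1), {}  # state, info

    def step(self, action):
        self.steps += 1
        reward = -abs(float(action))  # illustrative reward
        truncated = self.steps >= 10  # illustrative cutoff
        # state, reward, terminated, truncated, info (see step above)
        return np.zeros(1), reward, False, truncated, {}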
- class dacbench.abstract_env.AbstractMADACEnv(config)[source]¶
Bases: AbstractEnv
Multi-Agent version of DAC environment.
- property agent_selection¶
Current agent.
- property infos¶
Current infos per agent.
- multi_agent_reset(seed: int | None = None)[source]¶
Reset env, but don’t return observations.
- Parameters:
seed (int) – seed to use
- multi_agent_step(action)[source]¶
Step for a single hyperparameter.
- Parameters:
action – the action in the current agent’s dimension
- property num_agents¶
Current number of agents.
- remove_agent(agent_id)[source]¶
Remove agent.
- Parameters:
agent_id (int) – id of the agent to remove
- property rewards¶
Current reward values per agent.
- property terminations¶
Current termination values per agent.
- property truncations¶
Current truncation values per agent.
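As a usage note, the per-agent properties above can be inspected between steps. A brief sketch, assuming agents were registered as in the example above, that agent_selection returns the registered hyperparameter ID, and that the properties follow the PettingZoo convention of values keyed by agent:
env.reset()
current = env.agent_selection  # assumed to be the registered hyperparameter ID
env.step(env.action_spaces[current].sample())  # random action for the current agent
print(env.rewards)       # reward per registered agent
print(env.terminations)  # termination flag per agent
print(env.truncations)   # truncation flag per agent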