MAFixedwingDogfightEnvV2

Task Description
This is a reinforcement learning environment for training AI agents to perform aerial dogfighting.
Usage
```python
from PyFlyt.pz_envs import MAFixedwingDogfightEnvV2

env = MAFixedwingDogfightEnvV2(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)

env.close()
```
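The loop above follows the standard PettingZoo parallel API: the episode runs until `env.agents` is empty. As a sketch of a common pattern built on that loop, the snippet below accumulates per-agent episode returns. The `DummyParallelEnv` class is hypothetical and only mimics the `reset`/`step`/`agents` interface, so PyFlyt is not required to run it:

```python
# Illustrative only: DummyParallelEnv is a hypothetical stand-in that mimics
# the PettingZoo parallel API surface (reset, step, agents). It is NOT part
# of PyFlyt; it just lets the loop pattern run without the real simulator.
import random


class DummyParallelEnv:
    """Minimal stand-in for a PettingZoo parallel environment."""

    def __init__(self, n_agents=2, max_steps=5):
        self.possible_agents = [f"agent_{i}" for i in range(n_agents)]
        self.max_steps = max_steps
        self.agents = []
        self._step = 0

    def reset(self):
        self.agents = list(self.possible_agents)
        self._step = 0
        observations = {a: [0.0] for a in self.agents}
        infos = {a: {} for a in self.agents}
        return observations, infos

    def step(self, actions):
        self._step += 1
        done = self._step >= self.max_steps
        observations = {a: [float(self._step)] for a in self.agents}
        rewards = {a: random.random() for a in self.agents}
        terminations = {a: done for a in self.agents}
        truncations = {a: False for a in self.agents}
        infos = {a: {} for a in self.agents}
        if done:
            # PettingZoo convention: emptying the agent list ends the loop.
            self.agents = []
        return observations, rewards, terminations, truncations, infos


env = DummyParallelEnv()
observations, infos = env.reset()

# Accumulate each agent's return over the episode.
returns = {a: 0.0 for a in env.agents}
while env.agents:
    actions = {a: None for a in env.agents}  # replace with your policy
    observations, rewards, terminations, truncations, infos = env.step(actions)
    for a, r in rewards.items():
        returns[a] += r
```

With the real environment, the loop body is identical; only the construction line changes.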
Environment Rules
This is a cannons-only environment, meaning there are no missiles. An agent has to point its nose directly at the enemy for a shot to count as a hit.

The gun is only effective within lethal range; outside of this range, the gun deals no damage. The gun fires automatically whenever it can; there is no action for the agent to fire the weapon. This is similar to the fire control systems on many modern aircraft.
An agent loses if it:

a) hits anything,
b) flies out of bounds, or
c) loses all its health.
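The hit rule described above can be sketched as a simple geometric test: a shot lands only when the enemy is within `lethal_distance` and inside the narrow cone (half-angle `lethal_angle_radians`) around the attacker's nose. The function below is an assumption about how such a test works, not PyFlyt's actual internals; the default values match the parameters documented further down.

```python
# Hedged sketch of a cannon hit test: the function name and exact geometry
# are assumptions for illustration, not PyFlyt's implementation.
import math


def in_lethal_cone(attacker_pos, attacker_forward, target_pos,
                   lethal_distance=20.0, lethal_angle_radians=0.07):
    """Return True if the target is inside the attacker's cone of fire."""
    # Vector from attacker to target, and the distance between them.
    to_target = [t - a for a, t in zip(attacker_pos, target_pos)]
    dist = math.sqrt(sum(c * c for c in to_target))
    if dist == 0.0 or dist > lethal_distance:
        return False

    # Angle between the attacker's nose direction and the line to the target.
    fwd_norm = math.sqrt(sum(c * c for c in attacker_forward))
    cos_angle = sum(f * t for f, t in zip(attacker_forward, to_target))
    cos_angle /= fwd_norm * dist
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))
    return angle < lethal_angle_radians


# A target 10 m dead ahead is hittable; one 30 m ahead is out of range,
# and one 10 m ahead but 5 m to the side is outside the 0.07 rad cone.
print(in_lethal_cone((0, 0, 0), (1, 0, 0), (10, 0, 0)))  # True
print(in_lethal_cone((0, 0, 0), (1, 0, 0), (30, 0, 0)))  # False
print(in_lethal_cone((0, 0, 0), (1, 0, 0), (10, 5, 0)))  # False
```

At the default 0.07 rad (about 4 degrees), the cone is very tight, which is why the rules stress pointing the nose directly at the enemy.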
Environment Options
- class PyFlyt.pz_envs.fixedwing_envs.ma_fixedwing_dogfight_env.MAFixedwingDogfightEnv(team_size: int = 2, spawn_min_radius: float = 10.0, spawn_max_radius: float = 50.0, spawn_min_height: float = 20.0, spawn_max_height: float = 50.0, damage_per_hit: float = 0.003, lethal_distance: float = 20.0, lethal_angle_radians: float = 0.07, assisted_flight: bool = True, aggressiveness: float = 0.5, cooperativeness: float = 0.5, sparse_reward: bool = False, flatten_observation: bool = True, flight_dome_size: float = 800.0, max_duration_seconds: float = 60.0, agent_hz: int = 30, render_mode: None | str = None)
Team Dogfighting Environment for the Acrowing model using the PettingZoo API.
- Parameters:
team_size (int) – number of planes that comprises a team.
spawn_min_radius (float) – agents are spawned in a circle pointing outwards, this value is the min radius of that circle.
spawn_max_radius (float) – agents are spawned in a circle pointing outwards, this value is the max radius of that circle.
spawn_min_height (float) – minimum height to spawn the agents at the beginning of the simulation.
spawn_max_height (float) – maximum height to spawn the agents at the beginning of the simulation.
damage_per_hit (float) – how much damage per hit per physics step, each agent starts with a health of 1.0.
lethal_distance (float) – how close before weapons become effective.
lethal_angle_radians (float) – the width of the cone of fire.
too_close_distance (float) – the minimum distance that a drone must maintain from another drone before a penalty is incurred.
assisted_flight (bool) – whether to use high level commands (RPYT) instead of full actuator commands.
aggressiveness (float) – a value between 0 and 1 controlling how greedy the reward function is; lower values lead to greedier policies.
cooperativeness (float) – a value between 0 and 1 controlling how strongly the reward function encourages cooperation between teammates.
sparse_reward (bool) – whether to use sparse rewards or not.
flatten_observation (bool) – whether to flatten the observation into a single array.
flight_dome_size (float) – size of the allowable flying area.
max_duration_seconds (float) – maximum simulation time of the environment.
agent_hz (int) – looprate of the agent to environment interaction.
render_mode (None | str) – can be "human" or None.
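As a quick sanity check on the damage defaults listed above: each agent starts with a health of 1.0 and each hit removes damage_per_hit = 0.003 per physics step, so destroying an agent takes ceil(1.0 / 0.003) = 334 damaging physics steps of sustained fire. (The physics step rate is an internal detail; agent_hz only sets the agent-to-environment interaction rate.) A minimal calculation using only the documented defaults:

```python
# Time-to-kill arithmetic from the documented defaults: health starts at 1.0
# and each damaging physics step removes damage_per_hit.
import math

health = 1.0
damage_per_hit = 0.003  # default from the class signature above
hits_to_kill = math.ceil(health / damage_per_hit)
print(hits_to_kill)  # 334
```

Raising damage_per_hit shortens engagements proportionally, which can be useful for making the sparse-reward setting easier to learn.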