MAFixedwingDogfightEnvV2#

https://raw.githubusercontent.com/jjshoots/PyFlyt/master/readme_assets/fixedwing_dogfight.gif

Task Description#

This is a multi-agent reinforcement learning environment, built on the PettingZoo API, for training teams of fixed-wing agents in aerial dogfighting.

Usage#

from PyFlyt.pz_envs import MAFixedwingDogfightEnvV2

env = MAFixedwingDogfightEnvV2(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()

Environment Rules#

  • This is a cannons-only environment: there are no missiles. An agent has to point its nose directly at the enemy for a shot to be considered a hit (see the sketch after this list).

  • The gun is only effective within lethal range. Outside of this range, the gun deals no damage.

  • The gun fires automatically whenever it can; there is no separate fire action for the agent. This is similar to the fire control systems on many modern aircraft.

  • An agent loses if it: a) hits anything, b) flies out of bounds, or c) loses all of its health.
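
Putting the first two rules together, a hit requires the enemy to be both within lethal range and inside the cone of fire around the attacker's nose. Below is a minimal sketch of that hit test, assuming the attacker's position and nose direction are available as numpy vectors. It is illustrative only and not the environment's internal code: the function name and argument layout are hypothetical, the default values are taken from the constructor signature below, and whether lethal_angle_radians is a half-angle or the full cone width is an internal detail of the environment.

import numpy as np

def gun_can_hit(
    own_position: np.ndarray,     # (3,) attacker position
    own_forward: np.ndarray,      # (3,) unit vector along the attacker's nose
    enemy_position: np.ndarray,   # (3,) target position
    lethal_distance: float = 20.0,
    lethal_angle_radians: float = 0.07,
) -> bool:
    """Hypothetical hit test: within lethal range AND inside the cone of fire."""
    to_enemy = enemy_position - own_position
    distance = np.linalg.norm(to_enemy)
    if distance > lethal_distance:
        return False  # outside lethal range, the gun deals no damage
    # angular deviation between the nose direction and the line of sight
    cos_dev = np.dot(own_forward, to_enemy / distance)
    deviation = np.arccos(np.clip(cos_dev, -1.0, 1.0))
    # treat lethal_angle_radians as the maximum allowed deviation;
    # the environment's exact convention may differ
    return deviation <= lethal_angle_radians

Since each hit deals damage_per_hit (0.003 by default) per physics step and every agent starts with 1.0 health, destroying an opponent takes roughly 1.0 / 0.003 ≈ 334 physics steps of sustained, accurate fire.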

Environment Options#

class PyFlyt.pz_envs.fixedwing_envs.ma_fixedwing_dogfight_env.MAFixedwingDogfightEnv(team_size: int = 2, spawn_min_radius: float = 10.0, spawn_max_radius: float = 50.0, spawn_min_height: float = 20.0, spawn_max_height: float = 50.0, damage_per_hit: float = 0.003, lethal_distance: float = 20.0, lethal_angle_radians: float = 0.07, assisted_flight: bool = True, aggressiveness: float = 0.5, cooperativeness: float = 0.5, sparse_reward: bool = False, flatten_observation: bool = True, flight_dome_size: float = 800.0, max_duration_seconds: float = 60.0, agent_hz: int = 30, render_mode: None | str = None)#

Team Dogfighting Environment for the Acrowing model using the PettingZoo API.

Parameters:
  • team_size (int) – number of planes that comprise a team.

  • spawn_min_radius (float) – agents are spawned in a circle pointing outwards; this value is the minimum radius of that circle.

  • spawn_max_radius (float) – agents are spawned in a circle pointing outwards; this value is the maximum radius of that circle.

  • spawn_min_height (float) – minimum height to spawn the agents at the beginning of the simulation.

  • spawn_max_height (float) – maximum height to spawn the agents at the beginning of the simulation.

  • damage_per_hit (float) – damage dealt per hit per physics step; each agent starts with a health of 1.0.

  • lethal_distance (float) – how close before weapons become effective.

  • lethal_angle_radians (float) – the width of the cone of fire.

  • flatten_observation (bool) – whether to flatten the observation into a single flat array.

  • assisted_flight (bool) – whether to use high-level commands (RPYT: roll, pitch, yaw, thrust) instead of full actuator commands.

  • aggressiveness (float) – a value between 0 and 1 controlling how greedy the reward function is. Lower values lead to greedier policies.

  • cooperativeness (float) – a value between 0 and 1 controlling how strongly the reward function encourages agents on the same team to cooperate.

  • sparse_reward (bool) – whether to use sparse rewards or not.

  • flight_dome_size (float) – size of the allowable flying area.

  • max_duration_seconds (float) – maximum simulation time of the environment.

  • agent_hz (int) – loop rate (in Hz) of the agent-to-environment interaction.

  • render_mode (None | str) – can be “human” or None.
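
As a concrete example of these options, the snippet below configures a hypothetical 3v3 match; every keyword argument comes from the constructor signature above, but the specific values are arbitrary.

from PyFlyt.pz_envs import MAFixedwingDogfightEnvV2

env = MAFixedwingDogfightEnvV2(
    team_size=3,                 # 3 planes per team
    flight_dome_size=400.0,      # tighter allowable flying area
    max_duration_seconds=120.0,  # longer episodes
    damage_per_hit=0.01,         # ~100 hitting steps to destroy an opponent
    assisted_flight=True,        # high-level RPYT commands
    render_mode=None,            # headless, e.g. for training
)
observations, infos = env.reset()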