MAFixedwingDogfightEnv
Task Description
This is a reinforcement learning environment for training AI agents to perform aerial dogfighting.
Usage
from PyFlyt.pz_envs import MAFixedwingDogfightEnv

env = MAFixedwingDogfightEnv(render_mode="human")
observations, infos = env.reset()

while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}

    observations, rewards, terminations, truncations, infos = env.step(actions)

env.close()
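The loop above discards the rewards it receives. As a minimal sketch of how the same loop could be wrapped to accumulate per-agent returns, the helper below works on any environment that follows the PettingZoo parallel API used in the snippet. The names `run_episode` and `policy` are hypothetical and are not part of PyFlyt.

```python
from collections import defaultdict


def run_episode(env, policy=None):
    """Run one episode on a PettingZoo-style parallel env.

    `policy` is a hypothetical callable (agent, observation) -> action.
    When it is omitted, actions are sampled at random, as in the usage
    snippet. Returns a dict mapping each agent to its summed reward.
    """
    observations, infos = env.reset()
    returns = defaultdict(float)

    # env.agents shrinks as agents terminate or are truncated.
    while env.agents:
        actions = {}
        for agent in env.agents:
            if policy is None:
                actions[agent] = env.action_space(agent).sample()
            else:
                actions[agent] = policy(agent, observations[agent])

        observations, rewards, terminations, truncations, infos = env.step(actions)

        # Accumulate the reward each agent received on this step.
        for agent, reward in rewards.items():
            returns[agent] += reward

    return dict(returns)
```

With PyFlyt installed, this would be called as `run_episode(MAFixedwingDogfightEnv(render_mode=None))`, followed by `env.close()` on the environment once it is no longer needed.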
Environment Rules
This is a cannons-only environment: there are no missiles. To score a hit, an agent has to point its nose directly at the enemy.
The gun is only effective within lethal range. Outside of this range, the gun deals no damage.
The gun fires automatically whenever it can; there is no action for the agent to fire the weapon. This is similar to many fire control systems on modern aircraft.
An agent loses if it: a) Collides with anything b) Flies out of bounds c) Loses all its health
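The hit rule above can be pictured as a cone-of-fire test: the target must be within lethal range and close enough to the attacker's nose direction. The sketch below is one plausible reading of that rule, not the environment's actual implementation. The function `is_hit` and its arguments are hypothetical, and it assumes that `lethal_angle_radians` is compared directly against the off-nose angle, whereas the real environment may treat it as a full cone width instead.

```python
import math


def is_hit(attacker_pos, attacker_forward, target_pos,
           lethal_distance=15.0, lethal_angle_radians=0.1):
    """Return True if the target lies inside the assumed cone of fire."""
    # Vector from the attacker to the target, and its length.
    separation = [t - a for a, t in zip(attacker_pos, target_pos)]
    distance = math.sqrt(sum(c * c for c in separation))

    # Outside lethal range (or coincident positions): no damage.
    if distance == 0.0 or distance > lethal_distance:
        return False

    forward_norm = math.sqrt(sum(c * c for c in attacker_forward))
    if forward_norm == 0.0:
        return False

    # Angle between the nose direction and the line of sight to the target.
    cos_off_nose = sum(f * s for f, s in zip(attacker_forward, separation))
    cos_off_nose /= forward_norm * distance
    off_nose = math.acos(max(-1.0, min(1.0, cos_off_nose)))

    # Assumption: the threshold applies to the off-nose angle itself.
    return off_nose <= lethal_angle_radians
```

For example, with the attacker at the origin pointing along the x axis, a target at `(10, 0, 0)` counts as a hit, while targets at `(20, 0, 0)` (too far) and `(0, 10, 0)` (off the nose) do not.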
Environment Options
- class PyFlyt.pz_envs.fixedwing_envs.ma_fixedwing_dogfight_env.MAFixedwingDogfightEnv(spawn_height: float = 15.0, damage_per_hit: float = 0.02, lethal_distance: float = 15.0, lethal_angle_radians: float = 0.1, assisted_flight: bool = True, sparse_reward: bool = False, flight_dome_size: float = 150.0, max_duration_seconds: float = 60.0, agent_hz: int = 30, render_mode: None | str = None)
Base Dogfighting Environment for the Acrowing model using the PettingZoo API.
Args:
    spawn_height (float): how high to spawn the agents at the beginning of the simulation.
    damage_per_hit (float): how much damage per hit per physics step; each agent starts with a health of 1.0.
    lethal_distance (float): how close before weapons become effective.
    lethal_angle_radians (float): the width of the cone of fire.
    assisted_flight (bool): whether to use high level commands (RPYT) instead of full actuator commands.
    sparse_reward (bool): whether to use sparse rewards or not.
    flight_dome_size (float): size of the allowable flying area.
    max_duration_seconds (float): maximum simulation time of the environment.
    agent_hz (int): loop rate of the agent to environment interaction.
    render_mode (None | str): can be "human" or None.
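Two quantities follow directly from the default arguments and may help when tuning them. The sketch below is plain arithmetic on the documented defaults. The helper names `steps_to_destroy` and `max_agent_steps` are hypothetical. Note that damage is applied per physics step, and the physics rate is not stated here, so the first figure is not a count of agent steps.

```python
import math


def steps_to_destroy(damage_per_hit=0.02, starting_health=1.0):
    """Physics steps of continuous hits needed to reduce health to zero."""
    # Round before ceil to absorb floating-point noise in the division.
    return math.ceil(round(starting_health / damage_per_hit, 9))


def max_agent_steps(max_duration_seconds=60.0, agent_hz=30):
    """Upper bound on agent-environment interactions in one episode."""
    return int(round(max_duration_seconds * agent_hz))


print(steps_to_destroy())  # 50
print(max_agent_steps())   # 1800
```

Doubling `damage_per_hit` to 0.04 halves the first figure to 25, and lowering `agent_hz` to 15 halves the second to 900.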