CBL - Campus del Baix Llobregat

Defended project

Title: Aircraft-to-Aircraft separation based on Reinforcement Learning


Students who have defended this project:


Supervisor: BARRADO MUXÍ, CRISTINA

Department: DAC

Title: Aircraft-to-Aircraft separation based on Reinforcement Learning

Offer start date: 08-02-2022     Offer end date: 08-10-2022



Degree programmes to which the project is assigned:
    GR ENG SIST AEROESP
Type: Individual
 
Location: EETAC
 
Keywords:
Separation, reinforcement learning, air traffic
 
Content description and activity plan:
Air traffic controllers (ATC) are today the means of maintaining separation between aircraft in managed airspace. As a human-centred task, it has limited scalability. Capacity is the maximum number of aircraft that an ATC can safely manage.

Automation is seen as the future solution to increase capacity beyond the current limits. Machine learning algorithms are used extensively to provide automation in many areas. Reinforcement learning is a machine learning technique in which experiences are generated by learning agents that interact with a simulator.

In this project the student will design a reward function to be optimized by the policy algorithm of several identical agents, automated using reinforcement learning. The work will also include testing different policy algorithms, such as Proximal Policy Optimization, Actor-Attention-Critic or Deep Coordination Graphs, to check their convergence on an air traffic simulator.
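
As an illustration only, a reward function of the kind described above could penalise a loss of separation heavily and a deviation from the planned heading lightly. The following Python sketch is not the project's actual design: the 5 NM separation minimum, the two weights and the name `separation_reward` are all assumptions made for the example.

```python
import math

# All constants below are illustrative assumptions, not values from the project.
SEPARATION_MIN_NM = 5.0    # assumed horizontal separation minimum (nautical miles)
CONFLICT_PENALTY = -10.0   # assumed penalty when separation is lost
DEVIATION_WEIGHT = -0.1    # assumed weight on heading deviation


def separation_reward(own_xy, intruders_xy, heading_deviation_deg):
    """Return the per-step reward for one agent in a 2D airspace.

    own_xy: (x, y) position of the agent in NM.
    intruders_xy: iterable of (x, y) positions of the other aircraft.
    heading_deviation_deg: deviation from the planned heading, in degrees.
    """
    # A loss of separation occurs if any intruder is closer than the minimum.
    loss_of_separation = any(
        math.dist(own_xy, other) < SEPARATION_MIN_NM for other in intruders_xy
    )
    reward = CONFLICT_PENALTY if loss_of_separation else 0.0
    # Small cost for leaving the planned route, normalised to [0, 1].
    reward += DEVIATION_WEIGHT * abs(heading_deviation_deg) / 180.0
    return reward
```

Because all agents are identical, the same function would be evaluated for each aircraft at every simulator step; the relative size of the two terms determines how strongly agents trade route efficiency for safety.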

As input to the policy algorithm, in addition to the state of the simulator, agents may receive messages from other nearby agents, so that they act collaboratively to avoid loss of separation.
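
One possible way to combine an agent's own state with messages from nearby agents is to build a fixed-length observation vector. The Python sketch below is a minimal example under stated assumptions: the communication radius, the number of messages kept, the message size and the name `build_observation` are hypothetical and do not come from the project description.

```python
import math


def build_observation(own_state, neighbours, comm_radius_nm=20.0,
                      max_msgs=3, msg_size=2):
    """Concatenate the own state with messages from the closest neighbours.

    own_state: tuple whose first two entries are the (x, y) position in NM.
    neighbours: list of ((x, y), message) pairs, each message a sequence
                of msg_size floats.
    Returns a list of length len(own_state) + max_msgs * msg_size.
    """
    own_xy = own_state[:2]
    # Keep only neighbours inside the assumed communication radius.
    in_range = []
    for position, message in neighbours:
        distance = math.dist(own_xy, position)
        if distance <= comm_radius_nm:
            in_range.append((distance, message))
    # Closest neighbours first, so the most relevant messages are kept.
    in_range.sort(key=lambda item: item[0])
    kept = in_range[:max_msgs]

    observation = list(own_state)
    for _, message in kept:
        observation.extend(message)
    # Zero-pad so the policy network always sees the same input size.
    observation.extend([0.0] * ((max_msgs - len(kept)) * msg_size))
    return observation
```

Padding to a fixed size is a design choice made here for simplicity; attention-based methods such as Actor-Attention-Critic handle a variable number of neighbours differently.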

The comparison between algorithms will produce several indicators, such as the number of separation misses, the number of near-miss collisions, the total distance of the flights, etc.
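
Indicators of this kind can be computed directly from simulator output. The Python sketch below shows one plausible way to do so, assuming 2D positions in nautical miles; the helper names and the idea of using two distance thresholds (a larger one for separation, a smaller one for near misses) are assumptions, and no threshold values are taken from the project.

```python
import math
from itertools import combinations


def count_pairs_within(positions, threshold_nm):
    """Count aircraft pairs closer than threshold_nm at one time step."""
    return sum(
        1 for a, b in combinations(positions, 2)
        if math.dist(a, b) < threshold_nm
    )


def path_length(track):
    """Total distance flown along a track given as a list of (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(track, track[1:]))
```

Calling `count_pairs_within` with a separation threshold and again with a smaller near-miss threshold would give the two conflict counts, while summing `path_length` over all flights would give the total distance.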



 
Overview (abstract in English):

Air traffic has been increasing, and with it the workload of air traffic controllers. Despite the pandemic, the latest figures show a rapid recovery and forecast exponential growth. This points to the need to modernise air traffic control and its technology, work that organisations such as SESAR are already developing and implementing, for example by applying AI to air traffic control (DART). A support tool with automatic conflict avoidance would be a significant step towards addressing the possible overload of air traffic controllers.

This document describes two possible implementations of a conflict avoidance tool. The approach is to use Deep Reinforcement Learning to select actions that avoid conflicts and help air traffic controllers make faster and better decisions. Both approaches build on a simple 2D airspace simulator and apply the same policy to every aircraft.
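
Applying the same policy to every aircraft can be pictured as a single Q-function that is queried once per aircraft. The Python sketch below illustrates the idea with epsilon-greedy action selection; it is not the thesis implementation. The action names, the `select_actions` helper and the `q_function` callable are hypothetical, and in a DQN the callable would be a trained neural network.

```python
import random

# Hypothetical discrete action set for a 2D simulator.
ACTIONS = ("turn_left", "keep_heading", "turn_right")


def select_actions(q_function, observations, epsilon=0.0, rng=random):
    """Choose one action per aircraft using a single shared Q-function.

    q_function: callable mapping an observation to one Q-value per action.
    observations: dict mapping aircraft id to that aircraft's observation.
    epsilon: probability of picking a random action (exploration).
    """
    actions = {}
    for aircraft_id, observation in observations.items():
        if rng.random() < epsilon:
            # Explore: random action, used during training only.
            actions[aircraft_id] = rng.choice(ACTIONS)
        else:
            # Exploit: action with the highest estimated Q-value.
            q_values = q_function(observation)
            best = max(range(len(ACTIONS)), key=lambda i: q_values[i])
            actions[aircraft_id] = ACTIONS[best]
    return actions
```

With `epsilon=0.0` the selection is deterministic, so two aircraft with the same observation always receive the same action.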

The first proposal is a stand-alone DQN algorithm (DRL) that achieves a 7.06% improvement in the number of simultaneous conflicts compared with the original environment with no policy applied.

The second approach is a DQN algorithm that incorporates transfer learning of the rules of the air, referred to as DRLT. It performed worse than the original environment, with a 6% increase in unremembered conflicts.

Nevertheless, Deep Reinforcement Learning reduced decision time, and reusing the same strategy for all aircraft solved the unpredictability issue that some reinforcement learning solutions had. The proposal could be a good starting point for a self-separation tool for unmanned aircraft, but its results still need improvement. It is not suitable for air traffic controllers or piloted vehicles because of the increased workload it would entail.


© CBLTIC Campus del Baix Llobregat - UPC