CBL - Campus del Baix Llobregat

Defended project

Title: Applying and investigating state-of-the-art Reinforcement Learning methods to autonomous UAV tasks in a simulated environment with view on real world transfer


Students who have defended this project:


Supervisor: BARRADO MUXÍ, CRISTINA

Department: DAC

Title: Applying and investigating state-of-the-art Reinforcement Learning methods to autonomous UAV tasks in a simulated environment with view on real world transfer

Offer start date: 06-07-2017     Offer end date: 06-03-2018



Degree programme assigned to the project:
    MU AEROSPACE S&T 15
Type: Individual
 
Location: EETAC
 
Keywords:
reinforcement learning, deep learning, autonomous UAV, optimal control, neural networks, simulation
 
Description of content and activity plan:
Airport Collaborative Decision Making (A-CDM) is based on information
sharing. A better use of resources can be attained when the different
stakeholders in airport operations share more accurate and up-to-date
information. One of the main difficulties with this information-sharing
concept is the number of stakeholders involved and their differing
interests and behaviours: aircraft operators, ground handling companies,
the airport authority, air traffic control and the Central Flow
Management Unit. It is paramount to quantify the benefit of an airport
collaborative decision-making strategy in order to involve all these
different organisations. Simulations are required to analyse the overall
system and its emergent behaviour. In this project, an agent-based
simulator for A-CDM will be developed. The simulator will represent the
different stakeholders involved in A-CDM and the interactions between
them during the 16 milestones defined by EUROCONTROL in its A-CDM
Implementation Manual. This framework should allow independent, gradual
development of local behaviours and optimisation, and a gradual increase
in the complexity and fidelity of the simulations.
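
As a rough illustration of what such an agent-based A-CDM simulator could look like, the sketch below models each stakeholder as an agent that reacts to milestone events broadcast by a simple event loop. All class names, milestone labels and behaviours are hypothetical placeholders, not part of this project description.

    from dataclasses import dataclass, field

    @dataclass
    class Milestone:
        # Hypothetical milestone record; EUROCONTROL defines 16 A-CDM milestones.
        number: int
        name: str

    @dataclass
    class StakeholderAgent:
        # Hypothetical stakeholder agent (aircraft operator, ground handler, ...).
        name: str
        log: list = field(default_factory=list)

        def on_milestone(self, milestone):
            # A real agent would update its local plan and share information here.
            self.log.append(f"{self.name}: milestone {milestone.number} ({milestone.name})")

    def run_turnaround(agents, milestones):
        # Minimal event loop: broadcast each milestone to every stakeholder agent.
        for milestone in milestones:
            for agent in agents:
                agent.on_milestone(milestone)

    agents = [StakeholderAgent("Aircraft operator"), StakeholderAgent("Airport authority")]
    run_turnaround(agents, [Milestone(1, "Flight plan activation"), Milestone(2, "EOBT - 2h")])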
 
Overview (abstract in English):
Deep Reinforcement Learning (DRL) is attracting increasing interest due to its ability to learn how to solve complex tasks in an unknown environment solely by gathering experience. In this thesis, we investigate the use of DRL methods for the vision-based control of an autonomous quadcopter within a simulated environment. More specifically, we employ the Deep Q-Network algorithm and two extensions based on the concepts of Double Q-learning and the Dueling Architecture. To evaluate the algorithms, we create a challenging task that involves obstacle avoidance and reaching a goal position. Due to the lack of available tools that combine drone simulation with accessible DRL methods, we contribute AirGym, a framework that offers a convenient implementation of our task and of those of future researchers. The results of the study support the idea of full control of an autonomous drone through DRL methods, since we achieved an 80% success rate in solving the task at a near-human level of performance. This achievement is further strengthened by the relatively short training time and by the identification of further improvements.
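
For readers unfamiliar with the two extensions mentioned above, the sketch below illustrates the standard Double DQN target computation and the standard dueling aggregation in plain Python with NumPy. It is a minimal illustration of the published techniques, not code from the thesis or from AirGym; the toy stand-in networks and values are hypothetical.

    import numpy as np

    def double_dqn_target(reward, next_obs, done, q_online, q_target, gamma=0.99):
        # Double DQN: the online network selects the greedy next action and the
        # target network evaluates it, which reduces Q-value overestimation.
        next_action = int(np.argmax(q_online(next_obs)))
        next_value = q_target(next_obs)[next_action]
        return reward + (0.0 if done else gamma * next_value)

    def dueling_q_values(state_value, advantages):
        # Dueling architecture: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
        return state_value + advantages - np.mean(advantages)

    # Hypothetical usage with toy stand-ins for the two network heads.
    q_online = lambda obs: np.array([1.0, 2.5, 0.3])
    q_target = lambda obs: np.array([1.1, 2.0, 0.5])
    print(double_dqn_target(1.0, None, False, q_online, q_target))
    print(dueling_q_values(0.7, np.array([0.2, -0.1, 0.4])))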

