CBL - Campus del Baix Llobregat

Defended project

Title: Applying and investigating state-of-the-art Reinforcement Learning methods to autonomous UAV tasks in a simulated environment with view on real world transfer


Students who have defended this project:


Supervisor: BARRADO MUXÍ, CRISTINA

Department: DAC

Title: Applying and investigating state-of-the-art Reinforcement Learning methods to autonomous UAV tasks in a simulated environment with view on real world transfer

Offer start date: 06-07-2017     Offer end date: 06-03-2018



Degree programme assigned to the project:
    MU AEROSPACE S&T 15
Type: Individual
 
Location: EETAC
 
Keywords:
reinforcement learning, deep learning, autonomous UAV, optimal control, neural networks, simulation
 
Description of content and activity plan:
Airport Collaborative Decision Making (A-CDM) is based on information
sharing. A better use of resources can be attained when the different
stakeholders in airport operations share more accurate and up-to-date
information. One of the main difficulties with this information-sharing
concept is the number of stakeholders involved and their differing
interests and behaviours: aircraft operators, ground handling companies,
the airport authority, air traffic control and the Central Flow
Management Unit. It is paramount to quantify the benefit of an airport
collaborative decision-making strategy in order to involve all these
different organisations. Simulations are required to analyse the overall
system and its emergent behaviour. In this project, an agent-based
simulator for A-CDM will be developed. The simulator will represent the
different stakeholders involved in A-CDM and the interactions between
them during the 16 milestones defined by EUROCONTROL in its A-CDM
Implementation Manual. This framework should allow independent, gradual
development of local behaviours and optimisation, and a gradual increase
in the complexity and fidelity of the simulations.
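
As a rough illustration of what such an agent-based A-CDM simulator could look like, the sketch below models each stakeholder as an agent that reacts to milestone events broadcast by a simple event loop. All class names, milestone labels and behaviours are hypothetical placeholders, not part of this project description.

    from dataclasses import dataclass, field

    @dataclass
    class Milestone:
        # Hypothetical milestone record; EUROCONTROL defines 16 A-CDM milestones.
        number: int
        name: str

    @dataclass
    class StakeholderAgent:
        # Hypothetical stakeholder agent (aircraft operator, ground handler, ...).
        name: str
        log: list = field(default_factory=list)

        def on_milestone(self, milestone):
            # A real agent would update its local plan and share information here.
            self.log.append(f"{self.name}: milestone {milestone.number} ({milestone.name})")

    def run_turnaround(agents, milestones):
        # Minimal event loop: broadcast each milestone to every stakeholder agent.
        for milestone in milestones:
            for agent in agents:
                agent.on_milestone(milestone)

    agents = [StakeholderAgent("Aircraft operator"), StakeholderAgent("Airport authority")]
    run_turnaround(agents, [Milestone(1, "Flight plan activation"), Milestone(2, "EOBT - 2h")])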
 
Overview (abstract in English):
Deep Reinforcement Learning (DRL) is attracting increasing interest due to its ability to learn how to solve complex tasks in an unknown environment solely by gathering experience. In this thesis, we investigate the use of DRL methods for the vision-based control of an autonomous quadcopter within a simulated environment. More specifically, we employ the Deep Q-Network algorithm and two extensions based on the concepts of Double Q-learning and the Dueling Architecture. To evaluate the algorithms, we create a challenging task that involves obstacle avoidance and reaching a goal position. Due to the lack of available tools that combine drone simulation with accessible DRL methods, we contribute AirGym, a framework that offers a convenient implementation of our task and of those of future researchers. The results of the study support the idea of full control of an autonomous drone through DRL methods, since we achieved an 80% success rate in solving the task at a near-human level of performance. This achievement is further strengthened by the relatively short training time and by the identification of further improvements.
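
For readers unfamiliar with the two extensions mentioned above, the sketch below illustrates the standard Double DQN target computation and the standard dueling aggregation in plain Python with NumPy. It is a minimal illustration of the published techniques, not code from the thesis or from AirGym; the toy stand-in networks and values are hypothetical.

    import numpy as np

    def double_dqn_target(reward, next_obs, done, q_online, q_target, gamma=0.99):
        # Double DQN: the online network selects the greedy next action and the
        # target network evaluates it, which reduces Q-value overestimation.
        next_action = int(np.argmax(q_online(next_obs)))
        next_value = q_target(next_obs)[next_action]
        return reward + (0.0 if done else gamma * next_value)

    def dueling_q_values(state_value, advantages):
        # Dueling architecture: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
        return state_value + advantages - np.mean(advantages)

    # Hypothetical usage with toy stand-ins for the two network heads.
    q_online = lambda obs: np.array([1.0, 2.5, 0.3])
    q_target = lambda obs: np.array([1.1, 2.0, 0.5])
    print(double_dqn_target(1.0, None, False, q_online, q_target))
    print(dueling_q_values(0.7, np.array([0.2, -0.1, 0.4])))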

