CBL - Campus del Baix Llobregat

Defended project

Title: Applying and investigating state-of-the-art Reinforcement Learning methods to autonomous UAV tasks in a simulated environment with view on real world transfer


Students who have defended this project:


Supervisor: BARRADO MUXÍ, CRISTINA

Department: DAC

Offer start date: 06-07-2017     Offer end date: 06-03-2018



Degree programmes assigned to the project:
    Type: Individual
     
    Place of execution: EETAC
     
    Keywords:
    reinforcement learning, deep learning, autonomous UAV, optimal control, neural networks, simulation
     
    Description of content and activity plan:
    Airport Collaborative Decision Making (A-CDM) is based on information
    sharing. A better use of resources can be attained when the different
    stakeholders at airport operations share their more accurate and
    updated information. One of the main difficulties when dealing with
    this information sharing concept is the number of stakeholders
    involved and their different interests and behaviours: aircraft operators,
    ground handling companies, airport authority, air traffic control and
    the Central Flow Management Unit. It is paramount to quantify the
    benefit of an airport collaborative decision making strategy in order to
    involve all these different organisations. Simulations are required to
    analyse the overall system and its emerging behaviour. In this project,
    an agent-based simulator for A-CDM will be developed. The simulator
    will represent the different stakeholders involved in the A-CDM and
    the interactions between them during the 16 milestones defined by
    EUROCONTROL in its A-CDM implementation manual. This framework
    should allow independent gradual development of local behaviours and
    optimisation, and a gradual increase in the complexity and fidelity of
    the simulations.
     
    Overview (summary in English):
    Deep Reinforcement Learning (DRL) is attracting increasing interest due to its ability to learn how to solve complex tasks in an unknown environment solely by gathering experience. In this thesis, we investigate the use of DRL methods for the vision-based control of an autonomous quadcopter within a simulated environment. More specifically, we employ an algorithm called Deep Q-network and two extensions involving the concepts of Double Q-learning and Dueling Architecture. To evaluate the algorithms, we create a challenging task that concerns obstacle avoidance and reaching a goal position. Because no available tool combined drone simulation with accessible DRL methods, we contribute AirGym, a framework that offers a convenient implementation of our task and of those of future researchers. The results of the study support the idea of full control of an autonomous drone through DRL methods: we achieved an 80% success rate in solving the task, at near human-level performance. The result is all the more notable given the relatively short training time and the further improvements we identify.
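
    As a rough illustration of the three methods named in the overview, the sketch below contrasts the learning targets of a Deep Q-network and of Double Q-learning, and shows how a dueling architecture combines a state value with per-action advantages. It is not taken from the thesis or from AirGym: the function names, the discount factor and the Q-values are hypothetical, and plain Python lists stand in for neural-network outputs.

```python
# Illustrative sketch only: every name and number here is hypothetical,
# not the thesis implementation and not part of AirGym.

def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Deep Q-network target: the target network both selects and
    evaluates the best next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double Q-learning target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


if __name__ == "__main__":
    # Made-up Q-value estimates for three actions in the next state.
    next_q_online = [1.0, 3.0, 2.0]
    next_q_target = [4.0, 0.5, 1.0]
    print(dqn_target(1.0, next_q_target, gamma=0.5))
    print(double_dqn_target(1.0, next_q_online, next_q_target, gamma=0.5))
    print(dueling_q_values(2.0, [1.0, 0.0, -1.0]))
```

    With these made-up values the two targets differ because the action preferred by the online estimates is not the one the target estimates rate highest. Decoupling selection from evaluation in this way is the idea behind Double Q-learning.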


    © CBLTIC Campus del Baix Llobregat - UPC