CBL - Campus del Baix Llobregat

Defended project

Title: Voice Authentication using Python's Machine Learning and IBM Watson Speech to Text

Supervisor: BARRADO MUXÍ, CRISTINA

Department: DAC

Offer start date: 14-01-2020 Offer end date: 14-09-2020

Degree programmes the project is assigned to:

MU MASTEAM 2015

Type: Individual

Location: EETAC

Keywords:
Machine Learning, Voice Recognition

Description of content and activity plan:
Deep Reinforcement Learning (DRL) is a field of artificial intelligence in which an agent (e.g. a drone) learns to move in an environment by trial and error. In the research group we have built a DRL model for a drone flying in a simulated environment. The objective of this drone is to intercept a second drone. The trained drone succeeds in a high proportion of episodes, but some still fail. In these unsuccessful episodes the drone behaves erratically. Given the black-box nature of neural networks, it is very difficult to understand the causes of this behavior. In this master's thesis the student will study and apply some of the existing techniques for Explainable Artificial Intelligence (XAI) to extract the knowledge of the model, for instance by visualizing the state image with heat maps, or by using the tools available at www.explain-ai.org or on other AI platforms.

Overview (summary in English):
Voice authentication is seen as a promising solution that can improve on tedious traditional authentication systems, in which the individual must be physically present at the place of the transaction. This work covers four scenarios for performing voice authentication: the enrolment phase, the authentication phase, model improvement and accuracy check. The latter two were designed to allow the author to improve the system by modifying parameters during the testing phase. The system is built with Python, Librosa and IBM Watson Speech to Text to record, process and save audio samples from users and to generate a corresponding model from the voice samples obtained.
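
The enrolment, authentication and accuracy-check scenarios named above can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes each recording has already been reduced to a fixed-length feature vector (for example, averaged MFCCs computed with Librosa), it swaps in a simple averaged voiceprint compared by cosine similarity in place of whatever model the author trained, and the function names and the 0.95 threshold are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def enrol(samples):
    """Enrolment (assumed scheme): average the user's sample vectors
    element-wise to form a single voiceprint."""
    count = len(samples)
    return [sum(column) / count for column in zip(*samples)]


def authenticate(voiceprint, sample, threshold=0.95):
    """Authentication: accept when the new sample is close enough to the
    voiceprint. The threshold is a hypothetical tunable parameter."""
    return cosine_similarity(voiceprint, sample) >= threshold


def accuracy(voiceprint, labelled_samples, threshold=0.95):
    """Accuracy check: fraction of (vector, is_genuine) pairs for which the
    accept/reject decision matches the label."""
    correct = sum(
        authenticate(voiceprint, vector, threshold) == is_genuine
        for vector, is_genuine in labelled_samples
    )
    return correct / len(labelled_samples)


# Toy feature vectors standing in for real audio features.
voiceprint = enrol([[1.0, 2.0, 3.0], [1.2, 1.8, 3.1]])
genuine_ok = authenticate(voiceprint, [1.0, 2.0, 3.0])
impostor_ok = authenticate(voiceprint, [3.0, -1.0, 0.5])
score = accuracy(
    voiceprint,
    [([1.0, 2.0, 3.0], True), ([3.0, -1.0, 0.5], False)],
)
```

In a sketch of this kind, model improvement would amount to adjusting parameters such as the threshold or the number of enrolment samples and re-running the accuracy check.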