CBL - Campus del Baix Llobregat

Defended project

Title: Analysis and Evaluation of No-Reference Video Quality Assessment using Neural Networks

Students who have defended this project:
ARNOSO PÉREZ, ROBERTO (defense date: 14-07-2021)

Supervisor: TARRÉS RUIZ, FRANCESC

Department: TSC

Offer start date: 07-01-2020 Offer end date: 07-09-2020

Degree programmes the project is assigned to:

GR ENG SIS TELECOMUN

Type: Individual

Place of work: EETAC

Keywords:
Video analysis, Video Quality Assessment, Deep Learning, Python, Machine Learning

Description of the content and activity plan:
Measuring the subjective quality of video is a difficult task that requires preparing a large series of tests and defining examples, groups of people, control groups, etc. For this reason, several video analysis techniques have been developed to predict the 'subjective' quality of a video. These techniques are divided into two groups, reference and no-reference VQA algorithms, depending on whether the prediction algorithm requires the original video. When the original video is not available the problem becomes especially complex, but it is of great interest in many applications. In this thesis we will explore some strategies based on deep learning to obtain Video Quality Assessment without a reference signal. The main approach will be to develop a neural architecture that will be trained using well-known VQA methods such as VMAF, among others. Once trained, the system will have to predict the quality of new videos without a reference.

Overview (summary in English):
The main goal of this project is to analyze the performance of several neural networks in order to determine whether they are capable of classifying different types of images in terms of quality. Put simply, the project seeks to produce an algorithm that quantitatively assesses the quality of a digital video. The problem will be tackled with self-supervised convolutional neural networks (CNNs): the aim is to train a network whose main source of information is a set of video chunks, each with an associated quality score. We will obtain this score using a visual perception algorithm called VMAF (Video Multi-Method Assessment Fusion), developed and updated over several years by Netflix, one of the largest video-on-demand and streaming companies in the world. Most of the scripts for the data processing steps will be written in Python, with some specialized tools in other languages such as Bash or C#. The networks will be developed in the Keras environment, which lets us apply variations to models already available in the platform: we will use pre-trained networks and models and modify them to fit our particular data flow. In our case these will be the VGG16 and ResNet50 networks. They will be trained on an AWS instance because of their high computational cost for a conventional GPU. Finally, we will evaluate their performance by analyzing different metrics, such as accuracy and the loss during training (training loss). The final conclusions section will discuss possible improvements and summarize the overall analysis of the thesis.
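
As a rough illustration of the labelling and evaluation steps described above, the following minimal Python sketch maps per-chunk VMAF scores to quality classes and computes classification accuracy. The class boundaries, class names, chunk names, and function names are hypothetical and chosen only for this example; the thesis may use a different number of classes or a different scheme.

```python
from bisect import bisect_right

# Hypothetical class boundaries on the 0-100 VMAF scale. These values are an
# assumption made for illustration, not taken from the thesis.
BIN_EDGES = [40.0, 60.0, 80.0]
CLASS_NAMES = ["bad", "poor", "fair", "good"]


def vmaf_to_class(score):
    """Map a VMAF score in [0, 100] to a class index (0 = lowest quality)."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"VMAF score out of range: {score}")
    # bisect_right counts how many edges are <= score, which is the class index.
    return bisect_right(BIN_EDGES, score)


def accuracy(true_labels, predicted_labels):
    """Fraction of chunks whose predicted class equals the VMAF-derived class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


if __name__ == "__main__":
    # Hypothetical per-chunk VMAF scores (illustrative values only).
    chunk_scores = {"chunk_000": 92.3, "chunk_001": 55.0, "chunk_002": 12.7}
    targets = [vmaf_to_class(s) for s in chunk_scores.values()]
    # Stand-in for the classes a trained network might output.
    predictions = [3, 1, 2]
    for name, cls in zip(chunk_scores, targets):
        print(name, CLASS_NAMES[cls])
    print("accuracy:", accuracy(targets, predictions))
```

In a setup like this, the class indices would serve as training targets for the network, and the same accuracy function could be applied to its predictions on held-out chunks.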