CBL - Campus del Baix Llobregat

Defended project

Title: Analysis and Evaluation of No-Reference Video Quality Assessment using Neural Networks


Students who have defended this project:


Supervisor: TARRÉS RUIZ, FRANCESC

Department: TSC

Offer start date: 07-01-2020     Offer end date: 07-09-2020



Degree programmes the project is assigned to:
    GR ENG SIS TELECOMUN
Type: Individual
 
Location: EETAC
 
Keywords:
Video analysis, Video Quality Assessment, Deep Learning, Python, Machine Learning
 
Description of content and activity plan:
Measuring the subjective quality of video is a difficult task that requires preparing
a large series of tests, which involves defining examples, groups of people, control
groups, etc. Therefore, some video analysis techniques have been developed in order to
predict the 'subjective' quality of a video. These techniques are divided into two groups:
reference and no-reference VQA algorithms, depending on whether the prediction algorithm
requires the original video. When the original video is not available, the problem
becomes especially complex, but it is of great interest in many applications.
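
To make the distinction concrete, the sketch below implements PSNR, a classic full-reference metric. PSNR is not named in this description and is used here only as an illustration; the pixel values are hypothetical. The point is the function signature: a reference-based method needs both the original and the degraded frame, whereas a no-reference method would have to produce a score from the degraded frame alone.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Full-reference metric: scores a distorted frame against its original.

    Both frames are flat lists of pixel intensities of the same length.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error between corresponding pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no measurable distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel grayscale frames, flattened to lists.
original = [52, 55, 61, 59]
degraded = [54, 55, 60, 57]
score = psnr(original, degraded)  # about 44.6 dB
```

Without `original`, this function cannot be evaluated at all, which is the gap a no-reference predictor is meant to fill.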

In this thesis we will explore some strategies based on deep learning in order to
obtain Video Quality Assessment without a Reference Signal. The main approach will be
to develop a neural architecture that will be trained using well-known VQA methods such
as VMAF or other methods. Once trained, the system will have to predict the quality of
new videos without a reference.
 
Overview (English summary):
The main goal of our project is to analyze the performance of several networks in order to see whether they are capable of classifying different types of images in terms of quality.

Put simply, this project seeks to produce an algorithm that allows us to quantitatively assess the quality of a digital video. This problem will be tackled through the use of self-supervised convolutional neural networks (CNNs).

This means that our main objective is to train a network whose main source of information is a set of video chunks, each one associated with a quality score.

We will obtain this score using a visual perception algorithm called VMAF (Video Multi-Method Assessment Fusion), developed and updated for years by one of the largest video-on-demand and streaming services companies in the world: Netflix.
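
As a rough sketch of how such a score could be produced for one chunk, the snippet below builds an ffmpeg command that uses the `libvmaf` filter. Whether the thesis used ffmpeg for this step is an assumption, and the file names are hypothetical. With this filter, the first input is the degraded video and the second is the pristine source; `log_path` and `log_fmt` ask for the scores to be written to a JSON file.

```python
def build_vmaf_command(distorted, reference, log_path):
    """Return an ffmpeg argument list that scores `distorted` against `reference`."""
    vmaf_filter = f"libvmaf=log_path={log_path}:log_fmt=json"
    return [
        "ffmpeg",
        "-i", distorted,    # first input: the degraded chunk being scored
        "-i", reference,    # second input: the pristine source chunk
        "-lavfi", vmaf_filter,
        "-f", "null", "-",  # discard the decoded video, keep only the log
    ]


# Hypothetical file names for one chunk and its source.
cmd = build_vmaf_command(
    "chunk_0001_crf35.mp4", "chunk_0001_src.mp4", "chunk_0001.json"
)
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine whose ffmpeg build includes libvmaf. The reference is only needed at this labelling stage; the trained network would see the degraded chunk alone.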

We will mainly use Python, together with some specialized libraries in other languages such as Bash or C#, to develop the vast majority of the scripts for the data processing steps.

By developing these networks in the Keras environment, we will be able to apply different variations to the predefined models already available on the platform.

We will use pre-trained networks and models, which we will modify to adapt them to our particular data flow. In our case, these will be the VGG16 and ResNet50 networks.

These networks will be trained on an AWS instance, because training them is too computationally expensive for a conventional GPU.

Finally, we will evaluate their performance by analyzing different metrics, such as accuracy and the losses in the training process (training loss).
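
The two metrics mentioned above can be sketched in plain Python as follows. The five quality classes, the bin edges used to turn a VMAF score into a class, and the example probabilities are all assumptions made for illustration; the thesis may define its classes and loss differently.

```python
import math


def quality_class(vmaf_score, edges=(20, 40, 60, 80)):
    """Map a VMAF score in [0, 100] to a class index 0..4 (edges are assumed)."""
    return sum(vmaf_score >= edge for edge in edges)


def accuracy(true_classes, predicted_probs):
    """Fraction of samples whose most probable class matches the label."""
    hits = sum(
        probs.index(max(probs)) == label
        for label, probs in zip(true_classes, predicted_probs)
    )
    return hits / len(true_classes)


def cross_entropy(true_classes, predicted_probs, eps=1e-12):
    """Mean categorical cross-entropy, a common choice of training loss."""
    total = -sum(
        math.log(max(probs[label], eps))
        for label, probs in zip(true_classes, predicted_probs)
    )
    return total / len(true_classes)


# Hypothetical VMAF scores for three chunks, converted to class labels.
labels = [quality_class(s) for s in (93.5, 41.0, 12.2)]  # [4, 2, 0]
# Hypothetical network outputs (one probability per class).
probs = [
    [0.0, 0.0, 0.1, 0.2, 0.7],  # predicts class 4: correct
    [0.1, 0.2, 0.3, 0.4, 0.0],  # predicts class 3: wrong, label is 2
    [0.8, 0.1, 0.1, 0.0, 0.0],  # predicts class 0: correct
]
acc = accuracy(labels, probs)        # 2 of 3 correct
loss = cross_entropy(labels, probs)  # about 0.59
```

Accuracy only counts whether the top class is right, while the loss also reflects how much probability the network assigned to the correct class, so the two can move differently during training.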

In the final conclusions section, possible improvements will be discussed and the final analysis of the thesis as a whole will be summarized.


© CBLTIC Campus del Baix Llobregat - UPC