Student Projects

PROJECT PROPOSALS 2022-2023

If you are interested in taking a project in our group, please contact the person listed under the detailed description of the project you would like to choose.


Image compression for DNA-based storage

DNA can be used to store information in the same way that the genetic code of most living entities, including humans, is stored in their DNA. Such an approach offers several advantages, including a much higher storage density, long-term preservation capability and better energy efficiency. The underlying information in DNA is represented in a quaternary code (AGCT) instead of a binary code (01). This calls for completely new approaches to efficiently encode information in a DNA-compatible manner.
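
As a minimal illustration of the quaternary representation, the Python sketch below transcodes bytes into a nucleotide string and back; the fixed two-bit mapping is purely illustrative, since practical DNA coding schemes add biochemical constraints (e.g. homopolymer limits, GC balance) and error correction on top of it.

    # Illustrative 2-bit mapping between binary data and the quaternary DNA alphabet.
    # Real DNA coding schemes add constraints and error correction; this is a toy example.
    BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
    BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

    def bytes_to_dna(data: bytes) -> str:
        bits = "".join(f"{byte:08b}" for byte in data)
        return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

    def dna_to_bytes(strand: str) -> bytes:
        bits = "".join(BASE_TO_BITS[base] for base in strand)
        return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

    payload = b"JPEG"
    strand = bytes_to_dna(payload)          # "CAGGCCAACACCCACT", four bases per byte
    assert dna_to_bytes(strand) == payload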

The goal of this project is to study alternative approaches proposed in the state of the art to store information in DNA and to come up with an end-to-end image compression simulator by taking advantage of publicly accessible implementations.

The following tasks should be performed during the project:

  • Study the state of the art relevant to DNA storage and coding.
  • Identify existing source code for DNA storage and analyse it.
  • Design and implement a simulator of image compression for DNA storage based on state-of-the-art implementations.
  • Analyse the performance of the simulator.

Requirements: Basic knowledge of signal and image processing. Good programming skills.

Contact: Touradj Ebrahimi

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, Digital Humanities, Mechanical Engineering, Micro Engineering, Management of Technology and/or equivalent.

Number of students: One


Non-fungible tokens (NFTs) for management of image assets 

Non-fungible tokens (NFTs) have become a hot topic in digital asset ownership management, with a wide range of applications ranging from the trade of electronic art to micro-licensing of digital assets. One of the most popular types of digital assets in NFT-based applications is image assets. However, current practices in NFTs do not take the structure of digital images into account, which results in inefficient and even untrustworthy solutions.

The goal of this project is to study the state of the art in the creation and management of NFTs for image assets, to identify their weaknesses, and to propose solutions and implement them in a proof-of-concept fashion, showing their advantages.

The following tasks should be performed during the project:

  • Study the state of the art relevant to NFTs and current best practices.
  • Identify weaknesses of the current solutions.
  • Design and implement solutions to cope with some of the identified weaknesses.
  • Analyse the proposed solutions and compare them to the state of the art and existing practices.

Requirements: Good programming skills. 

Contact: Touradj Ebrahimi

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, Digital Humanities, Mechanical Engineering, Micro Engineering, Management of Technology and/or equivalent.

Number of students: One


ProCam – Privacy-friendly infrared camera

Many contagious diseases such as COVID-19 can induce high temperatures and fever in a significant number of affected individuals. Contact tracing of such individuals and analysis of their behavior and interactions with others and their environment can be a useful tool to contain the spread of contagious diseases. This is particularly useful for back-to-work strategies as well as for the protection of critical personnel or more vulnerable individuals. Existing solutions are either too expensive (e.g. high-end thermal cameras) or not precise enough (e.g. contact tracing using smartphones). Moreover, surveillance and the analysis of its content pose various ethical challenges, including invasion of privacy.

ProCam is an interdisciplinary project supported by EPFL that federates Bachelor, Master, and Doctoral students of EPFL. Students can take ProCam as part of their semester or final projects, receive credits, and will be supervised by an EPFL student or researcher.

The objective of ProCam is to design, build and test a new type of connected camera with multiple sensors that can track people while protecting their privacy and, at the same time, identify those with high body temperature. The device takes the form of a kit made of off-the-shelf components and open software that allows anybody to build it at low cost, install it easily, and start using it. A dedicated server records all captured footage in a secure and anonymized way, with the possibility of further analysis, visualization, and eventual de-anonymization. An early prototype of the camera system with an enclosure to protect its components has been created.

The following tasks should be performed during the project:

  • Study the existing ProCam prototype as well as the state of the art relevant to the project.
  • With the help of other students, update the prototype, implement it, and test it.
  • Analyze the characteristics of the camera and its strengths. 
  • Propose improvements for next-generation cameras.

Requirements: Because of the interdisciplinary nature of the project any background is welcome.

Contact: Touradj Ebrahimi

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, Digital Humanities, Mechanical Engineering, Micro Engineering, Management of Technology and/or equivalent.

Number of students: Group of students in collaboration with senior researchers


Image de-noising in the latent representation of JPEG AI

 

Deep-learning based image compression algorithms are becoming popular, showing excellent performance in terms of compression efficiency and perceived visual quality. The most popular approach in deep-learning based image compression is through autoencoders, which are neural networks capable of mapping an input image in the pixel domain to a compact representation in a latent space. Subsequently, another network reconstructs the original image in the pixel domain from its latent representation as accurately as possible. In this context, JPEG is currently going through the collaborative process for the development of a learning-based compression engine named JPEG AI (https://jpeg.org/jpegai/documentation.html).

Traditionally, image processing algorithms are applied to the reconstructed images in the pixel domain. Recently, researchers have been attempting to apply these algorithms in the latent space, i.e. before decoding, reducing the computational cost while achieving the same or even better accuracy. In particular, this project aims at analyzing the problem of image de-noising in the latent space of the JPEG AI codec.
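
As a minimal sketch of this idea, the toy PyTorch pipeline below encodes a noisy image into a latent tensor, applies a small residual denoiser directly in the latent space, and then decodes; the architecture, channel counts and denoiser are assumptions for illustration only and do not reflect the actual JPEG AI verification model.

    import torch
    import torch.nn as nn

    # Toy stand-ins for a learned codec (not the JPEG AI architecture).
    encoder = nn.Sequential(
        nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
        nn.Conv2d(64, 128, 5, stride=2, padding=2))
    decoder = nn.Sequential(
        nn.ConvTranspose2d(128, 64, 5, stride=2, padding=2, output_padding=1), nn.ReLU(),
        nn.ConvTranspose2d(64, 3, 5, stride=2, padding=2, output_padding=1))
    latent_denoiser = nn.Sequential(
        nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
        nn.Conv2d(128, 128, 3, padding=1))

    noisy = torch.rand(1, 3, 256, 256)      # noisy input image in the pixel domain
    y = encoder(noisy)                      # latent tensor (4x spatially downsampled here)
    y_denoised = y + latent_denoiser(y)     # residual denoising directly in the latent space
    restored = decoder(y_denoised)          # reconstruction without a pixel-domain denoiser
    print(y.shape, restored.shape)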

The objective of the project is to first familiarize the student with the JPEG AI codec and with the most popular deep-learning based methods for image denoising. Then the student will explore the problem of image denoising in the latent space of JPEG AI, implementing a suitable solution.

The following tasks will be performed during the project:

  • Review the JPEG AI activity and the current verification model
  • Study the state of the art of deep-learning based image denoising followed by strategies for integrating denoising into learning-based image compression
  • Collect a suitable dataset and apply synthetic but realistic noise 
  • Propose and implement a solution by merging the JPEG AI decoder with the denoising network, and training the model
  • Evaluate the above and report the results
  • Document all the development process and source code

Requirements: Background in image processing and deep learning. Good skills in Python programming.

Contact: Michela Testolina

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.


Objective quality assessment of compressed images in the high to nearly visually lossless quality range

Advances in digital cameras, broadband internet, and display technologies have made high-quality imaging feasible, desirable, and more accessible than ever before, opening up new possibilities for creative expression and scientific discovery. Image compression is essential to limit storage requirements, which have been constantly increasing over the years. Depending on the coding algorithm and desired compression ratio, image compression might introduce visible and undesirable artifacts, reducing the perceived visual quality. Recent image compression methods attempt to reach a high compression ratio without significantly compromising the visual quality of the reconstructed images.

To support research towards such approaches, distinct subjective test methodologies have been proposed and standardized. However, these methods are known to be expensive and time-consuming. To address this problem, an objective image quality metric able to accurately predict subjective opinion should be designed for the specific use case of image compression.

The objective of this project is to first familiarize the student with the state of the art in learning-based objective image quality assessment, identifying the most suitable method for this project. Then the student will collect a dataset of images and related subjective visual quality scores suitable for training a neural network. Finally, the student is expected to perform re-training of the chosen learning-based objective quality assessment metric, and evaluate its performance in comparison to the state-of-the-art methods.
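
For the final evaluation step, the usual figures of merit are the Pearson and Spearman correlations (and RMSE) between the metric's predictions and the mean opinion scores; a small sketch with placeholder scores is given below.

    import numpy as np
    from scipy import stats

    # Placeholder arrays: in the project these would be the retrained metric's predictions
    # and the mean opinion scores (MOS) collected in subjective experiments.
    predicted = np.array([4.1, 3.6, 2.8, 4.7, 1.9, 3.2])
    mos = np.array([4.3, 3.4, 2.5, 4.8, 2.1, 3.0])

    plcc, _ = stats.pearsonr(predicted, mos)     # prediction accuracy
    srocc, _ = stats.spearmanr(predicted, mos)   # prediction monotonicity
    rmse = np.sqrt(np.mean((predicted - mos) ** 2))
    print(f"PLCC={plcc:.3f}  SROCC={srocc:.3f}  RMSE={rmse:.3f}")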

The following tasks will be performed during the project:

  • Study the state of the art on objective image quality assessment. After a detailed analysis, select a learning-based solution suitable for this project
  • Collect a suitable dataset of images 
  • Retrain the identified learning-based objective quality metrics on the collected dataset
  • Evaluate the performance of the proposed solution
  • Document all the development process and source code

Requirements: Background on deep-learning image compression and image quality assessment. Good skills in Python programming.

Contact: Michela Testolina

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One


Web platform for subjective image quality assessment in the high to nearly visually lossless quality range

Advances in digital cameras, broadband internet, and display technologies have made high-quality imaging feasible, desirable, and more accessible than ever before, opening up new possibilities for creative expression and scientific discovery. Image compression is essential to limit storage requirements, which have been constantly increasing over the years. Depending on the coding algorithm and desired compression ratio, image compression might introduce visible and undesirable artifacts, reducing the perceived visual quality. Recent image compression methods attempt to reach a high compression ratio without significantly compromising the visual quality of the reconstructed images.

To support research towards such approaches, specific subjective tests have been proposed and standardized. However, recent experiments showed that these standardized methods are not robust and suitable for the high to nearly visually lossless quality range. In this context, the JPEG AIC-3 activity is working towards standardizing subjective quality assessment methodologies that are reliable in this quality range. A stable and scalable web-based platform should be designed to support research in this field.

The goal of this project is to develop a web platform suitable for conducting subjective image quality assessment experiments in a crowdsourcing environment. The student will first review the JPEG AIC-3 activity and the web platforms for subjective image quality assessment that are available online. Then the student will implement a new web platform following the JPEG AIC-3 guidelines. The platform will finally be validated with a crowdsourcing experiment conducted by the student.
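
As a minimal illustration of the server-side piece such a platform needs, the Flask sketch below exposes one endpoint that stores a rating per subject and stimulus; the route, database schema and field names are assumptions, and subject screening, stimulus randomization and the presentation protocol must follow the JPEG AIC-3 guidelines.

    import sqlite3
    from flask import Flask, jsonify, request

    # Minimal rating-collection endpoint for a crowdsourced subjective test (illustrative only).
    app = Flask(__name__)

    def init_db():
        with sqlite3.connect("ratings.db") as db:
            db.execute("CREATE TABLE IF NOT EXISTS ratings "
                       "(subject TEXT, stimulus TEXT, score INTEGER, rt_ms INTEGER)")

    @app.post("/api/rating")
    def store_rating():
        r = request.get_json()
        with sqlite3.connect("ratings.db") as db:
            db.execute("INSERT INTO ratings VALUES (?, ?, ?, ?)",
                       (r["subject_id"], r["stimulus_id"], r["score"], r["reaction_time_ms"]))
        return jsonify(status="ok")

    if __name__ == "__main__":
        init_db()
        app.run()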

The following tasks will be performed during the project:

  • Review the JPEG AIC-3 activity
  • Review the web platforms for subjective quality assessment available online
  • Design and implement a new web platform following the JPEG AIC-3 guidelines
  • Validate the proposed web platform with a crowdsourcing experiment
  • Document all the development process and source code

Requirements: Good skills in programming. Experience with web development.

Contact: Michela Testolina

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.


Light-field compression using conventional and learning-based methods

In recent years, new emerging imaging modalities, e.g. light fields, have become popular. From the physics point of view, light fields measure the light coming from every direction at every point in space, and are often captured through multi-camera arrays or with plenoptic cameras such as Lytro and Raytrix.

As light field imaging has gained popularity only in recent years, the state of the art in its compression is not as mature as that of conventional images and videos. While a number of conventional codecs have been proposed and implemented, research on learning-based light field compression is still at its initial stage.

The objective of the project is to first familiarize the student with the most recent methods for light field compression, both conventional and learning-based. Then the student will compare different methods, exploring the advantages and disadvantages of each through an objective and subjective analysis.
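
For the objective part of the comparison, a common starting point is the per-view PSNR averaged over all views of a light field at each rate point; the sketch below uses random arrays as placeholders for original and decoded content.

    import numpy as np

    def psnr(ref: np.ndarray, dist: np.ndarray, peak: float = 255.0) -> float:
        mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

    # Placeholder 5x5 grid of 128x128 views standing in for a real decoded light field.
    views_ref = np.random.randint(0, 256, size=(5, 5, 128, 128, 3)).astype(np.float64)
    views_dec = np.clip(views_ref + np.random.normal(0, 3, views_ref.shape), 0, 255)

    per_view = [psnr(r, d) for r, d in zip(views_ref.reshape(-1, 128, 128, 3),
                                           views_dec.reshape(-1, 128, 128, 3))]
    print(f"mean PSNR over {len(per_view)} views: {np.mean(per_view):.2f} dB")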

The following tasks should be performed during the project:

  • Study the state of the art of conventional and deep-learning based light field compression, followed by identification of the best methods to use during the project.
  • Collect a suitable dataset of light field images.
  • Review the available implementations of the compression methods studied above. Run the code and generate compressed light fields at different bitrates.
  • Evaluate the different methods objectively through rate-distortion plots, and subjectively.
  • Report the results of the above and present the conclusions.

Requirements: Background on image and video processing or computer vision. Good programming skills (Matlab or Python).

Contact: Michela Testolina

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.


Deep learning for deepfake detection

Due to the increasing spread of doctored or synthetic contents on the Internet and their impact on the dissemination of fake news over social networks, detecting manipulated content has become a major challenge in both academic and professional communities. Major companies have joined forces to organize challenges with the goal of helping in the process of creating widely accessible tools and solutions to detect malicious modifications of multimedia contents.

One of the most important and recent actions was the Deepfake Detection Challenge organized by Facebook and Microsoft, with the involvement of many academic research groups. The organizers hoped that this challenge would result in new technologies for detecting AI-generated videos which can later be used on social networking platforms and/or by journalists. This illustrates the major concerns of large companies about the danger of AI-assisted content manipulations. 

In this project, we tackle the deepfake detection problem by training several convolutional neural networks (CNNs) in a supervised fashion. Finally, the ensembling of different trained CNNs will be studied.
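
As a minimal sketch of this setup, two standard backbones are turned into binary real/fake classifiers and their scores averaged at inference time; the choice of backbones and the simple score averaging are illustrative assumptions, not a prescribed solution.

    import torch
    import torch.nn as nn
    from torchvision import models

    def make_detector(backbone: str) -> nn.Module:
        # Replace the classification head with a single logit (probability of "fake").
        if backbone == "resnet50":
            m = models.resnet50(weights=None)
            m.fc = nn.Linear(m.fc.in_features, 1)
        else:
            m = models.efficientnet_b0(weights=None)
            m.classifier[1] = nn.Linear(m.classifier[1].in_features, 1)
        return m

    # Each model would be trained with BCEWithLogitsLoss on a labeled deepfake dataset.
    ensemble = [make_detector("resnet50"), make_detector("efficientnet_b0")]
    faces = torch.rand(4, 3, 224, 224)               # batch of cropped face images
    for m in ensemble:
        m.eval()
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(m(faces)) for m in ensemble]).mean(dim=0)
    print(probs.squeeze(1))                          # averaged "fake" probability per face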

In particular, two main objectives will be pursued in this project. The first aims at finding existing and publicly available deepfake datasets. The second aims at training deep neural networks using the above datasets for the task of deepfake detection. The following tasks should be performed by the student:

  • Review state-of-the-art deepfake detection methods
  • Study state-of-the-art deepfake creation methods and find/generate the corresponding datasets, which can further be used for training CNNs.
  • Run/Adapt/Create a program to detect deepfake images and videos
  • Investigate the most common performance metrics
  • Assess the performance of the trained models against several datasets
  • Document the code and write a report on the project

Requirements: Good skills in Python programming. Background in deep learning and image processing.

Contact: Yuhang Lu

Group: Prof. Touradj Ebrahimi

Suitable for: Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One


Exploring Robust Deepfake Detection Methods

In recent years, face manipulation techniques, in particular Deepfake methods, have raised great public concern. Deep learning-based tools and open-source software have simplified the creation of such manipulated content. It is therefore crucial to develop Deepfake detection systems that can automatically and effectively identify manipulated videos and images. Although current detectors achieve rather high accuracy on well-known datasets, their performance often degrades drastically when they are tested in non-trivial situations with real-world perturbations. A more robust Deepfake detector is therefore desired.

The objective of this project is to first familiarize the student with the state-of-the-art deep learning-based Deepfake detection methods and databases. Then the student will explore how to improve network robustness from the following aspects: training strategy, data augmentation, and neural network architecture. The student is also encouraged to explore other influencing factors from different angles. At the end, the student will evaluate an improved detection method with a more realistic Deepfake detection assessment framework.
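
One concrete lever on robustness is the training-time augmentation pipeline; the sketch below injects blur, colour shifts and additive noise so that the detector is exposed to realistic perturbations during training, with all transforms and parameters being assumptions to be tuned in the project.

    import numpy as np
    import torch
    from PIL import Image
    from torchvision import transforms

    # Augmentation pipeline simulating real-world perturbations (illustrative parameters).
    robust_augment = transforms.Compose([
        transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),
        transforms.ColorJitter(brightness=0.3, contrast=0.3, saturation=0.3),
        transforms.GaussianBlur(kernel_size=5, sigma=(0.1, 2.0)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        # Additive sensor-like noise applied on the tensor representation.
        transforms.Lambda(lambda x: torch.clamp(x + 0.02 * torch.randn_like(x), 0.0, 1.0)),
    ])

    face = Image.fromarray(np.random.randint(0, 255, (256, 256, 3), dtype=np.uint8))
    augmented = robust_augment(face)                 # tensor of shape [3, 224, 224]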

In general, the following tasks shall be performed by the student:

  • Investigate the state-of-the-art deep learning-based Deepfake detection methods
  • Review popular Deepfake detection databases
  • Explore better architectures or data augmentation techniques to improve the robustness of Deepfake detector
  • Assess the performance of a proposed and trained detector with the improved Deepfake detection assessment framework
  • Document the code and results and write a report on the project

Requirements: Background in image processing and deep learning. Good skills in Python programming.

Contact: Yuhang Lu

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project, or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, and equivalent.

Number of students: One


Identity-Preserving Low-Resolution Face Recognition

Face recognition (FR) has become a key technology in multiple applications. In recent years we have witnessed the great progress of convolutional neural networks (CNNs) in face recognition. Although current deep learning-based face recognition algorithms have achieved very promising performance on public datasets, their performance is heavily degraded when methods are tested with low-resolution face images. This problem is particularly critical in surveillance applications. One of the most straightforward solutions is to properly interpolate the low-resolution faces with super-resolution methods and then perform face recognition. However, such up-sampled data often lacks sufficient identity information for deep models.

The objective of this project is to investigate an identity-preserving low-resolution face recognition system. The student will first investigate the current state-of-the-art in this area as well as the current most popular face recognition pipelines and databases. Moreover, the student is expected to come up with an end-to-end solution for this task.
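
A minimal sketch of the baseline this project starts from is given below: a low-resolution probe is upsampled and matched against gallery embeddings by cosine similarity. The embed network is a toy stand-in for any pretrained face recognition model, and an identity-preserving solution would instead learn the upsampling and the embedding jointly.

    import torch
    import torch.nn.functional as F

    def recognize(probe_lr: torch.Tensor, gallery_emb: torch.Tensor, embed) -> torch.Tensor:
        # Naive baseline: bicubic upsampling followed by embedding and cosine matching.
        probe_hr = F.interpolate(probe_lr, size=(112, 112), mode="bicubic", align_corners=False)
        probe_emb = F.normalize(embed(probe_hr), dim=1)
        return probe_emb @ F.normalize(gallery_emb, dim=1).T   # cosine similarity matrix

    # Toy stand-in for a pretrained face embedding network.
    embed = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 112 * 112, 512))
    scores = recognize(torch.rand(1, 3, 16, 16), torch.rand(100, 512), embed)
    print(scores.argmax(dim=1))                      # index of the best-matching gallery identity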

In general, the following tasks shall be performed by the student:

  • Study the state-of-the-art deep learning-based face recognition algorithms and super-resolution algorithms
  • Investigate or create suitable face datasets and design evaluation protocols for this specific task
  • Establish a generic face recognition pipeline as a baseline
  • Explore an end-to-end low-resolution face recognition solution
  • Document the code and results and write a report on the project

Requirements: Background in image processing and deep learning. Good skills in Python programming.

Contact: Yuhang Lu

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Thesis, Master Semester Project, or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, and equivalent.

Number of students: One


Measuring the Influencing Factors for AI-based Face Recognition

Face recognition (FR) has become a key technology in multiple applications. In recent years we have witnessed the great progress of convolutional neural networks (CNNs) in face recognition. Although a black-box approach based on deep learning can boost the performance, it is hard to understand the decision and, more importantly, how to improve weaknesses. 

The objective of this project is to identify key factors that influence a face recognition system and to provide a reasonable description of the underlying mechanisms. The student is expected to investigate possible influencing factors, including but not limited to data quality-related factors, human-related factors, and deep model-related factors. The student will study at least one type of influencing factor and create evaluation metrics and protocols that quantitatively demonstrate the impact of that factor. Afterwards, the student will provide insights on how to understand and explain the decisions made by the system.

In general, the following tasks shall be performed by the student:

  • Study the state-of-the-art deep learning-based face recognition methods and popular databases
  • Design and implement a face recognition pipeline
  • Investigate a number of selected influencing factors and design evaluation protocols to measure the impact on the selected face recognition pipeline
  • Investigate the rationale behind decisions made by the FR system based on experiments
  • Document the code and results and write a report on the project

Requirements: Background in image processing and deep learning. Good skills in Python programming.

Contact: Yuhang Lu

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project, or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, and equivalent.

Number of students: More than One


Exploring Robust AI-based Face Recognition Methods


Face recognition (FR) has become a prominent biometric technology in our society, frequently used in multiple areas such as access control and video surveillance. Although current deep learning-based face recognition algorithms have achieved very promising performance on public datasets, modern AI-based face recognition systems are vulnerable to low-resolution, poorly posed, and noise-corrupted face images. In fact, a large number of outdated and cheap cameras are widely deployed in the real world, capturing low-quality images for face recognition systems. On the other hand, humans in the wild are often captured in unconstrained conditions with varying head poses. Therefore, a more robust deep face recognition system is desired for practical usage.

The objective of this project is to investigate more robust AI-based face recognition systems for realistic scenarios. The student will first review the current state-of-the-art AI-based face recognition methods. Then, a more specific realistic influencing factor, such as head pose variation, will be analyzed in detail. The student is expected to come up with an improved face recognition system that is more robust toward the selected influencing factor.

The following tasks should be performed during the project:

  • Study the state-of-the-art deep learning-based face recognition algorithms.
  • Establish a generic face recognition pipeline as a baseline.
  • Study one specific realistic influencing factor and improve the current face recognition method against the selected factor.
  • Investigate suitable face datasets and set up evaluation protocols.
  • Document the code and results and write a report on the project.

Requirements: Background in image processing and deep learning. Good skills in Python programming. Experience in face recognition is preferred.

Contact: Yuhang Lu

Group: Prof. Touradj Ebrahimi

Suitable for: Master Semester Project, or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, and equivalent.

Number of students: One.


Identity-preserving Learning-based Compression Methods for Face Recognition System


In a typical face recognition system, a camera captures the face image and sends it to the recognition module, where the image is compared with the faces saved in the storage system. Image compression is deeply involved in the entire pipeline and often has a negative impact on recognition accuracy. In fact, compression is widely applied in image and video processing, streaming, and storage systems to ease distribution and storage. In general, a higher compression factor reduces the burden of storage and transmission, but it also introduces more artifacts into the image and degrades recognition accuracy. Therefore, it is necessary to explore a new coding framework that optimizes both tasks, with special attention to the identity information of human faces. In this project, the possibility of combining a learning-based compression pipeline with a face recognition module is explored.

The objective of this project is to investigate a machine-oriented learning-based compression technique, which is designed for a deep face recognition system. The student will first review the state-of-the-art of learning-based compression methods and deep face recognition methods. Then, the student will implement an end-to-end workflow that combines both techniques. Moreover, the student is encouraged to propose a better identity-preserving solution.
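
One way to formalize the identity-preserving objective is a joint loss combining rate, pixel distortion and an identity term, as sketched below; the bit-rate estimate, the face embedding network and the loss weights are assumptions to be replaced by the actual codec's entropy model and a pretrained recognition network.

    import torch
    import torch.nn.functional as F

    def joint_loss(x, x_hat, rate_bits, face_embed, lambda_d=1.0, lambda_id=0.1):
        # Rate + distortion + identity preservation (weights are placeholders).
        distortion = F.mse_loss(x_hat, x)
        id_loss = 1.0 - F.cosine_similarity(face_embed(x_hat), face_embed(x), dim=1).mean()
        return rate_bits.mean() + lambda_d * distortion + lambda_id * id_loss

    # Toy usage with stand-in tensors and embedding network.
    face_embed = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 112 * 112, 512))
    x = torch.rand(2, 3, 112, 112)                   # original faces
    x_hat = x + 0.01 * torch.randn_like(x)           # decoded faces from the codec
    rate_bits = torch.tensor([12000.0, 11500.0])     # placeholder bit estimates from an entropy model
    print(joint_loss(x, x_hat, rate_bits, face_embed))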

The following tasks should be performed during the project:

  • Study the state-of-the-art deep learning-based face recognition algorithms and image compression algorithms.
  • Establish an end-to-end workflow that combines image compression and face recognition modules.
  • Propose a technique that preserves more identity information during compression.
  • Design an assessment approach to evaluate the performance of both the compression module and face recognition module.
  • Document the code and results and write a report on the project.

Requirements: Background in image processing and deep learning. Good skills in Python programming. Experience in face recognition is preferred.

Contact: Yuhang Lu

Group: Prof. Touradj Ebrahimi

Suitable for: Master Semester Project, or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, and equivalent.

Number of students: One.


Point cloud compression using deep neural networks


Point cloud imaging has recently emerged as a viable solution for immersive 3D content representation in augmented, mixed and virtual reality applications. However, the vast amount of data needed to faithfully reproduce real-world scenery with this type of imaging makes the demand for efficient compression solutions inevitable.

Visual data compression typically comes at the expense of distortions and the presence of artifacts that affect the visual quality of the compressed models. Thus, it is of crucial importance not only to reduce the amount of data needed to represent a model, but also to maintain the highest possible visual quality. Many point cloud compression schemes are currently based either on efficient geometrical data structures or on projection-based solutions. Recently, deep convolutional neural networks have been proposed for point cloud compression, showing remarkable performance gains with respect to alternative solutions. Such technologies have attracted the attention of standardization groups such as JPEG, which recently launched a call for proposals for a learning-based point cloud compression standard and will soon enter its collaborative phase.

In this context, the goal of this project is to enhance the performance of current learning-based models for point cloud compression. The modules implemented by the student should aim to bring improvements to the compression technologies being analyzed by JPEG in its collaborative phase.

The following tasks should be performed during the project:

  • Study the state-of-the-art in deep neural network for point cloud representations.
  • Design and implement a deep neural network coding module for point cloud compression or improvements to current modules.
  • Train the network and obtain the network parameters.
  • Quality assessment of the performance of the network.
  • Document all the development process and source code.

Requirements: Good background on image processing and machine learning. Good skills in programming.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.


Interactive subjective evaluation of point clouds in a 3D monitor


The use of point clouds as an imaging modality for 3D content is continuously increasing, stimulated by important use cases such as Virtual and Augmented Reality. In order to study human perception of this kind of content, subjective experiments can be conducted, where subjects rate contents presented to them in sequence. In many subjective experiments, standard videos are generated from these 3D models by moving a virtual camera around the object, not taking advantage of the increased level of immersion that point clouds can provide. Recent studies have explored more immersive subjective protocols through web-based interactive interfaces [1] and 3D light field monitors with spatial rendering [2].

The goal of this project is to extend the framework in Unity [1] developed for a 3D light field monitor to allow for an interactive subjective experience with enhanced immersion. The student will conduct the development on a standard 2D monitor, and once the implementation is finalized, it will be tested in the 3D light field monitor. Finally, a subjective experiment will be conducted to validate the platform.

The following tasks should be performed during the project:

  • Study the state of the art on point cloud subjective quality assessment.
  • Extend the current subjective framework on Unity to allow for user interaction.
  • Test and validate the platform both in a standard monitor and in the light field display. 
  • Select a point cloud dataset and apply realistic types of distortion.
  • Conduct a subjective experiment with the developed platform and the distorted dataset.
  • Document all the development process and source code.

Requirements: Background on image processing. Good skills in programming. Experience with C++ and 3D development platforms such as Unity will also be helpful.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.

[1] Alexiou, Evangelos, et al. “A comprehensive study of the rate-distortion performance in MPEG point cloud compression.” APSIPA Transactions on Signal and Information Processing 8 (2019).

[2] Lazzarotto, Davi, et al. “On the impact of spatial rendering on point cloud subjective visual quality assessment.” 14th International Conference on Quality of Multimedia Experience (QoMEX) (2022).


Web platform for crowdsourcing interactive subjective evaluation of point clouds


The use of point clouds as an imaging modality for 3D content is continuously increasing, stimulated by important use cases such as Virtual and Augmented Reality. In order to study human perception of this kind of content, subjective experiments can be conducted, where subjects rate contents presented to them in sequence. In many subjective experiments, standard videos are generated from these 3D models by moving a virtual camera around the object, not taking advantage of the increased level of immersion that point clouds can provide. Recent studies have allowed for more immersive subjective experiments through web-based interactive interfaces [1] and 3D light field monitors with spatial rendering [2].

The goal of this project is to extend the web framework used for an interactive experiment in a lab environment and make it work in a crowdsourcing setting. Contrary to [1], where the experiment was run locally, the platform developed by the student will have a server running the experiment and multiple subjects accessing it at the same time from their personal devices. The platform will then be validated with a crowdsourcing experiment conducted by the student.

The following tasks should be performed during the project:

  • Study the state of the art on point cloud subjective quality assessment.
  • Extend the current web-based subjective framework to work from a server rather than locally.
  • Select a point cloud dataset and apply realistic types of distortion.
  • Conduct a subjective experiment with the developed platform and the distorted dataset.
  • Document all the development process and source code.

Requirements: Background on image processing. Good skills in programming. Experience with web development is a plus.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.

[1] Alexiou, Evangelos, et al. “A comprehensive study of the rate-distortion performance in MPEG point cloud compression.” APSIPA Transactions on Signal and Information Processing 8 (2019).

[2] Lazzarotto, Davi, et al. “On the impact of spatial rendering on point cloud subjective visual quality assessment.” 14th International Conference on Quality of Multimedia Experience (QoMEX) (2022).


Adversarial attacks on learning-based point cloud compression


Deep neural networks are nowadays used in many computer vision applications, such as classification, segmentation, object detection and, recently, compression. Not only images, but also 3D imaging modalities such as point clouds are being targeted by such algorithms. Despite the rapid expansion of deep learning, neural networks have been found to be vulnerable to adversarial attacks, where slight modifications applied to the input can severely impact the output. This has recently been shown to be true for learning-based image compression [1], for which the most recent and popular models were tested.

The goal of this project is to analyse the vulnerability to adversarial attacks of state-of-the-art learning-based algorithms for point cloud compression. The student will study different ways of applying small modifications to point cloud models in such a way that the resulting point cloud after compression and decompression is heavily degraded. The retraining of the compression model to make it more robust to adversarial attacks can also be conducted as an optional step.
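
To make the attack concrete, the sketch below applies a one-step, FGSM-style perturbation that increases the reconstruction error of a differentiable codec; the toy codec and the image-shaped tensor are stand-ins, and the same principle transfers to differentiable learning-based point cloud compression models.

    import torch
    import torch.nn.functional as F

    def adversarial_perturbation(x: torch.Tensor, codec, epsilon: float = 0.01) -> torch.Tensor:
        # One signed-gradient step maximizing the distortion after compression/decompression.
        x_adv = x.clone().detach().requires_grad_(True)
        recon = codec(x_adv)
        loss = F.mse_loss(recon, x)
        loss.backward()
        return (x + epsilon * x_adv.grad.sign()).detach()

    # Toy differentiable stand-in for a codec; a real study would plug in a learned model.
    toy_codec = torch.nn.Sequential(torch.nn.Conv2d(1, 8, 3, padding=1),
                                    torch.nn.Conv2d(8, 1, 3, padding=1))
    x = torch.rand(1, 1, 64, 64)
    x_adv = adversarial_perturbation(x, toy_codec, epsilon=0.01)
    print((x_adv - x).abs().max())                   # perturbation bounded by epsilon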

The following tasks should be performed during the project:

  • Study the state of the art on learning-based point cloud compression and on adversarial attacks against neural networks.
  • Select a learning-based point cloud compression method and a suitable point cloud dataset.
  • Design and implement adversarial perturbations that degrade the quality of the decompressed point clouds.
  • Evaluate the impact of the attacks and, optionally, retrain the compression model to improve its robustness.
  • Document all the development process and source code.

Requirements: Background on deep learning and image processing. Good skills in programming.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.

[1] Chen, Tong, and Zhan Ma. “Towards Robust Neural Image Compression: Adversarial Attack and Model Finetuning.” arXiv preprint arXiv:2112.08691 (2021).


Point cloud quality evaluation using deep neural networks


Recent trends show that 3D imaging technologies will dominate the market in the near future. Among the alternatives, point clouds represent a viable solution that has recently emerged for immersive content representation, as proven by the current activities of the JPEG and MPEG standardization committees. Yet, one of the open problems in this emerging field is the assessment of the quality of models under typical degradations. In this project, the task is to investigate and propose new, point-based objective quality metrics for point cloud models.

In essence, a point cloud can be defined as a collection of 3D points in space representing the external surface of an object. Each sample is defined by its position, while associated attributes may also be used in conjunction, in order to provide further information (e.g., color, normal vectors). The set of points that represent the 3D model can be interpreted as a (generally) irregularly sub-sampled surface. The goal of this project is to explore whether deep neural networks can be used to model perceptual similarity and correctly estimate the impact of color or geometry distortions in point cloud quality.
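
As a non-learned baseline against which a learning-based metric would be compared, the sketch below computes a point-to-point (D1-style) geometry PSNR with a nearest-neighbour search; note that reference implementations usually take the maximum over both directions and that the peak value convention (here the bounding-box diagonal) varies between tools.

    import numpy as np
    from scipy.spatial import cKDTree

    def d1_psnr(reference: np.ndarray, distorted: np.ndarray) -> float:
        # Mean squared nearest-neighbour distance from distorted to reference points.
        dists, _ = cKDTree(reference).query(distorted, k=1)
        mse = np.mean(dists ** 2)
        peak = np.linalg.norm(reference.max(axis=0) - reference.min(axis=0))
        return 10.0 * np.log10(peak ** 2 / mse)

    ref = np.random.rand(5000, 3)                          # placeholder geometry (x, y, z)
    dist = ref + np.random.normal(0, 0.002, ref.shape)     # slightly distorted copy
    print(f"D1 PSNR: {d1_psnr(ref, dist):.2f} dB")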

The following tasks should be performed during the project:

  • Study the state-of-the-art in objective quality assessment of point clouds.
  • Analyze alternatives for a learning-based objective quality metric.
  • Propose and implement an algorithm.
  • Benchmark the proposed algorithm.
  • Document all the development process and source code.

Requirements: Good analytic skills. Good background in image and video processing, or machine learning is required.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.


Point cloud classification on the compressed domain


The use of point clouds is continuously increasing as an imaging modality for 3D content, stimulated by important use cases such as Virtual Reality and Autonomous Driving. Due to the high amount of data needed for their representation, efficient compression methods have been proposed in the literature, with learning-based methods achieving competitive performance. These solutions open the possibility for performing computer vision tasks such as point cloud classification directly on the compressed bitstream rather than on the distorted decompressed model. Early studies on 2D images have already demonstrated that this technique can achieve superior performance in certain situations, but it is still unclear whether these conclusions would still hold for point clouds.

The goal of this project is to adapt learning-based algorithms for point cloud classification to operate directly on the compressed domain. The student will explore whether this technique reduces or improves the accuracy of the computer vision task. The implemented methods will be assessed by comparing their performance when applied both to the compressed domain and to the distorted decompressed point cloud models.

The following tasks should be performed during the project:

  • Study the state of the art on learning-based point cloud compression methods.
  • Study the state of the art on point cloud classification algorithms.
  • Select a point cloud compression method and a dataset for point cloud classification. 
  • Adapt a point cloud classification algorithm from the literature to operate on the compressed domain.
  • Assess the performance of the algorithm at different compression levels.
  • Document all the development process and source code.

Requirements: Background on deep learning and image processing. Good skills in programming.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project, or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.


Point cloud super resolution on the compressed domain


The use of point clouds is continuously increasing as an imaging modality for 3D content, stimulated by important use cases such as Virtual Reality and Autonomous Driving. Due to the high amount of data needed for their representation, efficient compression methods have been proposed in the literature, with learning-based methods achieving competitive performance. These solutions open the possibility for performing 3D processing tasks such as point cloud super resolution directly on the compressed bitstream rather than on the distorted decompressed models. Early studies on 2D images have already demonstrated that this technique can achieve superior performance in certain situations, but it is still unclear whether these conclusions would still hold for point clouds.

The goal of this project is to adapt learning-based algorithms for point cloud super resolution to operate directly on the compressed domain. The student will explore whether this technique reduces or improves the performance of the 3D processing task. The implemented methods will be assessed by comparing their performance when applied both to the compressed domain and to the distorted decompressed point cloud models.

The following tasks should be performed during the project:

  • Study the state of the art on learning-based point cloud compression methods.
  • Study the state of the art on point cloud super resolution algorithms.
  • Select a point cloud compression method and a dataset for point cloud super resolution. 
  • Adapt a point cloud super resolution algorithm from the literature to operate on the compressed domain.
  • Assess the performance of the algorithm at different compression levels.
  • Document all the development process and source code.

Requirements: Background on deep learning and image processing. Good skills in programming.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project, or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.


Learning based 3D reconstruction for RGBD images


With the rapid development of VR/AR technology and the increased demand for autonomous driving and related applications, 3D models are becoming popular. Several methods such as laser scanners, LIDAR and multi-view systems have been developed to capture 3D objects, with RGBD cameras being a promising and affordable option.

In addition to capturing RGB images like a conventional camera, RGBD cameras also generate depth maps for the scene, providing additional spatial information for the generation of 3D objects. A typical RGBD camera is the Azure Kinect camera made by Microsoft. To build 3D objects from 2D RGBD images taken with different camera poses, 3D reconstruction algorithms are developed to do alignment and merge different RGBD images into one 3D model.

Some traditional algorithms such as the classical KinectFusion [1] use an ICP-based approach to estimate the pose of the camera and update the volume representation with the TSDF algorithm. Benefiting from recent developments in DNNs, neural networks have been introduced for 3D reconstruction tasks. In [2], the authors proposed the use of NeRF (Neural Radiance Fields) to implicitly represent 3D scenes and produce more detailed and complete reconstruction results than traditional methods. DNN-based 3D reconstruction methods are promising and can achieve high accuracy.
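
A minimal sketch of the geometric core of such pipelines is given below: a depth map is back-projected into a 3D point cloud using the pinhole intrinsics, before frames are aligned (e.g. with ICP) and fused into a TSDF volume; the intrinsic parameters and the depth map are placeholders.

    import numpy as np

    def depth_to_points(depth: np.ndarray, fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
        # Back-project every valid depth pixel to a 3D point in the camera frame.
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
        return pts[pts[:, 2] > 0]

    depth = np.random.uniform(0.5, 3.0, (480, 640))        # placeholder depth map in metres
    cloud = depth_to_points(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
    print(cloud.shape)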

The goal of this project is to explore 3D reconstruction of RGBD images and to attempt to improve the reconstruction quality or speed by using DNNs. Our goal is to study existing 3D reconstruction algorithms, propose improvements and validate them on open source 3D datasets. The following tasks will be performed during the project:

  • Study existing 3D reconstruction algorithms for RGBD images.
  • Implement conventional and DNN based RGBD 3D reconstruction algorithms on open-source datasets and self-generated datasets.
  • Explore possible methods to improve the reconstruction quality and the speed.
  • Document all the development process and source code.

Requirements: Good background on image processing and programming.

Contact: Bowen Huang

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.

[1]R. A. Newcombe et al., “KinectFusion: Real-time dense surface mapping and tracking,” 2011 10th IEEE International Symposium on Mixed and Augmented Reality, 2011, pp. 127-136, doi: 10.1109/ISMAR.2011.6092378.

[2] D. Azinović, R. Martin-Brualla, D. B. Goldman, M. Nießner and J. Thies, “Neural RGB-D Surface Reconstruction,” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 6280-6291, doi: 10.1109/CVPR52688.2022.00619.


Neural Radiance Field based image processing


Neural Radiance Fields, or NeRF, is an emerging technique that can be applied to a variety of image processing and computer vision tasks such as deblurring, compression and 3D reconstruction. NeRF is based on the idea of using neural networks to fit a function that implicitly represents the input data, which for image processing is an explicit representation of an image, video or other visual data structure, such as 3D point clouds and voxels.

In [1], NeRF was proposed to generate new views of 3D scenes. NeRF uses images taken from different viewpoints as the training dataset and then generates new views based on an implicit 3D function and volume rendering. Subsequent work has improved the basic NeRF, for example in terms of speed and generalisability, and has extended its application to various domains, including low-level image processing tasks. Interested readers can refer to [2][3] for more information.
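
One small but essential ingredient of NeRF is the positional encoding of coordinates into sinusoids of increasing frequency, which lets the MLP fit high-frequency detail; a sketch following the encoding described in [1] is given below, with the number of frequency bands matching the paper's choice for 3D positions.

    import numpy as np

    def positional_encoding(x: np.ndarray, num_bands: int = 10) -> np.ndarray:
        # Map each coordinate to sin/cos of exponentially increasing frequencies.
        freqs = (2.0 ** np.arange(num_bands)) * np.pi
        scaled = x[..., None] * freqs                      # shape (..., 3, num_bands)
        enc = np.concatenate([np.sin(scaled), np.cos(scaled)], axis=-1)
        return enc.reshape(*x.shape[:-1], -1)              # shape (..., 3 * 2 * num_bands)

    points = np.random.uniform(-1, 1, (4, 3))              # sampled 3D positions
    print(positional_encoding(points).shape)               # (4, 60) for num_bands = 10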

The goal of this project is to explore the possibilities of using NeRF for image processing tasks and to evaluate its performance. Some possible directions include image/video compression, dynamic 3D point cloud reconstruction and compression. Exploration of other applications is also welcome. The following tasks will be performed during this project:

  • Study the latest NeRF papers and related image processing knowledge.
  • Implement a classical algorithm and a NeRF-based algorithm for a specific image processing task on an open-source dataset and evaluate and compare experimental results.
  • Explore possible ways to improve NeRF-based image processing methods or design new application scenarios.
  • Document all the development process and source code.

Requirements: Good background on image processing and programming.

Contact: Bowen Huang

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project, Master Semester Project or Master Thesis in Electrical Engineering, Communication Systems, Computer Science, or equivalent.

Number of students: One.

[1] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R. (2020). NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol 12346. Springer, Cham. https://doi.org/10.1007/978-3-030-58452-8_24

[2] L. Ma et al., “Deblur-NeRF: Neural Radiance Fields from Blurry Images,” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 12851-12860, doi: 10.1109/CVPR52688.2022.01252.

[3] D. Azinović, R. Martin-Brualla, D. B. Goldman, M. Nießner and J. Thies, “Neural RGB-D Surface Reconstruction,” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 6280-6291, doi: 10.1109/CVPR52688.2022.00619.


Latent Space Deep Feature Compression for Face Recognition


Face recognition has become a hot research topic in the last decades. In state-of-the-art methods, deep neural networks are used to extract features from images, and recognition is then performed by comparing these features between two images. While this method has demonstrated good performance on lossless images, compression and distortion of images in real scenarios have proven to lower the accuracy. In fact, features extracted from distorted images differ from those extracted from losslessly compressed images, resulting in lower face recognition accuracy. A possible solution to this problem is to compress the extracted features rather than the images. An alternative solution would be to jointly train the feature extraction network and the compression network.
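
A minimal sketch of the "compress the features instead of the images" idea is given below: a face embedding is uniformly quantized to a few bits per dimension and matching is done on the de-quantized vector. The bit depth and the uniform quantizer are assumptions, while a learned entropy model and joint training of the extraction and compression networks are the directions to explore in the project.

    import numpy as np

    def quantize(feat: np.ndarray, bits: int = 4):
        # Uniform scalar quantization of an embedding to 2**bits levels per dimension.
        lo, hi = float(feat.min()), float(feat.max())
        levels = 2 ** bits - 1
        q = np.round((feat - lo) / (hi - lo) * levels).astype(np.uint8)
        return q, (lo, hi, levels)

    def dequantize(q: np.ndarray, params) -> np.ndarray:
        lo, hi, levels = params
        return q.astype(np.float32) / levels * (hi - lo) + lo

    emb = np.random.randn(512).astype(np.float32)          # placeholder face embedding
    q, params = quantize(emb, bits=4)                      # 4 bits per dimension instead of 32
    rec = dequantize(q, params)
    cosine = float(emb @ rec / (np.linalg.norm(emb) * np.linalg.norm(rec)))
    print(f"cosine similarity after feature compression: {cosine:.4f}")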

The objective of this project is to investigate lossless feature compression for the face recognition task.

The following tasks should be performed during the project:

  • Study the state-of-the-art on deep learning-based face recognition followed by identification of the best network to use during the project.
  • Add a few layers to the face recognition network as the encoder and decoder.
  • Jointly train the recognition network and compression network.
  • Visualize the feature histogram and compute the face recognition accuracy.
  • Explore the relationship between the bitrate and the accuracy, and plot rate-distortion curves.
  • Document all the development process and source code.

Requirements: Background in image processing and deep learning. Good skills in Python and Pytorch programming. Experience in face recognition is preferred.

Contact: Changsheng Gao

Group: Prof. Touradj Ebrahimi

Suitable for: Master Semester Project in Electrical Engineering, Communication Systems, Computer Science, and equivalent.

Number of students: One.


Latent Space Deep Feature Compression for Pedestrian Re-Identification


Pedestrian re-identification has become a hot research topic in the last few years. Instead of searching for a query image in the gallery directly, state-of-the-art methods train a neural network to extract features from both the query and the gallery images. Subsequently, for each query feature, the most similar gallery feature is searched and considered as the matched feature. However, in a real scenario, the extracted features are compressed, resulting in a domain gap between the original and compressed features, which causes low accuracy in pedestrian re-identification applications. To overcome this problem, features may be compressed losslessly, or the feature extraction network and the compression network may be trained jointly.

The objective of this project is to investigate lossless feature compression for the pedestrian re-identification task. The student will first review the problem of pedestrian re-identification, analyzing the advantages and disadvantages of the methods currently available in the state of the art. Then, given an already implemented baseline, the student will explore further solutions, analyzing the obtained performance in detail.

The following tasks should be performed during the project:

  • Study the state-of-the-art deep learning-based pedestrian re-identification algorithms.
  • Run the baseline code provided by us.
  • Identify the lowest bitrate that can achieve the same accuracy as the original feature.
  • Visualize the feature distribution and analyze the reasons of the performance drop.
  • Explore the relationship between bitrate and accuracy, by creating rate-distortion plots.
  • Document all the development process and source code.

Requirements: Background in image processing and deep learning. Good skills in Python and Pytorch programming. Experience in pedestrian re-identification is preferred.

Contact: Changsheng Gao

Group: Prof. Touradj Ebrahimi

Suitable for: Master Semester Project in Electrical Engineering, Communication Systems, Computer Science, and equivalent.

Number of students: One.


Mobile App for Privacy Protection on iOS Platform


Recently, public interest in privacy protection has increased dramatically. However, there is a general belief that protection of privacy will restrict the online benefits of users. Therefore, privacy protection that does not disrupt people's online habits is needed. This project focuses on visual privacy protection in images. Specifically, the intention of the project is to develop a mobile (iOS platform) application able to obfuscate personal visual information in an image in a secure and recoverable way and to share images via online social networks in a secure way.
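
As a platform-independent illustration of recoverable obfuscation, the sketch below permutes the pixels of a selected region with a keyed pseudo-random permutation so that the region can be restored only with the key; a production app would rely on a proper cryptographic construction, and the seed-based permutation here is purely illustrative.

    import numpy as np

    def scramble_region(img: np.ndarray, box: tuple, key: int, inverse: bool = False) -> np.ndarray:
        # Keyed, reversible pixel permutation inside the box (x0, y0, x1, y1).
        x0, y0, x1, y1 = box
        region = img[y0:y1, x0:x1].reshape(-1, img.shape[2])
        perm = np.random.default_rng(key).permutation(len(region))
        out = img.copy()
        if inverse:
            restored = np.empty_like(region)
            restored[perm] = region
            out[y0:y1, x0:x1] = restored.reshape(y1 - y0, x1 - x0, img.shape[2])
        else:
            out[y0:y1, x0:x1] = region[perm].reshape(y1 - y0, x1 - x0, img.shape[2])
        return out

    image = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
    protected = scramble_region(image, (100, 50, 300, 250), key=1234)
    restored = scramble_region(protected, (100, 50, 300, 250), key=1234, inverse=True)
    assert np.array_equal(restored, image)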

The following tasks should be performed during the project:

  • Research and review the existing visual privacy protection tools, as well as the way to share and manage secure content in social networks.
  • Design an app on smartphone with iOS operating system. An iPhone will be provided by the lab.
  • Minimal requirements of the app include:
    • Implementation of security processing (e.g. scrambling) for images on the mobile side.
    • Multi-region processing on image using touch screen.
  • Implementation of a simple key management system.

Requirements: Basic knowledge of image processing, good programming skills in Objective-C, experience in iOS development.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Project or Master Semester Project in Electrical Engineering, Communication Systems, or Computer Science.

Number of students: One.


Mobile App for Privacy Protection on Android Platform

Recently, public interest in privacy protection has increased dramatically. However, there is a general belief that protection of privacy will restrict the online benefits of users. Therefore, privacy protection that does not disrupt people's online habits is needed. This project focuses on visual privacy protection in images. Specifically, the intention of the project is to develop a mobile (Android platform) application able to obfuscate personal visual information in an image in a secure and recoverable way and to share images via online social networks in a secure way.

The following tasks should be performed during the project:

  • Research and review the existing visual privacy protection tools, as well as the way to share and manage secure content in social networks.
  • Design an app on smartphone with Android operating system. An Android phone will be provided by the lab.
  • Minimal requirements of the app include:
    • Implementation of security processing (e.g. scrambling) for images on the mobile side.
    • Multi-region processing on image using touch screen.
  • Implementation of a simple key management system.

Requirements: Basic knowledge of image processing, good programming skills in Java, experience in Android development.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Project or Master Semester Project in Electrical Engineering, Communication Systems, or Computer Science.

Number of students: One.


Privacy Preserving Photo-Sharing Application

The rapid growth of photo sharing through social media raises serious questions related to ownership, privacy and access to shared images. From the user perspective, effective privacy protection tends to impose restrictions on how users share and access pictures, making privacy protection unattractive. To address such issues, MMSPG has developed ProShare, a mobile App through which pictures can be protected, shared and made selectively accessible in a transparent manner while incurring minimal distraction to the user. To date a large effort has been invested in the development and implementation of the ProShare mobile App. Less attention has been directed at the server side realisation of ProShare.

The objective of this student project is to enhance the server-side implementation of the ProShare service.

The following tasks should be performed during the project:

  • Study and understand the ProShare service and its implementation (both server side and client side)
  • Review the server-side architecture and compare it to state-of-the-art implementations of similar services
  • Propose modifications and enhancements to the existing server-side implementation
  • Implement a migration process allowing the ProShare server to be moved to a new computing platform
  • Implement robust and reliable session management (see the sketch after this list)
  • Propose and implement additional service features
  • Implement back-end tools for the analysis of usage and user statistics
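
For the session management item above, the sketch below shows token-based session creation, validation with sliding expiry, and teardown. It is written in Python purely as a language-agnostic illustration with hypothetical names; the actual ProShare server relies on PHP and MySQL, so a real implementation would store sessions in a database table rather than in memory.

  # Illustrative session handling logic only; not the ProShare implementation.
  import secrets
  import time

  SESSION_LIFETIME = 30 * 60   # seconds
  _sessions = {}               # token -> (user_id, expiry); a DB table in practice

  def create_session(user_id):
      token = secrets.token_urlsafe(32)            # unguessable session identifier
      _sessions[token] = (user_id, time.time() + SESSION_LIFETIME)
      return token

  def validate_session(token):
      entry = _sessions.get(token)
      if entry is None or entry[1] < time.time():
          _sessions.pop(token, None)               # unknown or expired: reject
          return None
      # Sliding expiry: extend the session on every valid request.
      _sessions[token] = (entry[0], time.time() + SESSION_LIFETIME)
      return entry[0]

  def destroy_session(token):
      _sessions.pop(token, None)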

Requirements: Good communication skills. Good understanding of web server technologies including Apache, MySQL and PHP. Good ability to think at the systems level.

Contact: Davi Lazzarotto

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Semester Project in Computer Science, Communications Systems, Electrical Engineering, or equivalent.

Number of students: One


Vector Quantisation for Learned Point Cloud Compression

Point clouds, a promising representation for 3D vision, have wide applications in AR/VR/MR, autonomous driving and GLAM (galleries, libraries, archives and museums). However, the huge data volume of raw point clouds is impractical for storage and transmission; point cloud compression algorithms are therefore needed.

To improve compression performance, lossy compression has been studied extensively, with performance evaluated by various distortion metrics such as PSNR and SSIM. As a crucial part of lossy compression, quantisation has a large influence on the trade-off between compression ratio and distortion. Currently, inspired by mature image and video compression methods, simple scalar quantisation is commonly used, although it is not information-theoretically optimal compared to vector quantisation. Moreover, the success of learning-based image and video compression has inspired similar methods for point clouds, and combining vector quantisation with a learning-based framework can further boost compression performance.
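
For intuition, the sketch below illustrates plain vector quantisation: a codebook is learned with k-means, and each latent vector is replaced by the index of its nearest codeword, so that only the indices need to be entropy coded. It is a minimal NumPy sketch with placeholder data; in a learned codec the codebook would be trained jointly with the network (e.g. through a straight-through estimator), which is part of what this project would explore.

  # Minimal vector quantisation sketch (placeholder data, illustrative names).
  import numpy as np

  def train_codebook(vectors, k, iters=20):
      # Plain k-means: the codebook entries are the cluster centroids.
      codebook = vectors[np.random.choice(len(vectors), k, replace=False)]
      for _ in range(iters):
          dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=-1)
          assign = dists.argmin(axis=1)
          for j in range(k):
              if np.any(assign == j):
                  codebook[j] = vectors[assign == j].mean(axis=0)
      return codebook

  def quantise(vectors, codebook):
      # Each vector becomes the index of its nearest codeword (log2(k) bits each).
      dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=-1)
      indices = dists.argmin(axis=1)
      return indices, codebook[indices]

  latents = np.random.randn(1000, 8).astype(np.float32)   # e.g. 8-D latent vectors
  codebook = train_codebook(latents, k=64)
  indices, reconstructed = quantise(latents, codebook)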

The goal of this project is to explore the combination of vector quantisation with learned point cloud geometry compression methods, and to improve the performance of the existing methods. The following tasks should be performed during the project:

  • Study the state-of-the-art in point cloud compression and quantisation.
  • Design and implement a vector quantisation module.
  • Evaluate the performance of the vector quantisation module on a learned point cloud compression method.
  • Document the development process and source code.

Requirements: Background in image processing and familiarity with programming.

Contact: Bowen Huang

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Project or Master Project in Computer Science, Communications Systems, Electrical Engineering, or equivalent.

Number of students: One


Learning-based Dynamic Point Cloud Compression

Highly efficient compression techniques are necessary for 3D vision applications such as autonomous driving, AR and VR, and point clouds are one of the most promising 3D representations. The challenges of point cloud compression (PCC) come from irregularity: the number of points varies and their coordinates are irregular within and between frames, unlike conventional video. Dedicated compression algorithms are therefore needed for PCC. Inspired by the great success achieved by learning-based methods in image and video compression, similar structures have been migrated to PCC with suitable modifications; for example, 3D convolutions and vision transformers (ViT) are among the promising avenues.
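
To make the irregularity issue concrete, the sketch below voxelises a point cloud into a regular occupancy grid so that a standard 3D convolution can be applied. It is a minimal PyTorch/NumPy sketch with placeholder data and an arbitrary resolution; real codecs typically rely on sparse convolutions rather than dense grids.

  # Voxelisation followed by a dense 3D convolution (illustrative sketch).
  import numpy as np
  import torch

  def voxelise(points, resolution=64):
      # points: (N, 3) coordinates normalised to [0, 1); output: (1, 1, R, R, R).
      idx = np.clip((points * resolution).astype(int), 0, resolution - 1)
      grid = np.zeros((resolution,) * 3, dtype=np.float32)
      grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0          # binary occupancy
      return torch.from_numpy(grid)[None, None]

  points = np.random.rand(2048, 3)                          # placeholder point cloud
  occupancy = voxelise(points)
  conv = torch.nn.Conv3d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
  features = conv(occupancy)                                # shape (1, 16, 64, 64, 64)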

Since basic frameworks have already been studied for static PCC, learning-based methods for dynamic point cloud sequences can be built upon these foundations. The compression of point cloud sequences is similar to video compression, where motion information should be analysed to better remove redundancy. However, the irregularity raises new challenges, and key modules, such as the motion analyser, need to be redesigned.
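
As a naive illustration of such a motion analyser, the sketch below predicts every point of the current frame from its nearest neighbour in the previous frame and keeps only the residual, which is typically much easier to compress. The nearest-neighbour rule and names are purely illustrative; a learned motion module would replace this hand-crafted mapping.

  # Naive inter-frame prediction for point clouds (illustrative sketch).
  import numpy as np
  from scipy.spatial import cKDTree

  def inter_frame_residuals(prev_frame, curr_frame):
      # prev_frame, curr_frame: (N, 3) coordinates; N may differ between frames.
      tree = cKDTree(prev_frame)
      _, nn_idx = tree.query(curr_frame)      # nearest reference point per point
      residuals = curr_frame - prev_frame[nn_idx]
      return nn_idx, residuals                # small residuals compress better

  def reconstruct(prev_frame, nn_idx, residuals):
      return prev_frame[nn_idx] + residuals   # exact given indices and residuals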

In this project, the goal is to study key techniques for dynamic point cloud compression and to enhance compression performance. The following tasks should be performed during the project:

  • Study the state-of-the-art in deep learning-based point cloud compression, with a focus on dynamic point clouds.
  • Design and implement one or more modules for point cloud coding using neural networks.
  • Train the network and evaluate its performance using objective metrics.
  • Document the development process and source code.

Requirements: Background in image processing, and familiarity with programming and deep learning frameworks.

Contact: Bowen Huang

Group: Prof. Touradj Ebrahimi

Suitable for: Master Semester Project in Computer Science, Communications Systems, Electrical Engineering, or equivalent.

Number of students: One


Hardware-efficient Network for Learning-based Point Cloud Compression

Compression techniques are essential for 3D vision: 3D data such as point clouds and voxels are sparse, and their huge data volume poses significant challenges for storage and transmission. Taking point clouds as an example, architectures inspired by the success of learning-based image and video compression have been migrated to 3D point cloud compression and have outperformed handcrafted point cloud codecs. However, they require extensive resources and powerful hardware, such as GPUs, which is unrealistic for embedded systems. Their computational efficiency can also be improved to save energy.

Improving the hardware efficiency of learning-based algorithms involves many aspects, including memory access optimisation, computational improvements and neural model compression. The final solution should also strike a balance between compression quality, speed and energy consumption. By integrating these techniques with existing learning-based algorithms, it is promising to develop highly efficient point cloud codecs for practical 3D vision tasks.
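
As one concrete example of neural model compression, the sketch below performs symmetric post-training quantisation of a layer's weights to int8, reducing weight storage roughly fourfold at the cost of a small approximation error. It is a minimal NumPy sketch with placeholder weights; in the project, such steps would be combined with pruning, efficient operators and memory-aware design.

  # Symmetric int8 post-training weight quantisation (illustrative sketch).
  import numpy as np

  def quantise_weights_int8(weights):
      scale = max(np.abs(weights).max(), 1e-8) / 127.0     # one scale per tensor
      q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
      return q, scale                                      # ~4x smaller than float32

  def dequantise(q, scale):
      return q.astype(np.float32) * scale                  # approximate original weights

  weights = np.random.randn(16, 1, 3, 3, 3).astype(np.float32)   # e.g. a Conv3d kernel
  q, scale = quantise_weights_int8(weights)
  print("max abs error:", np.abs(weights - dequantise(q, scale)).max())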

The objective of this project is to improve the hardware efficiency of existing learning-based point cloud compression algorithms. The following tasks should be performed during the project:

  • Literature review on learning-based point cloud compression methods and hardware-efficient deep learning technologies.
  • Design and implement a hardware-efficient neural network for point cloud compression.
  • Train the implemented network and evaluate it with hardware-specific objective metrics.
  • Document the development process and source code.

Requirements: Background in image processing, and familiarity with programming and deep learning frameworks.

Contact: Bowen Huang

Group: Prof. Touradj Ebrahimi

Suitable for: Bachelor Project or Master Project in Computer Science, Communications Systems, Electrical Engineering, or equivalent.

Number of students: One