Tutorials

Organizer

  • Ali C. Begen, Professor, Ozyegin University
    (IEEE ComSoc Distinguished Lecturer for 2016-2020)

Abstract

HTTP adaptive streaming is a complex technology with dynamics that need to be studied thoroughly. The experience from deployments over the last 10+ years suggests that streaming clients typically operate in an unfettered greedy mode and are not necessarily designed to behave well in environments where other clients exist or network conditions can change dramatically. This largely stems from the fact that clients make only indirect observations at the application (HTTP) layer (and, to a limited extent if at all, at the transport layer). Typically, there are three primary camps when it comes to scaling and improving streaming systems: (i) servers control clients’ behavior/actions and the network uses appropriate QoS, (ii) servers and clients cooperate with each other and/or the network, or (iii) clients stay in control and no cooperation with the servers or network is needed as long as there is enough capacity in the network (said differently, use dumb servers and network and throw more bandwidth at the problem). Intuitively, using hints should improve streaming since it helps the clients and servers take more appropriate actions. The improvement could be in terms of better viewer experience and supporting more viewers for the given amount of network resources, or the added capability to explicitly support controlled unfairness (as opposed to bitrate fairness) based on features such as content type, viewer profile and display characteristics.

In this tutorial, we will examine the progress made in this area over the last several years, primarily focusing on MPEG’s Server and Network Assisted DASH (SAND) and CTA’s Common Media Client Data (CMCD) and Common Media Server Data (CMSD) standards. We will also describe possible application scenarios and present an open-source sample implementation so that attendees can explore this topic further in their own practical environments.

Speaker Bio

Ali C. Begen is currently a computer science professor at Ozyegin University and a technical consultant in the Advanced Technology and Standards group at Comcast. Previously, he was a research and development engineer at Cisco. Begen received his PhD in electrical and computer engineering from Georgia Tech in 2006. To date, he has received a number of academic and industry awards and has been granted 30+ US patents. In 2020 and 2021, he was listed among the world's most influential scientists in the subfield of networking and telecommunications. More details are at https://ali.begen.net.

Organizer

  • Ivan V. Bajić, Professor, Simon Fraser University, Canada

Abstract

Our world is at the beginning of a technological revolution that promises to transform the way we work, travel, learn, and live, through Artificial Intelligence (AI). While AI models have been making tremendous progress in research labs and overtaking scientific literature in many fields, efforts are now being made to take these models out of the lab and create products around them, which could compete with established technologies in terms of cost, reliability, and user trust, as well as enable new, previously unimagined applications. Foremost among these efforts is bringing AI “to the edge” by pairing it with the multitude of sensors that is about to cover our world as part of the Internet of Things (IoT) and 5th generation (5G) communication network initiatives. This tutorial is about edge-cloud collaborative analysis of multimedia signals, which we shall refer to as Collaborative Intelligence (CI). This is a framework in which AI models, developed for multimedia signal analysis, are distributed between the edge devices and the cloud. In CI, typically, the front-end of an AI model is deployed on an edge device, where it performs initial processing and feature computation. These intermediate features are then sent to the cloud, where the back-end of the AI model completes the inference. CI has been shown to have the potential for energy and latency savings compared to the more typical cloud-based or fully edge-based AI model deployment, but it also introduces new challenges, which require new science and engineering principles to be developed in order to achieve optimal designs. In CI, a capacity-limited channel is inserted in the information pathway of an AI model. This necessitates compression of the features computed by the edge sub-model, which in turn requires a solid understanding of the structure of the model’s latent space. Errors introduced into the features by channel imperfections need to be handled on the cloud side in order to perform successful inference. Moreover, issues related to the privacy of transmitted data need to be addressed.
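
As a rough illustration of the split described above, the following Python sketch runs a toy "front-end" on the edge, compresses its features, and completes the decision in a toy "back-end". Everything in it is an assumption made for illustration: the function names, the window-averaging feature extractor, the threshold classifier, and the use of 8-bit uniform scalar quantization as a stand-in for feature compression. It is not the method presented in the tutorial.

```python
# Toy sketch of edge-cloud split inference. All names and the model itself
# are hypothetical stand-ins, not taken from the tutorial.

def edge_front_end(signal, window=2):
    """Edge sub-model: reduce the raw signal to window-averaged features."""
    return [sum(signal[i:i + window]) / window
            for i in range(0, len(signal) - window + 1, window)]

def quantize(features, n_bits=8):
    """Uniform scalar quantization to n_bits (at most 8) per feature."""
    lo, hi = min(features), max(features)
    levels = (1 << n_bits) - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = bytes(min(levels, round((f - lo) / step)) for f in features)
    return codes, lo, step  # lo and step travel as side information

def dequantize(codes, lo, step):
    """Cloud side: map the received integer codes back to feature values."""
    return [lo + c * step for c in codes]

def cloud_back_end(features, threshold=0.5):
    """Cloud sub-model: a threshold decision on the mean feature value."""
    return "high" if sum(features) / len(features) > threshold else "low"

signal = [0.1, 0.3, 0.8, 1.0, 0.9, 0.7]
features = edge_front_end(signal)           # computed on the edge device
codes, lo, step = quantize(features)        # one byte per feature on the channel
received = dequantize(codes, lo, step)      # reconstructed in the cloud
decision = cloud_back_end(received)
print(len(codes), "bytes sent, decision:", decision)
```

In this sketch the reconstruction error of each feature is bounded by half a quantization step, so the back-end decision matches the one it would make on the unquantized features; channel errors and privacy are not modeled.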

Speaker Bio

Ivan V. Bajić received the B.Sc.Eng. degree (summa cum laude) in Electronic Engineering from the University of Natal, South Africa, in 1998, and the M.S. degree in Electrical Engineering, the M.S. degree in Mathematics, and the Ph.D. degree in Electrical Engineering from Rensselaer Polytechnic Institute, Troy, NY, USA, in 2000, 2002, and 2003, respectively. He was with the University of Miami from 2003 to 2005, subsequently joining Simon Fraser University in Burnaby, BC, Canada, where he is currently a Professor of Engineering Science and co-director of the SFU Multimedia Lab. His research interests include signal processing and machine learning with applications to multimedia processing, compression, communications, and collaborative intelligence. His group’s work has received awards at ICME 2012 and ICIP 2019, and other recognitions (e.g., paper award finalist, top n%) at Asilomar, ICIP, ICME, and CVPR. It has also been featured in the IEEE Signal Processing Magazine (May 2020), on the front page of the IEEE Transactions on Audio, Speech, and Language Processing (July/August 2016), among the Featured Articles in IEEE Transactions on Image Processing and IEEE Transactions on Multimedia, and in popular media such as the Vancouver Sun, Plank Magazine, and CBC Radio. He received an NSERC DAS Award in 2021 for his work on collaborative intelligence.

Ivan is the incoming Chair (2022-2023) of the IEEE Multimedia Signal Processing Technical Committee and a Member of the IEEE Multimedia Systems and Applications Technical Committee. He has served on the organizing and/or program committees of the main conferences in his field, and has received several awards in these roles, including Outstanding Reviewer Award (five times), Outstanding Area Chair Award, and Outstanding Service Award. He was the Chair of the Vancouver Chapter of the IEEE Signal Processing Society 2013-2019, during which the Chapter received the Chapter of the Year Award from IEEE SPS. He was an Associate Editor of the IEEE Transactions on Multimedia and the IEEE Signal Processing Magazine, and is currently a Senior Area Editor of the IEEE Signal Processing Letters.

Organizers

  • Yuezun Li, Professor, Ocean University of China, China
  • Siwei Lyu, Professor, SUNY Buffalo, US

Abstract

AI techniques, especially deep neural networks (DNNs), have significantly improved the realism of falsified multimedia, with a severely disconcerting impact on society. In particular, AI-based face forgery, known as DeepFake, is one of the most recent AI techniques and attracts increasing attention due to its ease of use and powerful performance. To counter the negative impact of DeepFakes, defense strategies have been developed rapidly, such as detection, i.e., distinguishing forged content, and obstruction, i.e., preventing the synthesis of forged content. In this tutorial, we plan to provide a review of the fundamentals of DeepFake creation and the recent advances in detection and obstruction methods.

Speaker Bios

Yuezun Li is a lecturer in the Center on Artificial Intelligence at Ocean University of China. He was a Senior Research Scientist at the Department of Computer Science and Engineering of University at Buffalo, SUNY, from September to December 2020. He received his Ph.D. degree in computer science from University at Albany, SUNY, in 2020, and his M.S. degree in Computer Science in 2015 and B.S. degree in Software Engineering in 2012, both from Shandong University. Dr. Li’s research interests focus mainly on artificial intelligence security and multimedia forensics. His work has been published in peer-reviewed conferences and journals, including ICCV, CVPR, ICASSP, and CVIU.

Siwei Lyu is a SUNY Empire Innovation Professor at the Department of Computer Science and Engineering, the Director of UB Media Forensic Lab (UB MDFL), and the founding Co-Director of Center for Information Integrity (CII) of University at Buffalo, State University of New York. Dr. Lyu's research interests include digital media forensics, computer vision, and machine learning. Dr. Lyu has published over 170 refereed journal and conference papers. Dr. Lyu's research projects are funded by NSF, DARPA, NIJ, UTRC, IBM and Department of Homeland Security. He is the recipient of the IEEE Signal Processing Society Best Paper Award (2011), the National Science Foundation CAREER Award (2010), SUNY Albany's Presidential Award for Excellence in Research and Creative Activities (2017), SUNY Chancellor's Award for Excellence in Research and Creative Activities (2018), Google Faculty Research Award (2019), and IEEE Region 1 Technological Innovation (Academic) Award (2021). Dr. Lyu served on the IEEE Signal Processing Society's Information Forensics and Security Technical Committee (2016-2021), and was on the Editorial Board of IEEE Transactions on Information Forensics and Security (2016-2021). Dr. Lyu is a Fellow of IEEE.

Organizer

  • Yipeng Liu, Professor, University of Electronic Science and Technology of China

Abstract

Many classical data processing methods rely on representation and computation in the form of vectors and matrices, where multi-dimensional data are unfolded into matrices for processing. However, the multi-linear structure is lost in such vectorization or matricization, which leads to sub-optimal processing performance. In fact, a natural representation for multi-dimensional data is a tensor. Data processing methods based on tensor computation can avoid the loss of multi-linear structure that occurs in their classical matrix-based counterparts. The related advances in applied mathematics allow us to move from classical matrix-based methods to tensor-based methods for many applications, such as signal processing, machine learning, neuroscience, communication, psychometrics, chemometrics, biometrics, quantum physics, quantum chemistry, etc. As a typical kind of multi-dimensional data, multimedia data can be processed more efficiently and effectively by data processing techniques based on tensor computation. This tutorial will first provide basic coverage of tensor notations, preliminary operations, and the main tensor decompositions and their properties. Based on them, a series of tensor-based data processing methods are presented as the multi-linear extensions of classical sparse learning, missing component analysis, principal component analysis, subspace clustering, linear regression, support vector machines, deep neural networks, etc. Experimental results are given for a number of multimedia applications, such as image reconstruction, image quality enhancement, multimedia data fusion, background extraction, weather forecasting, pose estimation, source separation in speech, etc. Finally, some acceleration strategies are discussed with a view to further possible applications.
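
To make the notion of unfolding (matricization) concrete, here is a small Python sketch using plain nested lists. The helper names `shape_of`, `get`, and `unfold` are hypothetical, and the column ordering (last remaining index varies fastest) is only one of several conventions in use; it is meant as an illustration under these assumptions, not as the notation adopted in the tutorial.

```python
from itertools import product

def shape_of(tensor):
    """Return the shape of a regular nested-list tensor."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]
    return tuple(shape)

def get(tensor, index):
    """Read one entry of a nested-list tensor at a multi-index."""
    for i in index:
        tensor = tensor[i]
    return tensor

def unfold(tensor, mode):
    """Mode-n unfolding: rows follow the chosen mode, columns enumerate all
    remaining indices (last remaining index varies fastest)."""
    shape = shape_of(tensor)
    others = [m for m in range(len(shape)) if m != mode]
    rows = []
    for r in range(shape[mode]):
        row = []
        for rest in product(*(range(shape[m]) for m in others)):
            index = [0] * len(shape)
            index[mode] = r
            for m, i in zip(others, rest):
                index[m] = i
            row.append(get(tensor, index))
        rows.append(row)
    return rows

# A 2 x 3 x 4 tensor whose entry at (i, j, k) is 100*i + 10*j + k,
# so each value reveals its own position.
T = [[[100 * i + 10 * j + k for k in range(4)] for j in range(3)]
     for i in range(2)]
M0, M1, M2 = unfold(T, 0), unfold(T, 1), unfold(T, 2)
for name, M in (("mode-0", M0), ("mode-1", M1), ("mode-2", M2)):
    print(name, len(M), "x", len(M[0]))
```

In the mode-0 unfolding, the entries at (0, 0, 0) and (0, 1, 0), which are neighbors along the second mode of the tensor, end up four columns apart; this is a small instance of the structure that matricization scatters.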

Speaker Bio

Yipeng Liu received the BSc degree in biomedical engineering and the PhD degree in information and communication engineering from University of Electronic Science and Technology of China (UESTC), Chengdu, in 2006 and 2011, respectively. From 2011 to 2014, he was a research fellow at University of Leuven, Leuven, Belgium. Since 2014, he has been an associate professor with UESTC, Chengdu, China.

His research interests lie in tensor computations for data processing. He has been developing new tensor decompositions and sparse optimizations for data processing techniques, including signal recovery, image quality enhancement, image and video compression, spectrum sensing and prediction, collaborative filtering, efficient neural networks, adversarial attack, anomaly detection, etc. He has co-authored two books titled “Tensor Computation for Data Analysis” and “Tensor Regression” published by Springer and Now Publishers, edited a book titled “Tensors for Data Processing: Theory, Methods, and Applications” published by Elsevier, and authored or co-authored over 70 international journal and conference papers. He serves as an associate editor for IEEE Signal Processing Letters and as the lead guest editor for the special issue “tensor image processing” in Signal Processing: Image Communication. He has given tutorials at several international conferences, including ISCAS 2019, SiPS 2019, APSIPA ASC 2019, ICIP 2020, SSCI 2020, and VCIP 2021.