2021 IEEE East Asian School of Information Theory

Tutorials

Tutorials, including Q&A, will be held synchronously. Questions may be asked live during the tutorials, and tutorial moderators will help relay and address questions posted in the chat.

Tutorial A (August 3, 8:50am – noon; August 4, 8:30am – 11:30am)

Prof. Changho Suh, KAIST

Fair machine learning

Moderator: Lingfei Jin, Fudan University

Abstract: We explore a topic of growing interest in the AI field through the lens of information theory: fair machine learning. Fairness is one of the crucial requirements for trustworthy AI. In this tutorial, we study how tools from information theory and statistics serve to address fairness issues arising in two contexts: (i) fair classifiers, which aim to make predictions independent of sensitive attributes such as race, sex, age, and religion; and (ii) fair generative models, which promote fair representation (i.e., ensure class-balanced generated samples) even when trained with size-biased real samples across distinct demographics.
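
One common way to formalize the independence requirement mentioned above is demographic parity, which admits an information-theoretic characterization. A minimal sketch follows, in which the regularized objective with weight \(\lambda\) is an illustrative assumption rather than necessarily the tutorial's exact formulation:

\[ \hat{Y} \perp Z \;\Longleftrightarrow\; I(\hat{Y}; Z) = 0, \qquad \min_{\theta} \; \mathbb{E}\big[\ell(\hat{Y}_{\theta}, Y)\big] + \lambda \, I(\hat{Y}_{\theta}; Z). \]

The mutual-information penalty vanishes exactly when the prediction carries no information about the sensitive attribute, which is why it serves as a natural fairness regularizer.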

In the first half of the tutorial, we present two fair classifiers: (i) one inspired by arguably the most prominent information-theoretic notion, mutual information; and (ii) another built upon a well-known statistical method, kernel density estimation, which offers a better accuracy-vs-fairness tradeoff. In the second half, we first study a generative model, named hinge GAN, which forms the basis of a fair generator to be explored in depth. Building upon a connection with a famous divergence measure, the total variation distance (TVD), we then investigate a TVD-based fair generative model that performs well in both sample quality and fairness. Lastly, we discuss a couple of other relevant issues.
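
For reference, the total variation distance between two distributions P and Q on a common finite alphabet is

\[ d_{\mathrm{TV}}(P, Q) \;=\; \sup_{A} \big|P(A) - Q(A)\big| \;=\; \tfrac{1}{2} \sum_{x} \big|P(x) - Q(x)\big|, \]

i.e., the largest discrepancy the two distributions exhibit on any event; how this measure drives the fair-generator design is developed in the tutorial.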

Bio: Changho Suh is an Associate Professor of Electrical Engineering at KAIST and an Associate Head of the KAIST AI Institute. He received the B.S. and M.S. degrees in Electrical Engineering from KAIST in 2000 and 2002, respectively, and the Ph.D. degree in EECS from UC Berkeley in 2011. From 2011 to 2012, he was a postdoctoral associate at MIT, and from 2002 to 2006 he was with Samsung. Prof. Suh is a recipient of numerous awards, including the 2021 James L. Massey Research & Teaching Award for Young Scholars from the IEEE Information Theory Society, the 2020 LINKGENESIS Best Teacher Award, the 2019 AFOSR Grant, the 2019 Google Education Grant, the 2018 IEIE/IEEE Joint Award, the 2015 IEIE Haedong Young Engineer Award, the 2013 IEEE Communications Society Stephen O. Rice Prize, the 2011 David J. Sakrison Memorial Prize (the best dissertation award in UC Berkeley EECS), the 2009 IEEE ISIT Best Student Paper Award, and four Department Teaching Awards (2013, 2019, 2020, 2021). Dr. Suh is an IEEE Information Theory Society Distinguished Lecturer, the General Chair of the Inaugural IEEE East Asian School of Information Theory, and a Member of the Young Korean Academy of Science and Technology. He is also an Associate Editor of Machine Learning for the IEEE Transactions on Information Theory, an Editor for the IEEE Information Theory Newsletter, a Column Editor for IEEE BITS the Information Theory Magazine, an Area Chair of NeurIPS 2021, and a Senior Program Committee member of IJCAI 2019–2021.

Slides: Lecture 1, Lecture 2, Lecture 3, Lecture 4, Lecture 5, Lecture 6

Tutorial Notes: TN1, TN2, TN3, TN4, TN5, TN6

Tutorial B (August 3, 1:30pm – 4:30pm)

Prof. Vincent Tan, NUS

Common information and non-interactive correlation distillation

Moderator: Hye Won Chung, KAIST

Abstract: We review Wyner's common information (CI) and discuss its extension to the Rényi CI, showing that the latter provides a bridge between Wyner's CI and the Exact CI, a notion devised by Kumar, Li, and El Gamal. Exploiting this connection, we explain that the Exact CI can be strictly larger than Wyner's CI for some sources. We discuss related topics such as channel synthesis and nonnegative rank. Finally, we discuss the Gács-Körner-Witsenhausen (GKW) CI and its connection to the non-interactive correlation distillation problem. These tools allow us to make several advances on important conjectures and open problems, including partial solutions to the Courtade-Kumar conjecture, Mossel's mean-1/4 stability problem, and the Ordentlich-Polyanskiy-Shayevitz conjecture.
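
For orientation, the two classical notions at the heart of the tutorial are, for a source pair (X, Y),

\[ C_{\mathrm{Wyner}}(X;Y) \;=\; \min_{W:\; X - W - Y} I(X, Y; W), \qquad C_{\mathrm{GKW}}(X;Y) \;=\; \max_{f, g:\; f(X) = g(Y)\ \text{a.s.}} H\big(f(X)\big), \]

and they sandwich the mutual information: \( C_{\mathrm{GKW}}(X;Y) \le I(X;Y) \le C_{\mathrm{Wyner}}(X;Y) \). The Rényi and Exact CIs discussed above are variants of the Wyner-type quantity.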

Bio: Vincent Y. F. Tan (Senior Member, IEEE 2015) was born in Singapore in 1981. He is currently a Dean’s Chair Associate Professor (with tenure) in the Department of Electrical and Computer Engineering (ECE) and the Department of Mathematics at the National University of Singapore (NUS). He received the B.A. and M.Eng. degrees in Electrical and Information Sciences from Cambridge University in 2005 and the Ph.D. degree in Electrical Engineering and Computer Science (EECS) from the Massachusetts Institute of Technology in 2011. His research interests include network information theory, machine learning, and statistical signal processing.

Dr. Tan received the MIT EECS Jin-Au Kong Outstanding Doctoral Thesis Prize in 2011, the NUS Young Investigator Award in 2014, the NUS Engineering Young Researcher Award in 2018, the Singapore National Research Foundation (NRF) Fellowship (Class of 2018), and the NUS Young Researcher Award in 2019. He was a Distinguished Lecturer of the IEEE Information Theory Society (2018–19). A dedicated educator, he also received the Engineering Educator Award from the NUS Faculty of Engineering in 2019–20.

He has authored a research monograph, “Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities,” in the Foundations and Trends in Communications and Information Theory series (NOW Publishers). He is currently an Editor of the IEEE Transactions on Signal Processing and the IEEE Transactions on Information Theory, a member of the IEEE Information Theory Society Board of Governors, and a member of the Faculty of Engineering Promotion and Tenure Committee at NUS.

Slides

Tutorial C (August 4, 1:00pm – 4:00pm)

Prof. Lingfei Jin, Fudan University

A brief introduction to (maximally recoverable) locally repairable codes

Moderator: Junil Choi, KAIST

Abstract: Distributed storage systems need to build redundancy into the stored data in order to cope with data loss or inaccessibility. Traditional erasure codes offer a natural strategy for such robust data storage: each storage node stores a small part of the codeword, so that the data is protected against multiple node failures. Due to the large bandwidth needed to reconstruct data via erasure codes, locally repairable codes (LRCs) were introduced. LRCs allow recovery from a single failed node (or a small number of failures) in a local manner, by accessing only a few other codeword symbols. They have emerged as the codes of choice for large-scale distributed storage systems because they repair failed storage nodes very efficiently in the typical scenario of one or a few node failures, while still offering fault tolerance against worst-case scenarios with more erasures. In this tutorial, I will first introduce some background on distributed storage systems, then proceed to preliminaries on coding theory as well as LRCs. After these preparations, I will present a few explicit constructions of optimal LRCs. Finally, I will introduce the topic of maximally recoverable (MR) LRCs as well as some recent results.
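
As background for the locality notion above: a code of length n and dimension k has locality r if every codeword symbol can be recovered from at most r other symbols. For an [n, k] code with minimum distance d and locality r, the Singleton-type bound of Gopalan, Huang, Simitci, and Yekhanin gives

\[ d \;\le\; n - k - \left\lceil \tfrac{k}{r} \right\rceil + 2. \]

LRCs achieving this bound with equality are called optimal, which is the regime targeted by the explicit constructions in the tutorial.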

Bio: Lingfei Jin received her Ph.D. degree from Nanyang Technological University, Singapore, in 2013. She is currently an Associate Professor at the School of Computer Science, Fudan University, China. Her research interests include classical coding theory and quantum coding theory.

Slides: Lecture 1, Lecture 2, Lecture 3

Tutorial D (August 5, 8:30am – 11:30am)

Prof. Lalitha Sankar, ASU

Bridging information theory and machine learning: A loss function perspective

Moderator: Byonghyo Shim, Seoul National University

Abstract: Machine learning has dramatically enhanced the role of automated decision making across a variety of domains. Three ingredients are at the heart of designing sound ML algorithms: data, learning architectures, and loss functions. In this tutorial, we focus on loss functions and the role of information theory in understanding the choice of loss functions in learning. To this end, we introduce alpha-loss, a parameterized class of loss functions that arose from efforts to operationally motivate information-theoretic measures. Tuning the parameter alpha from 0 to infinity allows continuous interpolation between known and oft-used losses: log-loss (alpha = 1), exponential loss (alpha = 1/2), and 0-1 loss (alpha = infinity).
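
To make the interpolation concrete, here is a minimal sketch using the parameterization standard in the alpha-loss literature (the tutorial's exact notation may differ): for a label y and a predicted distribution P,

\[ \ell_{\alpha}(y, P) \;=\; \frac{\alpha}{\alpha - 1} \left( 1 - P(y)^{\,1 - 1/\alpha} \right), \qquad \alpha \in (0, 1) \cup (1, \infty), \]

which recovers the log-loss \(-\log P(y)\) as \(\alpha \to 1\) and the soft 0-1 loss \(1 - P(y)\) as \(\alpha \to \infty\); under the logistic model, \(\alpha = 1/2\) yields the exponential loss.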

In the first third of the tutorial, we will discuss the classification properties of alpha-loss, its information-theoretic interpretations, and its consistency and asymptotic guarantees. In the second third, we focus on a specific model, namely the logistic model, and characterize the optimization landscape of the average loss through the lens of strict local quasi-convexity. We will also discuss generalization guarantees as a function of alpha.

Finally, in the last third of the tutorial, we will highlight the robustness of this loss family to a variety of “data twists” and, in particular, its application to enhancing XGBoost (a state-of-the-art boosting technique) through a new algorithm, PILBoost (Pseudo-Inverse Link Boost). We will conclude by highlighting how the core information-theoretic properties of this loss-function class allow it to unify a range of generative adversarial network (GAN) models. Here, we will show that a large class of GANs, from the original GAN (often called the vanilla GAN) to f-GANs to Wasserstein and other IPM GANs, is captured by using alpha-loss to write the GAN value function, and thus has the potential to enable meaningful comparisons of GANs. Throughout the tutorial, the technical results will be accompanied by results on publicly available large datasets and deep learning models.
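
For reference, the value function of the original (vanilla) GAN mentioned above is

\[ \min_{G} \max_{D} \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[ \log D(x) \right] + \mathbb{E}_{z \sim p_{z}}\!\left[ \log\big(1 - D(G(z))\big) \right], \]

where the discriminator's objective is the (negative) binary log-loss on real versus generated samples; replacing the log-loss with the tunable alpha-loss is the route, sketched in the tutorial, to a single parameterized value function that spans this family of GANs.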

Bio: Lalitha Sankar is an Associate Professor in the School of Electrical, Computer, and Energy Engineering at Arizona State University. She received her doctorate from Rutgers University, her master's degree from the University of Maryland, and her bachelor's degree from the Indian Institute of Technology, Bombay. Her research lies at the intersection of information theory and learning theory, along with their applications to the electric grid. She received the NSF CAREER award in 2014. She currently leads both an NSF HDR institute on data analytics for the electric grid and an NSF- and Google-funded effort on using learning techniques to assess COVID-19 exposure risk in a secure and privacy-preserving manner.

Slides: Lecture 1, Lecture 2, Lecture 3

Tutorial E (August 5, 3:30pm – 6:30pm)

Prof. Deniz Gunduz, ICL

Semantic and goal-oriented communications: Information-theoretic foundations and applications to machine learning

Moderator: Si-Hyeon Lee, KAIST

Abstract: Traditional digital communication systems are designed to reliably transmit a sequence of bits across a noisy channel. In this approach, bits are treated equally, and the communication system is ignorant of the origin of the bits or how they are used. However, with the advances in machine learning (ML) technologies and their widespread adoption, most communications will take place among machines, where the goal is often not to reconstruct the underlying message at the receiver, but to enable the receiver to make the right inference or take the right action. The main message of this lecture is that the conventional communication protocols can be highly suboptimal for such systems, and an end-to-end design taking into account the “context” or the “goal” becomes essential.

In the first part of this lecture, we will formulate semantic communications as a joint source-channel coding (JSCC) problem and focus on identifying the basic information-theoretic performance bounds. We will highlight cases in which a “separation” between compression and communication can be applied without loss of optimality. We will then consider practical implementations of JSCC, focusing on wireless image transmission, and show that a deep-learning-aided joint design not only improves upon state-of-the-art separation-based digital schemes, but also degrades gracefully with the channel quality. In the second part of the lecture, we will focus on distributed ML problems, considering both the inference and training stages. We will argue that these can also be treated as JSCC problems with different end-to-end objectives. We will then propose novel JSCC schemes, including “over-the-air computation” for wireless edge learning, and show that JSCC is essential to meet the latency requirements of modern ML applications.
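
As a reference point for the “separation” discussion above, for a stationary memoryless source transmitted over a stationary memoryless point-to-point channel, reproducing the source within average distortion D is achievable when

\[ R(D) \;<\; C \]

and impossible when R(D) > C, where R(D) is the source's rate-distortion function and C is the channel capacity; in this regime, separate compression and channel coding loses nothing, and the lecture highlights when such separation holds and when a joint (JSCC) design is needed.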

Bio: Deniz Gündüz received his Ph.D. degree in electrical engineering from NYU Tandon School of Engineering (formerly Polytechnic University) in 2007. He is currently a Professor of Information Processing in the Electrical and Electronic Engineering Department of Imperial College London, UK, and a part-time faculty member at the University of Modena and Reggio Emilia, Italy. Previously he served as a postdoctoral research associate at Princeton University, as a consulting assistant professor at Stanford University, and as a research associate at CTTC in Spain. He has held visiting positions at University of Padova (2018-2020) and Princeton University (2009-2012). He is a Distinguished Lecturer for the IEEE Information Theory Society (2020-22). He is the recipient of the IEEE Communications Society - Communication Theory Technical Committee (CTTC) Early Achievement Award in 2017, a Starting Grant of the European Research Council (ERC) in 2016, and several best paper awards. He serves as an Area Editor for the IEEE Transactions on Communications, the IEEE Transactions on Information Theory, and the IEEE Journal on Selected Areas in Communications (JSAC) Special Series on Machine Learning in Communications and Networking; and as an Editor for the IEEE Transactions on Wireless Communications. His research interests lie in the areas of wireless communications, information theory, machine learning, and privacy.

Slides

Goldsmith Lecture (August 6, 8:30am – 11:30am)

Prof. Yuejie Chi, CMU

Non-asymptotic statistical and computational guarantees of reinforcement learning algorithms

Moderator: Vincent Tan, NUS

Abstract: Reinforcement learning (RL) has received a growing amount of attention in recent years. However, achieving efficient RL remains challenging in sample-starved situations, where data collection is expensive, time-consuming, or even high-stakes (e.g., in clinical trials, online advertising, and autonomous systems). To tackle this challenge, understanding and enhancing the non-asymptotic sample and computational efficiencies of RL algorithms is urgently needed. This tutorial presents a coherent framework that covers recent algorithmic developments in RL, highlighting non-asymptotic statistical and computational guarantees of three distinctive approaches (model-based RL, Q-learning, and policy optimization) as well as information-theoretic and algorithm-dependent lower bounds.
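
As one concrete example of the approaches listed above, the classical Q-learning update for a sampled transition (s, a, r, s'), with step size \(\eta\) and discount factor \(\gamma\), is

\[ Q(s, a) \;\leftarrow\; (1 - \eta)\, Q(s, a) + \eta \left( r + \gamma \max_{a'} Q(s', a') \right), \]

and the non-asymptotic viewpoint of the tutorial asks how many such samples, and how much computation, suffice for Q to be provably close to the optimal Q*.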

Bio: Dr. Yuejie Chi is a Professor in the Department of Electrical and Computer Engineering and a faculty affiliate with the Machine Learning Department and CyLab at Carnegie Mellon University. She received her Ph.D. and M.A. from Princeton University and her B.Eng. (Hon.) from Tsinghua University, all in Electrical Engineering. Her research interests lie in the theoretical and algorithmic foundations of data science, signal processing, machine learning, and inverse problems, with applications in sensing and societal systems, broadly defined. Among other honors, Dr. Chi received the Presidential Early Career Award for Scientists and Engineers (PECASE) and the inaugural IEEE Signal Processing Society Early Career Technical Achievement Award for contributions to high-dimensional structured signal processing, held the inaugural Robert E. Doherty Early Career Development Professorship, and was named a Goldsmith Lecturer by the IEEE Information Theory Society.

Slides