The IEEE Information Theory Society Thomas M. Cover Dissertation Award, established in 2013, is awarded annually to the author of an outstanding doctoral dissertation contributing to the mathematical foundations of any of the information sciences within the purview of the Society including, but not limited to, Shannon theory, source and channel coding theory, data compression, learning theory, quantum information theory and computing, complexity theory, and applications of information theory in probability and statistics.
Anuran Makur’s thesis “Information Contraction and Decomposition” was completed in June 2019 in the EECS Department at MIT under the joint supervision of Lizhong Zheng and Yury Polyanskiy.
Anuran Makur received a B.S. degree with highest honors (summa cum laude) from the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley (UC Berkeley), CA, USA, in 2013, and the S.M. and Sc.D. degrees from the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology (MIT), Cambridge, MA, USA, in 2015 and 2019, respectively. He is currently a postdoctoral researcher at the Laboratory for Information and Decision Systems at MIT. His research interests include theoretical statistics and machine learning, information theory, and other areas in applied probability. He was a recipient of the Arthur M. Hopkin Award from UC Berkeley in 2013, the Irwin Mark Jacobs and Joan Klein Jacobs Presidential Fellowship from MIT in 2013, the Ernst A. Guillemin Master's Thesis Award from MIT in 2015, and the Jin Au Kong Doctoral Thesis Award from MIT in 2020.
Information contraction is one of the most fundamental concepts in information theory as evidenced by the numerous classical converse theorems that utilize it. This dissertation studies several problems aimed at better understanding this notion, broadly construed, within the intertwined realms of information theory, statistics, and discrete probability theory.
In information theory, the contraction of f-divergences, such as Kullback-Leibler (KL) divergence, χ²-divergence, and total variation (TV) distance, through channels (or the contraction of mutual f-information along Markov chains) is quantitatively captured by the well-known data processing inequalities. These inequalities can be tightened to produce “strong” data processing inequalities (SDPIs), which are obtained by introducing appropriate channel-dependent or source-channel-dependent “contraction coefficients.”
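For readers who prefer formulas, the relationship between a data processing inequality and its strong version can be sketched as follows. The notation is an assumption chosen here for illustration and is not taken from the thesis: P_X is a fixed source distribution, Q_X is any other input distribution, W = P_{Y|X} is the channel, and \eta_f denotes a source-channel-dependent contraction coefficient.

```latex
\begin{align*}
% Data processing inequality: passing both inputs through W cannot increase D_f.
D_f(Q_X W \,\|\, P_X W) &\le D_f(Q_X \,\|\, P_X), \\
% Contraction coefficient of the source-channel pair (P_X, W): the worst-case ratio.
\eta_f(P_X, W) &= \sup_{Q_X \,:\, 0 < D_f(Q_X \| P_X) < \infty}
  \frac{D_f(Q_X W \,\|\, P_X W)}{D_f(Q_X \,\|\, P_X)}, \\
% Strong data processing inequality: the same bound with the factor \eta_f \in [0, 1].
D_f(Q_X W \,\|\, P_X W) &\le \eta_f(P_X, W) \, D_f(Q_X \,\|\, P_X).
\end{align*}
```

The channel-dependent variant is obtained by additionally taking the supremum over the source P_X, so that the coefficient depends on W alone.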
The thesis first proves various properties of contraction coefficients of source-channel pairs, and derives linear bounds on specific classes of such contraction coefficients in terms of the contraction coefficient for χ²-divergence (or the Hirschfeld-Gebelein-Rényi maximal correlation). Next, the thesis adopts a more statistical and machine learning perspective in elucidating the elegant geometry of SDPIs for χ²-divergence by developing modal decompositions of bivariate distributions based on singular value decompositions of conditional expectation operators.
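As a rough illustration of the singular value viewpoint for finite alphabets, the sketch below computes the maximal correlation of a joint distribution from an SVD and checks the χ² SDPI numerically. It is a minimal sketch and not the thesis's own code. It relies on the standard fact that the matrix with entries P(x, y) / sqrt(P_X(x) P_Y(y)) has largest singular value 1 and second largest singular value equal to the maximal correlation, whose square is the χ² contraction coefficient of the source-channel pair. All function names and the example distribution (a doubly symmetric binary source with crossover probability 0.1) are assumptions made for illustration, and both marginals are assumed strictly positive.

```python
import numpy as np


def normalized_joint_matrix(joint):
    """Return the matrix with entries P(x, y) / sqrt(P_X(x) * P_Y(y)).

    Assumes `joint` is a 2-D joint pmf whose marginals are strictly positive.
    """
    joint = np.asarray(joint, dtype=float)
    p_x = joint.sum(axis=1)  # marginal of X (row sums)
    p_y = joint.sum(axis=0)  # marginal of Y (column sums)
    return joint / np.sqrt(np.outer(p_x, p_y))


def maximal_correlation(joint):
    """Second largest singular value of the normalized joint matrix."""
    singular_values = np.linalg.svd(normalized_joint_matrix(joint), compute_uv=False)
    return float(singular_values[1])  # singular_values[0] equals 1


def chi_squared(q, p):
    """Chi-squared divergence between pmfs q and p (p strictly positive)."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.sum((q - p) ** 2 / p))


# Illustrative example: doubly symmetric binary source, crossover delta.
delta = 0.1
joint = 0.5 * np.array([[1 - delta, delta],
                        [delta, 1 - delta]])

p_x = joint.sum(axis=1)
p_y = joint.sum(axis=0)
channel = joint / p_x[:, None]   # row-stochastic matrix P_{Y|X}

rho = maximal_correlation(joint)
eta_chi2 = rho ** 2              # chi-squared contraction coefficient of (P_X, P_{Y|X})

# Check the chi-squared SDPI for one alternative input distribution Q_X.
q_x = np.array([0.7, 0.3])
q_y = q_x @ channel
lhs = chi_squared(q_y, p_y)
rhs = eta_chi2 * chi_squared(q_x, p_x)
```

For this example the maximal correlation is 1 − 2δ = 0.8, so the contraction coefficient is 0.64. Because the alphabets are binary, the SDPI holds with equality for every Q_X; on larger alphabets `lhs` is generally strictly smaller than `rhs`.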