Cornell University Library
eCommons

Information Theory With Large Alphabets And Source Coding Error Exponents

File(s)
bgk6.pdf (1.14 MB)
Permanent Link(s)
https://hdl.handle.net/1813/30614
Collections
Cornell Theses and Dissertations
Author
Kelly, Benjamin
Abstract

The early twenty-first century has been referred to as the 'information age'; this appears to be an apt name given the massive amounts of data that are created daily. However, the utility of the data is limited by the tools we possess to extract and manipulate the information contained within. To this end, in this thesis we examine some problems concerning classification and communication. The first problem examined is that of classification in a 'large-alphabet' regime. In this large-alphabet regime, which is motivated by natural language, standard statistical approaches to classification based on chi-squared tests or maximum likelihood are inconsistent. We derive the limit (in terms of alphabet growth rate) beyond which consistent classification is impossible and propose a new consistent test that achieves this limit. We also propose a new classifier which has good empirical performance. The second problem addressed concerns compression of sources with large alphabets. We first characterize the alphabet growth rates for which universal compression is possible. We then study the permitted alphabet growth rate in the non-universal case, in which the goal is to compress a source generated by a known sequence of distributions. We finally examine error exponents for source coding/compression problems. The error exponent characterizes the optimal exponential decay of the error probability. For the Wyner-Ziv problem and for source coding with side information, we provide new upper and lower bounds on the error exponent; these bounds match in some special cases. We also make connections between source coding error exponents and graph theory, and provide new upper bounds on Witsenhausen's rate and complementary graph entropy, two useful quantities from graph theory.
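The 'standard statistical approach' the abstract refers to can be illustrated with a minimal sketch: classify a test sample by computing a Pearson chi-squared statistic against the empirical distribution of each training source and picking the closer one. This is an illustrative reconstruction, not code from the thesis; the function names and toy data are invented for the example, and it is exactly the kind of test the thesis shows to be inconsistent when the alphabet grows with the sample size.

```python
from collections import Counter

def chi_squared_stat(sample, reference):
    """Pearson chi-squared statistic of `sample` against the empirical
    distribution of `reference` (both are lists of symbols).
    Symbols absent from `reference` are ignored in this simple sketch."""
    n = len(sample)
    ref_counts = Counter(reference)
    ref_total = sum(ref_counts.values())
    obs_counts = Counter(sample)
    stat = 0.0
    for symbol, count in ref_counts.items():
        expected = n * count / ref_total      # expected count under reference
        observed = obs_counts.get(symbol, 0)
        stat += (observed - expected) ** 2 / expected
    return stat

def classify(test, train_a, train_b):
    """Assign `test` to whichever training source yields the smaller
    chi-squared statistic ('A' or 'B'); hypothetical helper for illustration."""
    if chi_squared_stat(test, train_a) <= chi_squared_stat(test, train_b):
        return 'A'
    return 'B'

# Toy usage: the test sample is dominated by 'a', like source A.
print(classify(['a', 'a', 'b'],
               ['a', 'a', 'a', 'b'],      # source A favors 'a'
               ['b', 'b', 'b', 'a']))     # source B favors 'b'
```

With a fixed alphabet this rule is consistent; the thesis's point is that once the alphabet size grows quickly enough relative to the sample size, tests of this form (and maximum-likelihood variants) fail, which motivates the new test proposed there.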

Date Issued
2011-08-31
Keywords
Information Theory • Hypothesis Testing • Classification • Error Exponents
Committee Chair
Wagner, Aaron B.
Committee Member
Tong, Lang
Nussbaum, Michael
Degree Discipline
Electrical Engineering
Degree Name
Ph.D., Electrical Engineering
Degree Level
Doctor of Philosophy
Type
dissertation or thesis
