1995 Plenary Lecture
Note: This material was originally presented at the 1995 Information Theory Symposium and has since been published in a special issue of the Russian journal Problems of Information Transmission dedicated to Mark Pinsker on his 70th birthday. Many thanks to Jacob Ziv and to Plenum Publishing Corp. for granting us permission to re-print this article.
Inequalities and Algorithms for Universal Data Compression
Jacob Ziv

Department of Electrical Engineering
Technion, Haifa, Israel
We describe non-asymptotic uniform bounds on the performance of data-compression algorithms in
cases where the length N of the training sequence (``history") that is available to the encoder is
not large enough so as to yield the ultimate compression, namely the entropy of the source. Two
characteristic ultimate goals are considered: The
-th order entropy
, and the
associated conditional entropy
. The bounds are based on
classical information-theoretic convexity arguments. Nevertheless, it is demonstrated that convexity
arguments that work for some cases are totally useless for others. Furthermore, these classical
convexity arguments, when properly used, lead to efficient universal data compression algorithms for
each of the considered cases.
We limit the discussion to Fixed-to-Variable, uniquely decipherable codes.
We consider Fixed-to-Variable universal encoders with a limited training data
that noiselessly compress
blocks of some fixed length
and derive bounds on the rate of approach of
the compression to the
th order entropy, and to the smaller conditional
entropy, as a function of
and of the length N of the training sequence.
Let
For stationary sources, the
-th order per letter entropy is:
and
Now, let
be the length function of
(namely, the
number of bits that represent
). Then [1]:
Assume that only the
-th order statistics of the information source are known to the encoder.
Then, equipped with this information, we can establish the following Coding Theorem.