next up previous
Next: Sliding Window Universal Data Up: No Title Previous: No Title

Coding Theorem 1:

By applying the Huffman coding algorithm [1], we have

  eqnarray441

Let

  eqnarray444

Then

  equation449

Here again,

  equation452

Thus, by conditioning on the past K letters one may achieve a better compression, if a different Huffman code-book is used for each tex2html_wrap_inline657 , namely

  eqnarray455

In many cases, even the tex2html_wrap_inline627 -th order statistics of the source are not available. This situation calls for universal data-compression algorithms, that utilize a ``training-sequence" of limited length which was either emitted by the same source, or is a curtailed, partial description of the statistics of the source.

The first introductory case to be considered is where a sequence of n-letters is to be compressed.

Let

  equation458

and let

  equation461

(Empirical prob. measure).

Now, by the convexity of the entropy function

  equation466

Now [2, 3], for small enough tex2html_wrap_inline627 (as compared with n) one can empirically evaluate tex2html_wrap_inline667 for each tex2html_wrap_inline627 vector and then apply the appropriate Huffman algorithm for consecutive tex2html_wrap_inline627 -blocks. (The description of tex2html_wrap_inline673 takes about tex2html_wrap_inline675 bits). The length function may therefore be upper-bounded by:

  equation469

where:

a)
It is assumed that tex2html_wrap_inline627 divides n and tex2html_wrap_inline681 .
b)
tex2html_wrap_inline683
Thus:

  equation472

However, in this introductory case the compression tex2html_wrap_inline629 is achieved at the cost of a coding delay of n letters. We now introduce a more fruitful approach:



Ramesh Rao
Thu Sep 19 17:06:16 PDT 1996