next up previous
Next: One unknown parameter Up: No Title Previous: Redundancy

Arithmetic coding

  Assume a lexicographical ordering over the sequences. Now let tex2html_wrap_inline2767 be the cumulative probability of tex2html_wrap_inline2693 . Then associate to a source sequence tex2html_wrap_inline2693 the source interval tex2html_wrap_inline2773 . Note that all source intervals tex2html_wrap_inline2775 are disjoint, and that their union is [0,1).

Also to a codeword there corresponds an interval. To codeword c having length L there corresponds an interval tex2html_wrap_inline2783 , where .c is the number obtained by considering c as a binary fraction. Note that a codeword can only be the prefix of another codeword if its code interval contains the code interval of the other codeword.

The idea of arithmetic coding is to choose for a given source sequence tex2html_wrap_inline2693 the codeword tex2html_wrap_inline2703 with a code interval tex2html_wrap_inline2793 inside tex2html_wrap_inline2775 . This is obtained by taking

eqnarray146

where tex2html_wrap_inline2797 is the smallest integer tex2html_wrap_inline2799 . This follows from

eqnarray153

in which we simplified the notation a little bit.

Example:Let T=2 and consider again a memoryless source with tex2html_wrap_inline2753 . The source intervals, codewords, and redundancies are put in the table.

tabular164

Since tex2html_wrap_inline2817 we may conclude that arithmetic coding achieves codeword lengths within 2 bit from the ideal codeword length, or tex2html_wrap_inline2819 for all tex2html_wrap_inline2693 . This implies also that tex2html_wrap_inline2823 or tex2html_wrap_inline2825 .

For the cumulative probability we can write (Elias):

equation174

This makes it easy to compute this probability (and tex2html_wrap_inline2827 ) if, after knowing tex2html_wrap_inline2829 and tex2html_wrap_inline2831 we can easily compute tex2html_wrap_inline2833 and tex2html_wrap_inline2835 .

Example:For T=3 and a memoryless source having tex2html_wrap_inline2753 we get e.g. tex2html_wrap_inline2841 and tex2html_wrap_inline2843 .

If we use, instead of the actual distribution tex2html_wrap_inline2827 , an arbitrary coding distribution tex2html_wrap_inline2847 , we obtain codeword lengths for which

  equation206

We say that the coding redundancy is smaller than 2 bits. Note that it is necessary that tex2html_wrap_inline2851 for tex2html_wrap_inline2693 such that tex2html_wrap_inline2855 .

Note that arithmetic coding achieves codeword lengths that are very close to what we want (i.e. tex2html_wrap_inline2857 ) and that no tables are needed to store the code (the codeword is computed from the source sequence and vice versa). Issues regarding implementation of this technique are considered e.g. by Rissanen[4] and Pasco[3]. We now have to concentrate on constructing good coding distributions.


next up previous
Next: One unknown parameter Up: No Title Previous: Redundancy

Ramesh Rao
Fri May 2 08:36:00 PDT 1997