In section 5 we have seen that we can use arithmetic
coding if, after
is processed and
is known, it is easy to compute
and
.
Does this hold for the CTW algorithm?
Fortunately it does!
See the example below.
Example:
Figure 7: Updated path of the weighted context tree for 0100110
followed by a 0 with past
.
Suppose that the source has already generated the sequence 0100110
while the past symbols were
. This resulted in the
weighted context tree in figure 6. In the root of
the context tree we found the coding probability
.
For processing the next source symbol we must compute the coding
probability
.
This is done by (i) incrementing
, (ii) updating
, and (iii) updating
, for all contexts
, i.e. along the path in the context
tree determined by the symbols preceding the next source symbol.
The results of these computations are shown in figure
7.
Doing so we find
in the root of the context tree.
In a similar way
can be determined.
We conclude by stating that the number of operations necessary for
processing all T source symbols is linear in T.
Moreover, since we only need to store nodes in the context tree
that actually did occur, and since for each symbol we can visit not more
than D+1 nodes, the amount of memory needed to compress
grows
not faster than linear in T.