Early results

Let us start with Wald's inequality relating the expected sample size of a sequential test to the type 1 and type 2 error probabilities.

Assuming i.i.d. sampling from a distribution known to be either P or Q, a sequential test accepts one of these hypotheses on the basis of a sample $X^N = (X_1, \ldots, X_N)$ of random length N. Here N is a stopping time, i.e., knowledge of $X^n = (X_1, \ldots, X_n)$ determines whether or not N=n. Wald (1947) proved that

  $$E_P(N)\, D(P\|Q) \;\ge\; \alpha \log\frac{\alpha}{1-\beta} + (1-\alpha)\log\frac{1-\alpha}{\beta} \qquad (3.1)$$

where $\alpha$ is the probability under P of accepting Q and $\beta$ is the probability under Q of accepting P. Moreover, Wald showed that his sequential probability ratio test nearly attains equality in (3.1).
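
As a rough numerical illustration, (3.1) can be read as a lower bound on the expected sample size under P once the two error probabilities are fixed. The sketch below assumes natural logarithms; the function names `wald_rhs` and `min_expected_sample_size` and the Gaussian example values are illustrative choices, not part of the original text.

```python
import math


def wald_rhs(alpha, beta):
    """Right hand side of (3.1) in nats.

    alpha: probability under P of accepting Q (type 1 error).
    beta:  probability under Q of accepting P (type 2 error).
    """
    if not (0.0 < alpha < 1.0 and 0.0 < beta < 1.0):
        raise ValueError("alpha and beta must lie strictly between 0 and 1")
    return (alpha * math.log(alpha / (1.0 - beta))
            + (1.0 - alpha) * math.log((1.0 - alpha) / beta))


def min_expected_sample_size(alpha, beta, d_pq):
    """Lower bound on E_P(N) implied by (3.1), given D(P||Q) = d_pq > 0."""
    if d_pq <= 0.0:
        raise ValueError("d_pq must be positive")
    return wald_rhs(alpha, beta) / d_pq


# Assumed example: P = N(0, 1), Q = N(1, 1), for which D(P||Q) = 1/2 nat.
bound = min_expected_sample_size(alpha=0.05, beta=0.05, d_pq=0.5)
print(f"E_P(N) is at least {bound:.3f}")
```

The bound only depends on the pair of distributions through D(P||Q), so any other pair can be plugged in by supplying its divergence.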

The IT interpretation makes this result easy to understand. Denoting by $P_N$ and $Q_N$ the distributions of $X^N$ under P and Q, the left hand side of (3.1) equals $D(P_N\|Q_N)$ (this can be checked using Wald's identity), whereas the right hand side is the I-divergence of the $\mathcal{A}$-quantizations of $P_N$ and $Q_N$ for $\mathcal{A} = (A_P, A_Q)$, where $A_P$ and $A_Q$ are the acceptance regions of P and Q. Since quantization cannot increase I-divergence, (3.1) follows. Were the likelihood ratio constant on both $A_P$ and $A_Q$, equality would hold in (3.1). While in general no test can achieve this exactly, the sequential probability ratio test comes close.
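
The equality condition can be checked on a special case in which the likelihood ratio really is constant on each acceptance region. The sketch below assumes P = Bernoulli(p) and Q = Bernoulli(1-p), so the log-likelihood ratio is a simple random walk with steps of equal size, and a test that stops when the walk first reaches +k (accept P) or -k (accept Q) has no overshoot. The function name `sprt_exact` and the values p = 0.6, k = 3 are illustrative assumptions.

```python
import math


def sprt_exact(p, k, tol=1e-14):
    """Error probability and expected sample size under P = Bernoulli(p).

    The walk starts at 0 and moves +1 with probability p, -1 otherwise.
    Returns (probability of hitting -k first, expected stopping time),
    computed by propagating the not-yet-stopped mass until it is below tol.
    """
    mass = {0: 1.0}      # position -> probability of being there, unstopped
    accept_q = 0.0       # accumulated probability of wrongly accepting Q
    expected_n = 0.0     # accumulated n * P(N = n)
    n = 0
    while sum(mass.values()) > tol:
        n += 1
        nxt = {}
        for s, m in mass.items():
            for step, prob in ((1, p), (-1, 1.0 - p)):
                t = s + step
                if t == k:                 # boundary: accept P
                    expected_n += n * m * prob
                elif t == -k:              # boundary: accept Q
                    accept_q += m * prob
                    expected_n += n * m * prob
                else:                      # still sampling
                    nxt[t] = nxt.get(t, 0.0) + m * prob
        mass = nxt
    return accept_q, expected_n


p, k = 0.6, 3
alpha, mean_n = sprt_exact(p, k)
beta = alpha  # holds for this symmetric pair by reflecting the walk

# D(P||Q) for Bernoulli(p) versus Bernoulli(1-p), in nats.
d_pq = (2.0 * p - 1.0) * math.log(p / (1.0 - p))

lhs = mean_n * d_pq
rhs = (alpha * math.log(alpha / (1.0 - beta))
       + (1.0 - alpha) * math.log((1.0 - alpha) / beta))
print(f"lhs = {lhs:.10f}, rhs = {rhs:.10f}")
```

In this lattice case the two sides of (3.1) agree up to numerical error; with distributions whose log-likelihood ratio can overshoot the thresholds, a gap would be expected.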

Another early result in statistical IT is the celebrated "Stein's lemma" (Chernoff 1952; Stein apparently disowns it). It provides an operational meaning to I-divergence: For testing a simple hypothesis P against a simple alternative Q, the best test of sample size n and type 1 error probability $\le \varepsilon$ (for any $0 < \varepsilon < 1$) has type 2 error probability $\exp\{-nD(P\|Q) + o(n)\}$. Notice that if the type 1 error were required to go to zero, rather than just be $\le \varepsilon$, the special case $N = n$ of Wald's inequality (3.1) would already imply that the type 2 error probability exponent cannot exceed $D(P\|Q)$.
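
A small numerical experiment can make the exponent in Stein's lemma visible. The sketch below assumes P = Bernoulli(0.5), Q = Bernoulli(0.7) and a type 1 error bound of 0.1, and uses the deterministic likelihood ratio test that accepts P when the number of ones is at most a threshold t. Ignoring randomization makes the test slightly suboptimal, which should not affect the exponent. The helper names are hypothetical.

```python
import math


def log_binom_pmf(n, k, p):
    """Log of the Binomial(n, p) probability of k, via lgamma to avoid overflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))


def type2_exponent(n, p, q, eps):
    """Return -(1/n) log(type 2 error) for the threshold test; assumes p < q.

    The threshold t is the smallest one with P(K > t) <= eps, where K is the
    number of ones; the type 2 error is then Q(K <= t).
    """
    tail = 1.0   # P(K > t) under P, starting from t = -1
    t = -1
    while tail > eps and t < n:
        t += 1
        tail -= math.exp(log_binom_pmf(n, t, p))
    # Log-sum-exp of Q(K = 0), ..., Q(K = t) to stay in the log domain.
    logs = [log_binom_pmf(n, j, q) for j in range(t + 1)]
    top = max(logs)
    log_beta = top + math.log(sum(math.exp(v - top) for v in logs))
    return -log_beta / n


p, q, eps = 0.5, 0.7, 0.1
d_pq = p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
for n in (200, 2000):
    print(n, type2_exponent(n, p, q, eps), d_pq)
```

For these assumed values the computed exponent stays below D(P||Q) and moves toward it as n grows, consistent with the o(n) correction term in the lemma.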



Ramesh Rao
Mon Apr 6 16:41:42 PDT 1998