In fact, the field has been declared dead again and again. I think that there is a moral in that, because each time that it's declared dead, something new comes along.
My recollection is that information theory was declared dead at MIT in the Sixties specifically because we knew how to get to channel capacity, or at least to "practical" channel capacity. You got to R_0 (which we then called R_comp) by using sequential decoding.
In my first practical work at Codex Corporation I worked on a power-limited additive white Gaussian noise channel, the deep space channel. We eventually implemented sequential decoding on that channel, and we basically thought that that was the end of the story. Of course since then there have been a few developments. Little things like the Viterbi algorithm.
In fact, the Viterbi algorithm, with a very long constraint-length convolutional code, and with sophisticated concatenated coding, was what was used to save the Galileo mission recently. So, even on the power-limited white Gaussian noise channel the field was not dead then, and it has continued to evolve over all these years.
We can now recognize that we really knew how to approach capacity then only on power-limited Gaussian channels. On the bandwidth-limited Gaussian channel, there have been tremendous developments in recent years, and I would assert that only in the past few years have we really learned how to approach the capacity of the bandlimited Gaussian channel.
The essential breakthrough from a coding point of view was of course trellis codes. Trellis codes were completely unanti- cipated until Ungerboeck invented them. They led to a whole field of Euclidean-space coding that we had several sessions on at Trondheim, and that is a very active field of research today.
We still do not have good methods of constructing complicat- ed trellis codes, so that is a very good open area for research.
Then there was a little thing called shaping that was not even recognized as a problem until the past couple of years. What is shaping? Ordinarily with trellis codes you use a QAM constellation with a uniform probability distribution over the points. But a uniform distribution loses one and a half dB compared to a Gaussian distribution over the signal points. So, somehow you have to develop a method for get- ting a non- uniform distribution over the points in your signal constellation. These methods collectively are known as shaping. Shaping really has only been worked on in the last five years. It is a wholly new topic, but there is not too much to it. I do not recommend that graduate students go into it, but it just illustrates that there can be new aspects to what was thought to be an old problem.
All of this is still for ideal Gaussian channels. For a real Gaussian channel like the telephone-line channel, you have to deal with intersymbol interference, with equalization, with the fact that the channel does not have an ideal brickwall response. Methods for doing equalization in combination with powerful coding and shaping have been developed extremely recently and are a very actively developing area in communications.
Precoding is really about how do you realize Shannon's prescription of doing water pouring across the channel. First of all, you have to measure the channel, so you have to develop line probing techniques to measure what is going on across the channel. You really need to know the signal- to- noise ratio as a function of frequency. Then, how are you going to combat this? There is a relatively straightforward way that you would think of, for instance, if you read Gallager's text, which is to slice the channel up into narrow little bands and do multi-carrier modulation, with individual modulation in each of the bands. That is not a bad approach for some applications, but on the telephone- line channel, for instance, it simply involves too much delay for the kinds of computer applications that modems are used for.
So, what is a single-carrier way of doing this? Well, various techniques collectively known as precoding (but different from other kinds of precoding that you know about) have been developed.
Many of you may know that there is has been a telephone-line modem standard under development in the past three years, originally called V.fast but now just ratified three weeks ago under the number V.34. During the course of that development, I believe that there were something like four different versions of precoding, each one better that the last, each one building on the last. Rajiv Laroia had a pa- per at Trondheim on the final version that is in the standard. So there have been tremendous developments in the area of doing combined coding, shaping and equalization.
Theoretically, this raises a lot of interesting questions. I think that one of the lessons of my experience in this field has been the interplay between theory and practice, and how practical advances lead to interesting hard questions for information theory. This precoding algorithm that I have mentioned is based on the idea of decision-feedback equalization, except it puts the decision feedback in the transmitter in a certain way. Now, there are various theorems of one degree of rigor or another which indicate that all you need to get to channel capacity is decision- feedback equalization. You do not need to do maximum- likelihood sequence detection to get to channel capacity. I do not believe that anybody has made this proposition rigorous. All of the arguments basically use the assumption that you make in decision feedback that all decisions are correct. Under that ideal assumption, in fact, you can show that you can sometimes do better than capacity. So, to develop a really solid proof that decision-feedback- equalization performance is sufficient to get to capacity, at least on high-SNR channels, would be an example of an in- teresting theoretical problem that is directly suggested by these re- cent practical advances.
I could say a lot more on the same theme about open ques- tions in Euclidean-space coding, about connections with com- plexity, connections with system theory, and maybe I will as the discussion goes on. But, the overall message that I would like to leave with you is this. About every ten years I hear the statement that information theory is dead. At the same time, practical people are off working away trying to come even close to realizing what information theory says is possible. Great advances have been made, and they in turn have thrown back rich new questions for Shannon theory people to work on. So, I just view the field as following its nose in the way that a discipline does, and I see it as being as healthy now as I have ever seen it.
It may be true that in the Seventies more activities were going on. It was then that multiuser information theory was developed and that lead to large activities. Those came to an end when one solved those problems which were not too difficult to solve and then one had to look for something else. Now, it is definitely true that in the last ten years there were many very interesting propositions, many in- teresting new ideas. Perhaps they were not followed through in such detail as they would have been in the Seventies, because there were not enough people working on the field. But, the problems were there, and there is still time to continue and work on them.
Let me just mention a few such problems which are known to me, which are close to me. I do not want to claim in any manner that those are the most important prob- lems. There may be others that are similar or even more impor- tant. There was, for example, the idea of hypothesis testing and statistical estimation using communication over channels of finite capacity. This is the problem of multiterminal detec- tion, multiterminal estimation. Then there was the advance of Shannon theory methods in secrecy problems. The wiretap channel, the idea of generating a secret key using a public channel, and privacy amplification are certainly very in- teresting. Then there was the identification capacity prob- lem, a completely new field which nobody ever thought would come into existence and it lead to very nice developments. And the most recent one, the so-called resolvability prob- lem, which involves the com- plexity of generating randomness, a completely new kind of ap- proximation problem which Sergio Verd' u and Te Sun Han are doing with great success. These are just a few new directions which in my opinion are very important.
Now, I think it is even more important that information theory has very useful applications within Mathematics itself. I have considered in my life a very important task to convince mathematicians that information theory is interesting and it is hardcore mathematics. I think it is demonstrated now beyond any doubts. Let me mention a few examples.
At one time it was declared that ergodic theory was dead. And now ergodic theory is flourishing, thanks to a great extent to information theory. It was Kolmogorov's idea that entropy can be used to study isomor- phism of transformations that led to to the present very big developments in ergodic theory. As another example, take the statistical applications of information theory. The very word information is a basic word for Statistics. Statisticians always want to get information from the data. Still, until recently this was considered a completely dif- ferent kind of in- formation than we had in mind. So the techniques or methods of information theory were not particularly relevant for statisti- cal inquiry. Even though the works of Kullback were there. But that was somehow isolated. But now we have the Maximum Entropy and Minimum Description Length methods which certainly appear very powerful. I think they give a very substantial contribution to statistics. Another field which is important in communications, but even more in computer science, is combinatorics.
You may know that my friend Janos Korner was supposed to sit here, unfor- tunately he could not come. I wish he could talk about the subject himself, he developed a very powerful method of dealing with certain combinatorial problems using quite deep techniques of information theory. Using those techniques he and his colleagues were able to solve famous unsolved prob- lems posed maybe twenty years ago and in combinatorics this is quite a substantial new development.
I just wanted to mention those things to emphasize that as a scientific theory, information theory is very much alive. Of course its future will depend on the young people. If enough talented young people will come to the area then they will continue this work and the field with contiunie to flourish. I am not qualified to talk about the communication theory applications which were the original applications of infor- mation theory. I understand they are still the most impor- tant ones to other people. Dave Forney has explained how important developments have appeared re- cently in that direc- tion. I just wanted to emphasize that also in the theoreti- cal field there were some very important developments and that there is substantial hope for much more. Thank you.
Some people think it is information theory only if it has something to do with applications. It is very important to think about applications, but it is also very important to understand that information- theoretic research is a very im- portant mathematical area.
I will focus my answer on Shannon theory as it applies to the search for fundamental limits in source and channel coding. By their very nature, the answers to those questions are technology-independent, which does not mean that in our work we disregard the technology of the day. On the con- trary, the technological state-of-the-art has always been the main driving force in selecting research problems, and moreover, technology is on our side because it enables us to approach those fundamen- tal limits closer and closer as time progresses.
Some people question whether information theory has anything practically relevant to say in modern communication systems. It would be wrong to assume that technological advances rapidly render any channel model obsolete, and that therefore it is irrelevant to try to push as much information through a channel as possible, or even to find its capacity. Consider, for example, wireless, deep-space and telephone chan- nels. Indeed, channels whose bandwidth and/or signal-to- noise ratio are precious resources are the norm rather than the exception.
Were we not able to come close to reaching capacity or source entropy rates with reasonable complexity, then the practical relevance of Shannon theory would indeed be debatable. However, the practical application of our field has produced stellar success stories such as the Ziv-Lempel algorithm, coding in space communication ("a marriage made in heaven" in the words of Jim Massey), and the telephone chan- nel, where after fifty years, the limits predicted by Shan- non have been almost reached by commercial modems. That lesson is and will be valid in many other examples. Maybe the day will come when a software package will enable the engineer to closely approach the capacity of almost any channel with the technology of the day. Admittedly, I am afraid it is us who will be dead when that day arrives!
It is more constructive to address the question: What can be done to sustain and enhance the vitality of the field?
To this end, consider the fact that contributors to informa- tion theory come from a wide variety of disciplines: Engineers, statisticians, mathematicians and ergodic theorists, physicists, etc. As far as the mathematicians, statisticians and ergodic theorists are concerned, I am very optimistic as they constitute a body of strong researchers for whom information theory is indeed an application. I think we will always get exciting theoretical contributions from them. This is particularly true as new light is shed on interplays between these disciplines and information theory.
This brings us to the engineers who constitute a critical component. Information theory got off to a flying start thanks to a group of engineers who have also been responsible for some of its subsequently stunning success stories. We must ensure that new generations of engineers are motivated into embracing the discipline. And it is here, I believe, that significant reinforcement is needed. Emerging technologies call for a modification of some of our favorite paradigms in information theory. While this development surely does not detract from the significance of existing paradigms, there is clearly a need for new ones which address the fundamental limits of performance of emerging systems which are generally poorly understood. For instance, communication networks, e.g., those offering multi-media services, cannot be adequately addressed based on our usual paradigms involving simple sources and a com- munication link. Indeed, network information theory -- to the extent it has developed -- is inadequate too. Nor does a theory of queue- ing networks suffice. I believe that information theorists can play a very significant role in this field, and we have seen a few recent papers in this context. It is also true that most of the models we have studied thus far are fairly simplistic in the context of a network. There is a real need for tractable and realistic models of information sources as also network situations involving delays, asynchronism, etc. There have been some promising beginnings in this regard, but sustained coherent efforts by more people are need- ed. In summary, the engineering side of information theory needs bolstering.
I would like to say a few remarks about my own career and my own life in order to show my perspective on these things. First of all I am an engineer, secondarily I may be a physicist, and only thirdly a mathematician. So, I see things from a different perspective than those people in the room who are mathemati- cians.
When I was much younger, I looked around rooms such as this, or looked around the world, and I saw that information was everywhere. Information is in every corner of this room. Chemical and biological informa- tion is flowing through our bodies. There are microwaves passing through the room that are carrying information. There are radio signals going through the room carrying information. Besides the information that is embedded into those signals intentionally, there is information about reflections and the propagation path and so on. The air molecules contain information about the environment. There are complex patterns of communications among the panel and the audience. These are multiway communications that we do not understand.
As I looked at such things, I saw something about the future that is evolving into the era of information. So this is the direction I went in my career. I nev- er really looked back, nor regretted it in the least. It has taken me into many exciting problems and many exciting to- pics and many exciting places. I do not think that there is any other career that I could have chosen that touches on algebraic geometry and physics and magnetic recording and integrated cir- cuits and deep space communications and radar design and economics and many other things. Information theory touches many topics and it is a rewarding place in which to spend one's life.
If we look back to the nineteenth century, we now think of it as the age of energy. This was the time when technologists developed steam engines and automata and devices. In the background of that whole culture developed the subject of thermodynamics. I am sure that thermodynamics was developed by a small group of people as the intellectual underpinnings of the age of energy. It was not developed by the world at large. I am sure that that small group of people felt in some ways insulated from the engineers at large. But, nevertheless, thermodynamics is a very important, very successful subject. I see infor- mation theory is to the age of information as thermo- dynamics is the age of energy.
We are creating something that touches on a lot of topics and it is important in that way. I have always been happy working in this field and, and as I say, it has taken me many places. In some ways I see some regret that we have a tendency to identify or isolate very narrowly de- fined problems, very difficult problems and then continue to work on those problems. Maybe idealizing or abstracting those problems to such a degree that those idealizations are no longer useful.
I was thinking about an article that Peter Elias wrote. Peter Elias wrote an edi- torial in the Information Theory Transactions about thirty years ago, a one page editorial. My memory says the title of that article was "Photosynthesis, Information Theory, and Religion." In that article, Peter Elias warned about writ- ing papers that are very abstract papers on very arcane subjects that are not necessarily related to any- thing at all. I think we have a tendency to do that.
As several people have already said here, the development of waveforms for the additive Gaussian noise channel has been an absolutely remarkable story over the last twenty years or so. In Shannon's paper, already he had computed a formula or given a formula for the capacity of a bandlimited white noise Gaussian channel. But at that time the point of view was that this was a very theoretical formula by Shannon and it really did not have any relevance to our communication indus- try. Yet after twenty or twenty-five years, we have now essentially achieved the capacity of the additive white Gaussian noise channel. We know how to do that and modems now come close to achieving the capacity and are available in the marketplace. It is a very impressive success story. But, as we worked on that channel there are other channels of a practical nature that I think we somehow have neglect- ed.
We like to study discrete channels, but there are con- tinuous channels with continuous time that are not well understood. The hard limited passband channel is an elementa- ry model of the magnetic recording channel. The channel in- puts are waveforms that can only take values of plus and minus 1. Those functions are viewed at the output of the channel after being passed through a filter. So the waveforms themselves are not observed, but a filtered ver- sion in additive Gaussian noise. We still do not know the capacity of that simple channel. We do not know how to achieve the capacity of that channel, although it seems we are rapidly working towards achieving its capacity without knowing what its capacity is. Information theory is now making very substantial strides towards very sophisticated waveforms for magnetic recording and optical recording.
Another topic that I have interested myself in lately is algebraic geometry codes, trying to understand these codes from an engineering point of view. The nature of our subject is that the period of gestation between the theoretical work and the time it reaches some useful application is measured more on the order of two or three decades than on the order of months. This leads some outside observers to view what we are doing as not necessarily relevant in the world because such observers fail to take a long enough view to see over a period of decades. I will take a wager with anyone in the room, a wager of $1,000 right now, that the Hermetian codes are going to be widely used in an applica- tion within twenty years from today. I am not a gambling man, but I don't regard this as a gamble. I propose it to indicate my view that these things are developed over the long term. The area of noiseless data compression, or I like to call it data compaction, the Lempel-Ziv algorithm, reached journal stage about fifteen years ago but it took years for people to actually notice it and put it into ap- plication. Now, there are companies whose only product is Lempel-Ziv data compression algorithms. So, to see the future, to take whatever we are doing today, we have to see the outcome of that at some distant time in the future.
Certainly, one of the main discoveries of Shannon was that the amount of information can be measured without reference to the meaning of the information. Therefore it is perhaps not too wrong to call Shannon Theory that field, all those fields, where those information measures play an important role. In this sense coding theorems for sources, for channels and multiuser networks certainly fall into this field, but so do applications of information theory outside communication like MDL or maximum entropy techniques in statis- tics, or applications to large deviation theory in probability theory, or to the theory of Gibbs fields.
Some people might find this definition too broad, but at least that is a possible point of view. Of course I do not think it is a very basic question what kind of terminology we adopt. It is more important what kind of problems we are doing. Now the next part of the question would be what are the problems which we consider important and what failures do we see. Again for me this means something else than probably for the majority. I am not very familiar with the applied field. I was not aware that now we are able to achieve channel capacity by algorithmic methods in some very real situations. But I am happy to learn this fact.
From the purely mathematical point of view, there are many problems, very basic problems, which could not be solved so far. I do not think that is a failure, it is probably due to the very difficult nature of those problems. For example, a fundamental problem of coding theory is to determine the maximum number of codewords with a prescribed minimum distance, at least asymptotically. This very basic problem appears extremely hard to solve, even though, perhaps, the algebraic geometry codes provide a tool to get closer to the solution. There are basic problems about zero error capacity. Nobody has any means of computing the zero error capacity of a channel even in principle, except for special cases, although zero error capacity is a more basic concept than ordinary capacity.
Clearly, one would be more interested in transmitting information without any errors than with a small probability of error. In certain devices it is impossible, but when it is possible to transmit without errors, one would like to know what rates are achievable. Many people have worked on this problem and the answers are still far away. It often helps to give new formulations and design new problems which perhaps look more general and more difficult. Sometimes the more general problem will be less difficult to solve. I think it is important to work even on such questions which may look hopeless to solve completely. Small steps forwards may also be valuable.
For example, I just mention one problem on which I have worked recently, that is the problem of channel capacity for a specified decoding metric. There is no hope or very little hope to solve this problem completely because this would include the solution of the zero error capacity problem. That would be a great achievement, but even if this cannot be done, any partial results may bring us closer to understand the difficulties or the workings of such mathematical problems. Of course there are many, many difficult open problems in multiuser information theory. This morning we have heard a talk in which the solution of one of those problems was announced, and I would be very happy if this solution would turn out to be correct. There are plenty of problems in statistics, in combinatorics which have an information theoretical flavor. An appropriate formulation often leads to good answers. Very often we cannot answer problems because we are not able to formulate them properly.
I would like to emphasize, I do not consider information theory a part of communication theory. It is much broader. Communication is just one field of applications. But there are other fields, statistics, physics, biology, genetics (as we saw today) where information measures can turn out to be useful. So, I think it is important to view the field in this generality.
I sense a "rennaisance of noise" in our field. Some time ago, some of our best minds moved away from physical-channel research, which at some point may have been viewed by some as old- fashioned. I think the pendulum is swinging back under the influence of, above all, wireless communications.
Wireless has brought multiuser information theory to the forefront once again. Lately, we have heard very heated arguments as to whether CDMA has higher capacity than TDMA and so on. Those arguments typically involve a dose of information theory mixed with some quasi-religious fervor. (Most of us in multiuser information theory are partial to CDMA, and not just be- cause it keeps us busier than TDMA or FDMA.) But we should not blame the practitioners for the lack of sophistication of some of their arguments. By and large, the development of mul- tiuser information theory did not capture until recently some of the essential issues that must be faced in real CDMA channels. The Cover-Wyner region was really all that was known for Gaussian multiple access chan- nels and that simple model did not come close to modeling CDMA. A widely-held misconception about the capacity region of more general multiaccess channels existed until Gallager came up with the right answer in 1988. Since then, the capacity of various CDMA and cellular channels has been found.
Networks is, I think, another case where the expectations have not materialized. In the late Seventies- early Eighties, there was a lot of hope for "network information theory". Today there are results that apply to some of the building blocks in wireless networks but there really is no such thing as network information theory. Perhaps there will never be.
I sense a decreasing reliance on combinatorics in information theory. We often admire (or become thoroughly frustrated with) the virtuoso combinatorial proofs for some very deep results within very specific classes of discrete memoryless channels. But, I would go as far as saying that none of the flagship results in information theory are real- ly combinatorial in nature. Perhaps, nowadays, the proba- bilistic school that does not rely on combinatorics is gain- ing ground. And that is salutary: sometimes in order to get to the heart of a result it is better to actually put it in a general framework of general channels (no just discrete memoryless channels) and sometimes that even leads to simpler proofs because you have fewer tools.
There are several questions crying out aloud for answers. For instance, high-speed networks will carry a combination of vast amounts of voice, data and video. This information must be compressed either losslessly or lossily in an effi- cient manner which is in conformity with network protocols. What are good mathematical models for such sources? Are statistics-based compression schemes the best choice? How do we characterize the fundamental limits of compression for such sources?
Another class of problems arises in wireless. Given that we're all expected to ultimately have cellular units glued to our persons, we surely have a personal stake in this area! I am told that the compound channel is too optimistic a model for the wireless link whereas the arbitrarily varying channel is too pessimistic. Is there then a better model with a capacity which can be evaluated? What happens to this channel in a network context? What are good coding and modulation formats for this application? The CDMA/TDMA debate has already been alluded to.
Next, there is a growing recognition of the fact that satellites will play a key role in networks which heretofore were considered to be largely terrestrial. The introduction of satellites raises several interesting issues, one of them being multicasting. Simply put, multicasting involves the transmission of the same information to many receivers with different reception capabilities. From an information theoretic standpoint, the problem is one of joint source- channel coding, with the source coding part being hierarchical in nature. There is also a (hierarchical) source coding problem with constraints being imposed by allowable network link rates. Yet another issue concerns the use of the satellite link as an alternative to terrestrial routes, e.g., when the latter are congested or fail. The inherent ``asymmetry" of the satellite link (owing to propagation delays and reduced speeds) in an otherwise terrestrial network raises a host of interesting problems.
Sergio has pointed out that multi-user information theory has not quite kept up with the development of networks. I do not know whether this is due to the fact that the problems, as formulated by us, are just too difficult, or whether new paradigms are needed. On a slightly different topic, Gallager -- in the late seventies -- published a paper on the information content of protocols. This paper had garnered much attention upon publi- cation, but somehow did not succeed in retaining it. As we are faced with the development of networks which will carry cells in an asynchronous manner, the information content of pro- tocols becomes increasingly relevant.
This broad area is clearly a rich source of hard problems for information theorists. Several studies are at present under way but the entire effort could do with some coherence.
The question that was put to us is what things should infor- ma- tion theory be studying and what things should we not be study- ing. I was thinking of the fact that we have a large number of mathematicians or mathematically trained people in our group which leads to a positive result and also a nega- tive result. Both of which I would like to mention.
First of all, of all the engineering disciplines and even in the physics community, I think that we are the most rigorous and the most precise. When we say something is true, then it is true. We do not confuse proof with an opinion. And I think that is good. On the other hand, I notice that we sometimes become trapped on a problem when the problem is long since solved we continue to work on it and write papers. I mentioned Peter Elias before. I think Peter had a fictitous title of a paper he mentioned now and then whose title was something like "Detection of a Trianglularly Modulated Wave in the Presence of Two Sine Waves and Additive Gaussian Noise". In other words, a very artificial problem of no particular interest, but one that you could write a long series of papers.
As another negative example I might also think of decoding algorithms for a particular error correct- ing code, or a decoding algorithm for which there is only one code with which it can be used. We sometimes see papers like that too. Those are the kinds of papers we should not work on. Unless the reason we are working on those particular papers is that we are trying to open a door to understanding and by working on a small problem we might open a door into a whole wide area and learn new things. Then it is a legitimate problem to work on.
The reason I am mentioning that is because the question put to me by Sergio is what kinds of things I see that we should not be working on. I do not think we should be working on very arcane problems with no particular relevance to anything, just for the sake of having a topic to work on. There are other things that I think that we should work on that we are not, I would like to see some of us moving closer to problems of physics and computation.
In the case of communications we now understand very well the interaction between energy and communications. We have communication waveforms now that can transmit bits with energies measured in sub-micro-microjoules per bit. Meanwhile, in the area of computer science or computer architecture, I should say, or physics of com- putation, people are studying ways of doing a computation, how could one build a computer that does a million or billion word operations per second. How can we build one that minimizes the amount of energy expended?
There are conferences on this very subject right now. Conferences with names like the Physical Limits of Computation and people claiming that by ar- ranging gates in such and such a way or by doing this or that they can minimize the amount of energy a computer uses. Because after all in the area of computation now, the biggest issue in reducing the size or increasing the performance is reducing the amount of energy expended. Most computers are so small that they melt themselves or they are on the verge of melting themselves and any reduction in size or increase in the speed of the computer fails because the computer melts before it can do anything useful. So there is this question about how can we think about computation in a way that reduces the amount of energy expended and we in information theory, more than anyone else know how to organize communication waveform, and communication systems, so that we reduce the amount of energy expended. I think that we should have something more to say about the problem of computing in noise with minimum energy per gate operation, but we never interact with those people and I think it is a legitimate area.
Second, efforts will continue on some of the classical open problems in the field. For example, the capacity of the interference channel, the simplex conjecture, the capacity of the broadcast channel, etc. Despite intense efforts dur- ing the last forty years, the reliability function is still an open problem for even the binary symmetric channel. The problem of zero-error capacity is another longstanding problem which may or may not ever yield to solution.
And third, I foresee a lot of work on "postmodern" (postmo- dem?) information theory: Using the language of the classical theory to deal with problems outside the realm of conventional communication models. One example is information theory in the stock market, championed by Tom Cover. Will Wall Street embrace it now that they are getting serious about math? But, mainly, I expect a lot of action in the interface between in- formation theory and theoretical computer science. This includes problems on the complexity of interactive communication, the complexity of random process simulation, and various tradeoffs of randomness and complex- ity that we see in an increasing number of papers in theoretical computer science conferences such as STOC and FOCS.
Finally, I will conclude by saying that more than by its mathematical tools or its applications, Information Theory can be characterized by a way of thinking that combines in- tellec- tual rigor and engineering insights. That has been the tradi- tion in our field and it is our duty to ensure that that tradi- tion continues.
In addition to what has already been said, information theory will play an increasingly important role in the study of complexity versus performance. Complexity can be divided into (at least) model complexity and computational complexi- ty. The first is the complexity of describing the model for a paradigm and the second is the complexity of the algorithm or scheme that performs the specified task. A well-known ex- ample: Consider model-based universal data compression. It can be reasonably hoped that a more complex model will provide a better description of the data to be compressed. On the other hand, a more complex model will typically involve more complicated calculations and procedures. Perhaps the resulting compression will be better, perhaps not. In gen- eral, there is a tradeoff between modeling and computational complexities on the one hand and performance on the other -- a favorite sermon of my col- league, the one and only J.S. Baras. In this context, I recom- mend a rather recent book by Li and Vitanyi as enjoyable reading.
Also, the interplay between information theory and statis- tics will garner new attention. This hope springs from a recent series of exciting interactions between researchers from both communities covering a wide range of topics including data compression, complexity, Markov fields, large deviations, non- parametrics and sampling and wavelets, etc.