q2 -- December94

Verd'u

Let us talk about the current trends in Shannon theory. What kind of problems should we put more (and less) emphasis on? [The tape recorder missed a portion of the discussion, includ- ing Dave Forney's answer to this question].

Csisz'ar

I have heard saying that information theory is the study of the convexity properties of the logarithmic function. I do not subscribe to this view, however. To be serious, first I would like to specify what I mean by Shannon Theory. Of course, this is not a good name because Shannon initiated the whole informa- tion theory. Still, somehow it became tradition, at the IEEE Transactions anyhow, to use the term Shannon Theory to denote one particular area of information theory. In my opinion, this area is basically that field where the information measures play a central role.

Certainly, one of the main discoveries of Shannon was that the amount of information can be measured without reference to the meaning of the information. Therefore it is perhaps not too wrong to call Shannon Theory that field, all those fields, where those information measures play an important role. In this sense coding theorems for sources, for channels and multiuser networks certainly fall into this field, but so do applications of information theory outside communication like MDL or maximum entropy techniques in statis- tics, or applications to large deviation theory in probability theory, or to the theory of Gibbs fields.

Some people might find this definition too broad, but at least that is a possible point of view. Of course I do not think it is a very basic question what kind of terminology we adopt. It is more important what kind of problems we are doing. Now the next part of the question would be what are the problems which we consider important and what failures do we see. Again for me this means something else than probably for the majority. I am not very familiar with the applied field. I was not aware that now we are able to achieve channel capacity by algorithmic methods in some very real situations. But I am happy to learn this fact.

From the purely mathematical point of view, there are many problems, very basic problems, which could not be solved so far. I do not think that is a failure, it is probably due to the very difficult nature of those problems. For example, a fundamental problem of coding theory is to determine the maximum number of codewords with a prescribed minimum distance, at least asymptotically. This very basic problem appears extremely hard to solve, even though, perhaps, the algebraic geometry codes provide a tool to get closer to the solution. There are basic problems about zero error capacity. Nobody has any means of computing the zero error capacity of a channel even in principle, except for special cases, although zero error capacity is a more basic concept than ordinary capacity.

Clearly, one would be more interested in transmitting information without any errors than with a small probability of error. In certain devices it is impossible, but when it is possible to transmit without errors, one would like to know what rates are achievable. Many people have worked on this problem and the answers are still far away. It often helps to give new formulations and design new problems which perhaps look more general and more difficult. Sometimes the more general problem will be less difficult to solve. I think it is important to work even on such questions which may look hopeless to solve completely. Small steps forwards may also be valuable.

For example, I just mention one problem on which I have worked recently, that is the problem of channel capacity for a specified decoding metric. There is no hope or very little hope to solve this problem completely because this would include the solution of the zero error capacity problem. That would be a great achievement, but even if this cannot be done, any partial results may bring us closer to understand the difficulties or the workings of such mathematical problems. Of course there are many, many difficult open problems in multiuser information theory. This morning we have heard a talk in which the solution of one of those problems was announced, and I would be very happy if this solution would turn out to be correct. There are plenty of problems in statistics, in combinatorics which have an information theoretical flavor. An appropriate formulation often leads to good answers. Very often we cannot answer problems because we are not able to formulate them properly.

I would like to emphasize, I do not consider information theory a part of communication theory. It is much broader. Communication is just one field of applications. But there are other fields, statistics, physics, biology, genetics (as we saw today) where information measures can turn out to be useful. So, I think it is important to view the field in this generality.

Pinsker

I would like to mention two current problems: First, arbitrarily varying channels. It is now known that for the AVC the capacity can be attained by list decoding with fixed size. Related problems are to find the minimal size of such lists and to construct list codes. For example, to find some kind of Viterbi decoding. Also the investigation of the probability of error is of importance here. A second problem is to find algo- rithms similar to Lempel-Ziv's in order to encode images.

Verd'u

I agree with Imre that communication should not be the only driving force behind information theory. In a sense I view in- formation theory as physics not only because we always carry units around (our logarithms) but because information theory is driven by the real world. Let it be communication, statistics, genetics or the stock market. I like to work on models that are practically relevant, but yet simple enough to admit solutions from which useful lessons can be learned. Others like to work on problems for their own sake--whether they have any relevance to practice or not, and that is a respectable approach.

I sense a "rennaisance of noise" in our field. Some time ago, some of our best minds moved away from physical-channel research, which at some point may have been viewed by some as old- fashioned. I think the pendulum is swinging back under the influence of, above all, wireless communications.

Wireless has brought multiuser information theory to the forefront once again. Lately, we have heard very heated arguments as to whether CDMA has higher capacity than TDMA and so on. Those arguments typically involve a dose of information theory mixed with some quasi-religious fervor. (Most of us in multiuser information theory are partial to CDMA, and not just be- cause it keeps us busier than TDMA or FDMA.) But we should not blame the practitioners for the lack of sophistication of some of their arguments. By and large, the development of mul- tiuser information theory did not capture until recently some of the essential issues that must be faced in real CDMA channels. The Cover-Wyner region was really all that was known for Gaussian multiple access chan- nels and that simple model did not come close to modeling CDMA. A widely-held misconception about the capacity region of more general multiaccess channels existed until Gallager came up with the right answer in 1988. Since then, the capacity of various CDMA and cellular channels has been found.

Networks is, I think, another case where the expectations have not materialized. In the late Seventies- early Eighties, there was a lot of hope for "network information theory". Today there are results that apply to some of the building blocks in wireless networks but there really is no such thing as network information theory. Perhaps there will never be.

I sense a decreasing reliance on combinatorics in information theory. We often admire (or become thoroughly frustrated with) the virtuoso combinatorial proofs for some very deep results within very specific classes of discrete memoryless channels. But, I would go as far as saying that none of the flagship results in information theory are real- ly combinatorial in nature. Perhaps, nowadays, the proba- bilistic school that does not rely on combinatorics is gain- ing ground. And that is salutary: sometimes in order to get to the heart of a result it is better to actually put it in a general framework of general channels (no just discrete memoryless channels) and sometimes that even leads to simpler proofs because you have fewer tools.

Narayan

The increasing orientation of the traditional communication en- gineer towards communication networks sends an unmistak- able signal. At the risk of harping on the same tune, I be- lieve that information theorists must strive to make significant contributions to this rapidly emerging field.

There are several questions crying out aloud for answers. For instance, high-speed networks will carry a combination of vast amounts of voice, data and video. This information must be compressed either losslessly or lossily in an effi- cient manner which is in conformity with network protocols. What are good mathematical models for such sources? Are statistics-based compression schemes the best choice? How do we characterize the fundamental limits of compression for such sources?

Another class of problems arises in wireless. Given that we're all expected to ultimately have cellular units glued to our persons, we surely have a personal stake in this area! I am told that the compound channel is too optimistic a model for the wireless link whereas the arbitrarily varying channel is too pessimistic. Is there then a better model with a capacity which can be evaluated? What happens to this channel in a network context? What are good coding and modulation formats for this application? The CDMA/TDMA debate has already been alluded to.

Next, there is a growing recognition of the fact that satellites will play a key role in networks which heretofore were considered to be largely terrestrial. The introduction of satellites raises several interesting issues, one of them being multicasting. Simply put, multicasting involves the transmission of the same information to many receivers with different reception capabilities. From an information theoretic standpoint, the problem is one of joint source- channel coding, with the source coding part being hierarchical in nature. There is also a (hierarchical) source coding problem with constraints being imposed by allowable network link rates. Yet another issue concerns the use of the satellite link as an alternative to terrestrial routes, e.g., when the latter are congested or fail. The inherent ``asymmetry" of the satellite link (owing to propagation delays and reduced speeds) in an otherwise terrestrial network raises a host of interesting problems.

Sergio has pointed out that multi-user information theory has not quite kept up with the development of networks. I do not know whether this is due to the fact that the problems, as formulated by us, are just too difficult, or whether new paradigms are needed. On a slightly different topic, Gallager -- in the late seventies -- published a paper on the information content of protocols. This paper had garnered much attention upon publi- cation, but somehow did not succeed in retaining it. As we are faced with the development of networks which will carry cells in an asynchronous manner, the information content of pro- tocols becomes increasingly relevant.

This broad area is clearly a rich source of hard problems for information theorists. Several studies are at present under way but the entire effort could do with some coherence.

Blahut

First of all, I have to say a few words to my good friend, Dave Forney. This is a panel discussion so I suppose it is ok to have a debate and Dave did make the statement that convolutional codes are much more widely used than block codes. And I am usually not interested in taking one side or the other on this debate which is often heard and often silly. It is true, I will agree with Dave, that in applications to the additive white gaussian noise channel, convolutional codes are much more widely used than block codes. But if you look at all the channels that are out there in the world, then I think, by far it is true that block codes are much more widely used than convo- lutional codes. I was especially thinking of the compact disc player of which there are more of, by far, than there are modems. I am thinking of magnetic tape in disc drives and optical disc drives, all of which are using block codes. The closed caption annotation on American TV systems uses a block code to annotate TV programs for the hearing impaired. I know of applications to bar codes that use block codes. So if it is just a sheer matter of counting how many encoders and decoders there are, block codes are way out in front. One might also make the point that the additive white gaussian noise channel occurs much less frequently than other kinds of channels. Nonadditive nongaussian noise channels are everywhere, including storage, optimal communications, jamming, and multipath channels. Those are more difficult to understand and to model.

The question that was put to us is what things should infor- ma- tion theory be studying and what things should we not be study- ing. I was thinking of the fact that we have a large number of mathematicians or mathematically trained people in our group which leads to a positive result and also a nega- tive result. Both of which I would like to mention.

First of all, of all the engineering disciplines and even in the physics community, I think that we are the most rigorous and the most precise. When we say something is true, then it is true. We do not confuse proof with an opinion. And I think that is good. On the other hand, I notice that we sometimes become trapped on a problem when the problem is long since solved we continue to work on it and write papers. I mentioned Peter Elias before. I think Peter had a fictitous title of a paper he mentioned now and then whose title was something like "Detection of a Trianglularly Modulated Wave in the Presence of Two Sine Waves and Additive Gaussian Noise". In other words, a very artificial problem of no particular interest, but one that you could write a long series of papers.

As another negative example I might also think of decoding algorithms for a particular error correct- ing code, or a decoding algorithm for which there is only one code with which it can be used. We sometimes see papers like that too. Those are the kinds of papers we should not work on. Unless the reason we are working on those particular papers is that we are trying to open a door to understanding and by working on a small problem we might open a door into a whole wide area and learn new things. Then it is a legitimate problem to work on.

The reason I am mentioning that is because the question put to me by Sergio is what kinds of things I see that we should not be working on. I do not think we should be working on very arcane problems with no particular relevance to anything, just for the sake of having a topic to work on. There are other things that I think that we should work on that we are not, I would like to see some of us moving closer to problems of physics and computation.

In the case of communications we now understand very well the interaction between energy and communications. We have communication waveforms now that can transmit bits with energies measured in sub-micro-microjoules per bit. Meanwhile, in the area of computer science or computer architecture, I should say, or physics of com- putation, people are studying ways of doing a computation, how could one build a computer that does a million or billion word operations per second. How can we build one that minimizes the amount of energy expended?

There are conferences on this very subject right now. Conferences with names like the Physical Limits of Computation and people claiming that by ar- ranging gates in such and such a way or by doing this or that they can minimize the amount of energy a computer uses. Because after all in the area of computation now, the biggest issue in reducing the size or increasing the performance is reducing the amount of energy expended. Most computers are so small that they melt themselves or they are on the verge of melting themselves and any reduction in size or increase in the speed of the computer fails because the computer melts before it can do anything useful. So there is this question about how can we think about computation in a way that reduces the amount of energy expended and we in information theory, more than anyone else know how to organize communication waveform, and communication systems, so that we reduce the amount of energy expended. I think that we should have something more to say about the problem of computing in noise with minimum energy per gate operation, but we never interact with those people and I think it is a legitimate area.