Information and Entropy in Biological Systems Workshop

Ever since Shannon initiated research on information theory in 1948, there have been hopes that the concept of information could serve as a tool to help systematize and unify work in biology. The link between information and entropy was noted very early on, and it suggested that a full thermodynamic understanding of biology would naturally involve the information processing and storage that are characteristic of living organisms. However, the subject is full of conceptual pitfalls for the unwary, and progress has been slower than initially expected. Premature attempts at ‘grand syntheses’ have often misfired. But applications of information theory and entropy to specific, highly focused topics in biology have been increasingly successful, such as:
• the maximum entropy principle in ecology,
• Shannon and Rényi entropies as measures of biodiversity,
• information theory in evolutionary game theory,
• information and the thermodynamics of individual cells.
Because they work in diverse fields, researchers in these specific topics have had little opportunity to trade insights and take stock of the progress so far. The aim of the workshop is to do just this.

In what follows, participants’ names are in boldface, while the main goals of the workshop are in italics.

**Roderick Dewar** is a key advocate of the principle of Maximum Entropy Production, which says that biological systems—and indeed all open, non-equilibrium systems—act to produce entropy at the maximum rate. Along with others, he has applied this principle to make testable predictions in a wide range of biological systems, from ATP synthesis [DJZ2006] to respiration and photosynthesis of individual plants [D2010] and plant communities. He has also sought to derive this principle from ideas in statistical mechanics [D2004, D2009], but it remains controversial.

*The first goal of this workshop is to study the validity of this principle.*

While they may be related, the principle of Maximum Entropy Production should not be confused with the MaxEnt inference procedure, which says that we should choose the probabilistic hypothesis with the highest entropy subject to the constraints provided by our data. MaxEnt was first explicitly advocated by Jaynes. He noted that it is already implicit in the procedures of statistical mechanics, but convincingly argued that it can also be applied to situations where entropy is more ‘informational’ than ‘thermodynamic’ in character.
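To make the MaxEnt procedure concrete, here is a minimal sketch (not drawn from any of the cited work): among all distributions over a finite set of states consistent with one measured constraint, the mean of an observable, we find the entropy-maximizing one. It necessarily takes the Gibbs form p_i \propto e^{-\lambda f_i}, so the whole problem reduces to solving for the Lagrange multiplier \lambda. The observable values and target mean below are made-up illustrative numbers.

```python
import numpy as np

def maxent(f, mean_target, lam_lo=-50.0, lam_hi=50.0):
    """Max-entropy distribution on states with observable values f,
    subject to sum(p) = 1 and sum(p * f) = mean_target."""
    f = np.asarray(f, dtype=float)

    def distribution(lam):
        w = np.exp(-lam * (f - f.min()))   # shift exponent for stability
        return w / w.sum()

    # The constrained mean is strictly decreasing in lam, so bisect.
    for _ in range(200):
        lam = 0.5 * (lam_lo + lam_hi)
        if distribution(lam) @ f > mean_target:
            lam_lo = lam
        else:
            lam_hi = lam
    return distribution(lam)

f = np.array([0.0, 1.0, 2.0, 3.0])   # values of some observable
p = maxent(f, mean_target=0.8)       # hypothetical measured mean
print(p)                             # Gibbs-like distribution
print(p @ f)                         # ~0.8, the imposed constraint
```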

Recently **John Harte** has applied MaxEnt in this way to ecology, using it to make specific testable predictions for the distribution, abundance and energy usage of species across spatial scales and across habitats and taxonomic groups [Harte2008, Harte2009, Harte2011]. **Annette Ostling** is an expert on other theories that attempt to explain the same data, such as the ‘neutral model’ [AOE2008, ODLSG2009, O2005, O2012]. **Dewar** has also used MaxEnt in ecology [D2008], and he has argued that it underlies the principle of Maximum Entropy Production.

*Thus, a second goal of this workshop is to familiarize all the participants with applications of the MaxEnt method to ecology, compare it with competing approaches, and study whether MaxEnt provides a sufficient justification for the principle of Maximum Entropy Production.*

Entropy is not merely a predictive tool in ecology: it is also widely used as a measure of biodiversity. Here Shannon’s original concept of entropy naturally generalizes to ‘Rényi entropy’, which depends on a parameter \alpha \ge 0. This equals

\displaystyle{ H_\alpha(p) = \frac{1}{1-\alpha} \log \sum_i p_i^\alpha }

where p_i is the fraction of organisms of the i-th type (which could mean species, some other taxon, etc.). In the limit \alpha \to 1 this reduces to the Shannon entropy:

\displaystyle{ H(p) = - \sum_i p_i \log p_i }

As \alpha increases, we give less weight to rare types of organisms. **Christina Cobbold** and **Tom Leinster** have described a systematic and highly flexible system of biodiversity measurement, with Rényi entropy at its heart [CL2012]. They consider both the case where all we have are the numbers p_i, and the more subtle case where we take the distance between different types of organisms into account.
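A quick numerical illustration of how \alpha reweights rare types (a toy example, not the full distance-sensitive framework of [CL2012]):

```python
import numpy as np

def renyi_entropy(p, alpha):
    """H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), Shannon at alpha = 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                        # convention: 0 log 0 = 0
    if np.isclose(alpha, 1.0):          # the alpha -> 1 limit
        return -np.sum(p * np.log(p))
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

# One dominant species plus several rare ones:
p = [0.90, 0.04, 0.03, 0.02, 0.01]
for alpha in [0.0, 0.5, 0.999, 1.0, 2.0, 10.0]:
    print(alpha, renyi_entropy(p, alpha))
# H_0 = log 5 counts every type equally; as alpha grows, H_alpha falls
# toward -log(0.90), dominated by the single common type.
```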

**John Baez** has explained the role of Rényi entropy in thermodynamics [B2011], and together with **Tom Leinster** and **Tobias Fritz** he has proved other theorems characterizing entropy which explain its importance for information processing [BFL2011]. However, these ideas have not yet been connected to the widespread use of entropy in biodiversity studies. More importantly, the use of entropy as a measure of biodiversity has not been clearly connected to MaxEnt methods in ecology. Does the success of MaxEnt methods imply a tendency for ecosystems to maximize biodiversity subject to the constraints of resource availability? This seems surprising, but a more nuanced statement along these general lines might be correct.

*So, a third goal of this workshop is to clarify relations between known characterizations of entropy, the use of entropy as a measure of biodiversity, and the use of MaxEnt methods in ecology.*

As the amount of data generated in genomics continues to outstrip humans’ ability to analyze it, we can expect automated experiment design to become ever more important. In **Chris Lee** and **Marc Harper**’s RoboMendel program [LH2013], a mathematically precise concept of ‘potential information’—how much information is left to learn—plays a crucial role in deciding what experiment to do next, given the data obtained so far. It will be useful for them to interact with **William Bialek**, who has expertise in estimating entropy from empirical data and using it to constrain properties of models [BBS, BNS2001, BNS2002], and **Susanne Still**, who applies information theory to automated theory building and biology [CES2010, PS2012].
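One concrete reason entropy estimation from data is delicate: the naive ‘plug-in’ estimate computed from observed frequencies is biased downward when samples are scarce. The sketch below shows the plug-in estimator together with the classical Miller–Madow bias correction; it is only meant to illustrate the problem, not the more sophisticated estimators studied in [BNS2001, BNS2002].

```python
import numpy as np

def plugin_entropy(counts):
    """Naive estimate: entropy of the observed frequencies."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def miller_madow_entropy(counts):
    """Plug-in estimate plus the first-order (K - 1)/(2N) bias correction."""
    counts = np.asarray(counts, dtype=float)
    n_obs = counts.sum()
    k_seen = np.count_nonzero(counts)
    return plugin_entropy(counts) + (k_seen - 1) / (2.0 * n_obs)

rng = np.random.default_rng(0)
true_p = np.full(20, 1 / 20)            # uniform over 20 types
sample = rng.multinomial(50, true_p)    # only 50 observations
print(np.log(20))                       # true entropy
print(plugin_entropy(sample))           # systematically too low
print(miller_madow_entropy(sample))     # partially corrected
```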

However, there is another link between biology and potential information. **Harper** has noted that in an ecosystem where the population of each type of organism grows at a rate proportional to its fitness (which may depend on the fraction of organisms of each type), the quantity

\displaystyle{ I(q||p) = \sum_i q_i \ln(q_i/p_i) }

always decreases if there is an evolutionarily stable state [Harper2009]. Here p_i is the fraction of organisms of the i-th genotype at a given time, while q_i is this fraction in the evolutionarily stable state. This quantity is often called the Shannon information of q ‘relative to’ p. But in fact, it is precisely the same as **Lee** and **Harper**’s potential information! Indeed, there is a precise mathematical analogy between evolutionary games and processes where a probabilistic hypothesis is refined by repeated experiments.
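A small simulation makes this concrete: under the replicator dynamics, with a hypothetical payoff matrix whose interior evolutionarily stable state is the uniform distribution, the relative information I(q||p) decreases toward zero. This is a numerical sketch in the spirit of, but not taken from, [Harper2009].

```python
import numpy as np

A = np.array([[0.0, 2.0, 1.0],      # hypothetical cyclic payoff matrix
              [1.0, 0.0, 2.0],
              [2.0, 1.0, 0.0]])
q = np.array([1/3, 1/3, 1/3])       # its interior evolutionarily stable state

def relative_info(q, p):
    return np.sum(q * np.log(q / p))

p = np.array([0.7, 0.2, 0.1])       # initial genotype fractions
dt = 0.01
for step in range(2001):
    if step % 500 == 0:
        print(step, relative_info(q, p))
    fitness = A @ p                              # fitness of each genotype
    p = p + dt * p * (fitness - p @ fitness)     # Euler step of replicator eq.
    p = p / p.sum()                              # guard against round-off
# I(q||p) shrinks toward 0 as p approaches the stable state q.
```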

*Thus, a fourth goal of this workshop is to develop the concept of evolutionary games as ‘learning’ processes in which information is gained over time.*

We shall try to synthesize this with **Carl Bergstrom** and **Matina Donaldson-Matasci**’s work on the ‘fitness value of information’: a measure of how much increase in fitness a population can obtain per bit of extra information [BL2004, DBL2010, DM2013]. Following **Harper**, we shall consider not only relative Shannon entropy, but also relative Rényi entropy, as a measure of information gain [Harper2011].
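For reference, the relative Rényi entropy (Rényi divergence) of order \alpha is

\displaystyle{ D_\alpha(q\|p) = \frac{1}{\alpha - 1} \log \sum_i q_i^\alpha \, p_i^{1-\alpha} }

which recovers I(q||p) in the limit \alpha \to 1. A tiny sketch with made-up numbers:

```python
import numpy as np

def renyi_divergence(q, p, alpha):
    """D_alpha(q||p) = log(sum_i q_i^alpha p_i^(1-alpha)) / (alpha - 1)."""
    q = np.asarray(q, dtype=float)
    p = np.asarray(p, dtype=float)
    if np.isclose(alpha, 1.0):          # relative Shannon (KL) limit
        return np.sum(q * np.log(q / p))
    return np.log(np.sum(q**alpha * p**(1 - alpha))) / (alpha - 1.0)

q = np.array([0.5, 0.3, 0.2])   # made-up 'stable' distribution
p = np.array([0.2, 0.3, 0.5])   # made-up current distribution
for alpha in [0.5, 0.999, 1.0, 2.0]:
    print(alpha, renyi_divergence(q, p, alpha))
```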

*A fifth and final goal of this workshop is to study the interplay between information theory and the thermodynamics of individual cells and organelles.*

**Susanne Still** has studied the thermodynamics of prediction in biological systems [BCSS2012]. And in a celebrated related piece of work, **Jeremy England** used thermodynamic arguments to derive a lower bound for the amount of entropy generated during a process of self-replication of a bacterial cell [England2013]. Interestingly, he showed that *E. coli* comes within a factor of 3 of this lower bound.

In short, information theory and entropy methods are becoming powerful tools in biology, from the level of individual cells, to whole ecosystems, to experimental design, model-building, and the measurement of biodiversity. The time is ripe for an investigative workshop that brings together experts from different fields and lets them share insights and methods and begin to tackle some of the big remaining questions.

More Information

Event date: April 8–10, 2015
Event location: National Institute for Mathematical and Biological Synthesis, Knoxville, Tennessee
Contact: John Baez ([email protected])