In Search of a Unifying Theory for Image Interpretation
Department of Applied Mathematics and Statistics
Center for Imaging Science
The Johns Hopkins University
Image interpretation, which is effortless and instantaneous for human beings, is the grand challenge of computer vision. The dream is to build a "description machine" that produces a rich semantic description of the underlying scene, including the names and poses of the objects present, and even "recognizes" other things, such as actions and context. Accuracy entails invariance to transformations that preserve semantic labels. Such a machine should also learn and run efficiently. Many would argue that we are at best inching towards this goal; certainly, algorithms and strategies abound for sub-tasks, and some generic tools have been honed. All of this will be amply illustrated in the first part of the talk. But what about a unifying theory, such as the one Shannon provided for communication -- one that achieves abstraction and still guides the development of practical algorithms? Might this accelerate progress? Mathematical frameworks are advanced from time to time, but none is yet widely accepted, and none clearly points the way to closing the gap with biological vision. Information theory might offer some lessons, for example thinking about representation and processing at the same time. I will argue that hierarchical structures should be at the core of such a theory, and I will support this claim by summarizing several theoretical results about the efficiency of hierarchical search and representation.
Donald Geman received the BA degree in English literature from the University of Illinois in 1965 and the PhD degree in mathematics from Northwestern University in 1970. He joined the Department of Mathematics and Statistics at the University of Massachusetts-Amherst in 1970, where he became a "Distinguished University Professor." In 2001, he moved to The Johns Hopkins University, where he is currently a professor in the Department of Applied Mathematics and Statistics and a member of the Center for Imaging Science in the Whitaker Institute. Visiting appointments include those at the University of North Carolina (1976-1977), Brown University (1991-1992), Ecole Polytechnique (1997-1999), and Ecole Normale Superieure-Cachan (2000-2003). His current research interests include computational vision, statistical and sequential learning, and bioinformatics.