APE Weekly Seminar Abstracts - Spring 2001

Anoop M Namboodiri: Structure in On-line Documents

The problem of segmenting document pages into homogeneous regions containing unique semantic entities is of prime importance in automatic document understanding systems. Several algorithms exist that recognize printed or handwritten text. However, most of these algorithms assume that the input is plain text and that the text lines and words have been properly identified and segmented by a preprocessor. A typical handwritten document page may contain several regions of interest, such as underlined keywords, different types of tables, diagrams, sketches and text. In this talk, I will present a hierarchical approach for extracting homogeneous regions in on-line documents. The problem of identifying and processing ruled and unruled tables, plain text and drawings is addressed.

Mohammad Ghavam_Zadeh: Hierarchical Multi-Agent Reinforcement Learning: Discrete and Continuous-time Algorithms

Reinforcement Learning (RL) is a general computational framework for solving sequential decision-making tasks under uncertainty in action and perception. It encompasses a broad range of methods for determining optimal ways of behaving in complex stochastic environments. Hierarchical approaches study how to scale RL to large domains by reusing learned subtasks or behaviors as primitive actions at the next level. Abstract actions violate the Markov property and require the use of semi-Markov decision processes. Several alternative frameworks for hierarchical RL have been proposed, including "options", "HAMs" and "MAXQ". We focus on the "MAXQ" framework in this talk. We extend the MAXQ framework to cooperative multi-agent tasks. Learning is decentralized, with each agent learning three interrelated skills: how to perform subtasks, which order to do them in, and how to coordinate with other agents. Coordination skills among agents are learned by using joint actions at the highest level(s) of the hierarchy. We also describe how to generalize the MAXQ framework to continuous-time and average reward approaches. We propose two continuous-time hierarchical reinforcement learning algorithms based on MAXQ, "continuous-time discounted reward MAXQ" and "continuous-time average reward MAXQ". We also show the performance and speed of these continuous-time algorithms in a complex multi-agent AGV factory scheduling problem, and compare them with the discrete-time MAXQ algorithms as well as several well-known AGV heuristics.
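
As a rough illustration of why abstract actions call for semi-Markov decision processes, the sketch below shows a generic discrete-time SMDP Q-learning update, in which the discount depends on how many steps the abstract action took. It is not the MAXQ decomposition and not necessarily the speaker's algorithm. The function name `smdp_q_update`, the dictionary-based table, the state and action labels, and the values of `alpha` and `gamma` are all assumptions made for illustration.

```python
from collections import defaultdict

def smdp_q_update(q, state, action, rewards, next_state, actions,
                  alpha=0.1, gamma=0.9):
    """One SMDP Q-learning update for an abstract action lasting len(rewards) steps."""
    # Discounted reward accumulated while the abstract action was running.
    ret = sum((gamma ** k) * r for k, r in enumerate(rewards))
    tau = len(rewards)  # duration of the abstract action (hypothetical bookkeeping)
    # Value of the best action in the state where the abstract action ended.
    best_next = max(q[(next_state, a)] for a in actions)
    # The bootstrap term is discounted by gamma**tau, not by a single gamma.
    target = ret + (gamma ** tau) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]

q = defaultdict(float)  # all values start at zero
new_value = smdp_q_update(q, "s0", "go_to_door", [0.0, 0.0, 1.0], "s1",
                          ["go_to_door", "pick_up"])
```

With a three-step action that is rewarded only on its last step, the accumulated return is `gamma**2`, so the updated value is `alpha` times that amount.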

Khashayar Rohanimanesh: Learning and Planning with Concurrent Behaviors

In our everyday life, our brain is constantly planning and executing concurrent (parallel) behaviors. For example, when we are driving, we visually search for road signs while we may also be talking to a passenger. Or when we are walking toward our car in a parking lot or toward our office, we may simultaneously reach for our car keys while talking on the cell phone and avoiding obstacles. Parallel execution of behaviors is sometimes useful in speeding up a task; in other situations, the nature of the task requires that multiple behaviors run concurrently and cooperatively in order to perform the task. In this talk, we propose and investigate a new model for learning and planning with concurrent behaviors. We adopt the theoretical framework of "options" to model temporally extended courses of action. We illustrate the options framework with an example of the Acrobot problem, a difficult nonlinear robot arm swing-up task. We also present a navigation task involving moving through rooms using keys to open locked doors to illustrate how the concurrent behavior model is used for planning. Our experiments show that the concurrent behavior model improves performance compared with the case in which only one behavior can be executed at a time.

Salil Prabhakar: Fingerprint Individuality

Fingerprint-based personal identification has been used in forensic labs and identification units around the world for a very long time and has been accepted in courts of law in the United States for nearly a century. Fingerprint identification relies on the premise that the fingerprints of every finger of a person are unique. This premise is based on the observation of millions of fingerprints, but the underlying statistical basis of the individuality of fingerprints has not been rigorously studied. To strengthen the scientific foundation of fingerprint identification, the fingerprint individuality problem has been formulated in the forensic science literature as follows: (i) measure the amount of detail in a single fingerprint that is available for comparison, and (ii) measure the amount of detail in correspondence between two fingerprints. Fingerprint individuality is an important problem because a scientific basis (a reliable statistical estimate of the error) for fingerprint comparison can not only determine the admissibility of fingerprint identification in a court of law as evidence of identity, but can also establish an upper bound on the performance of automatic fingerprint verification systems. As a result of recent court challenges that question the scientific foundation of "friction ridge identification", there is growing interest in a statistical validation of the individuality of fingerprints. In this talk, I will discuss the fingerprint individuality problem and how it relates to an upper bound on the performance of automatic fingerprint verification systems.

Dr. Serge Belongie: Matching Shapes

In this talk I will outline a novel approach to the analysis and matching of shapes. Along with features such as color and texture, shape is essential to the task of visual object recognition. In our approach, we achieve flexible shape representation by stochastic sampling of contours and by attaching a particularly rich descriptor, the "shape context," to each point. The shape context captures the distribution of shape points relative to the reference point and thus offers a globally discriminative characterization for each shape point. The proposed shape descriptor allows shape correspondences to be recovered effectively by weighted bipartite matching. An established point correspondence then allows us to recover the optimal transformation between shapes. Regularized thin-plate splines provide us with a flexible class of transformation maps for this purpose. Finally, we treat shape similarity and shape recognition in some detail. Results are presented for silhouettes, trademarks, handwritten digits, and household objects.
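
To make the idea of a descriptor that "captures the distribution of shape points relative to the reference point" concrete, here is a minimal sketch of a log-polar histogram descriptor and a chi-squared cost for comparing two of them. The abstract does not give bin counts or radial limits, so `n_r`, `n_theta`, `r_min`, `r_max`, the clamping of radii into the outermost rings, and the function names are all assumptions. The bipartite matching and thin-plate spline steps are not shown.

```python
import math

def shape_context(points, index, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Flat log-polar histogram of the other points as seen from points[index]."""
    px, py = points[index]
    offsets = [(x - px, y - py) for i, (x, y) in enumerate(points) if i != index]
    dists = [math.hypot(dx, dy) for dx, dy in offsets]
    # Dividing by the mean distance makes the descriptor scale-invariant.
    mean_dist = (sum(dists) / len(dists)) or 1.0
    log_span = math.log(r_max) - math.log(r_min)
    hist = [0] * (n_r * n_theta)
    for (dx, dy), d in zip(offsets, dists):
        # Clamp the normalised radius so every point falls in some ring (assumed choice).
        r = min(max(d / mean_dist, r_min), r_max)
        r_bin = min(int(n_r * (math.log(r) - math.log(r_min)) / log_span), n_r - 1)
        theta = math.atan2(dy, dx) % (2.0 * math.pi)
        t_bin = int(theta / (2.0 * math.pi) * n_theta) % n_theta
        hist[r_bin * n_theta + t_bin] += 1
    return hist

def chi2_cost(h1, h2):
    """Chi-squared distance between two histograms (0 means identical)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

# Hypothetical sample points: a unit square plus one outlying point.
shape = [(0, 0), (1, 0), (1, 1), (0, 1), (2, 3)]
shifted = [(x + 10, y - 4) for x, y in shape]
d_a = shape_context(shape, 0)      # descriptor at the first corner
d_b = shape_context(shifted, 0)    # same corner after translating the shape
d_c = shape_context(shape, 4)      # descriptor at the outlying point
```

Because the descriptor uses only offsets from the reference point, translating the whole shape leaves it unchanged, while a point in a different position on the shape yields a different histogram and a positive matching cost.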

Feng-I Liu: Assisted Identification in Digital Color Microbial Images

Microbes stained with special fluorescein and chemicals exhibit different colors. The color indicates the state of the microbes during an important activity such as metabolism or fission. The morphotype, biovolume, and aggregation state of the microbes showing certain colors are important for further biological analysis. The color image itself has a noisy background and a complex foreground. Because the number of microbes in an image is large, extracting information from the image manually takes a lot of time. The goal of this work is to identify the microbes present in the noisy image and to extract useful information about them. The proposed system identifies the microbes in a digitized color image using an interactive user interface. There is no fixed color range for a specific color stain; microbes with the same reaction to the chemical might show different but related colors. Color segmentation (with local thresholding and region growing) is applied to eliminate the background noise and select the region of interest. The microbes present in the region of interest might be connected or might even overlap with each other. Connection and overlap distort the extracted information: the microbe count will be incorrect and the biovolume of the microbes will be underestimated. The morphotype and color information are used to split the connected microbes. To distinguish between overlapping microbes and to count the number of microbes correctly, further morphotype analysis is required.
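
The region-growing step can be sketched as follows, under several assumptions: the image is a small nested list of RGB tuples, the seed pixel would come from the interactive interface, and a single Euclidean color tolerance `tol` stands in for "different but related colors". The function name `grow_region` and the sample colors are hypothetical, and the local thresholding step is not shown.

```python
from collections import deque

def grow_region(image, seed, tol=60.0):
    """Return the 4-connected pixels whose color lies within tol of the seed color."""
    rows, cols = len(image), len(image[0])
    seed_color = image[seed[0]][seed[1]]

    def close(color):
        # Euclidean distance in RGB space (an assumed similarity measure).
        return sum((a - b) ** 2 for a, b in zip(color, seed_color)) ** 0.5 <= tol

    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and close(image[nr][nc]):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

G = (30, 200, 40)    # hypothetical stain color
G2 = (50, 180, 60)   # a related shade of the same stain
B = (10, 10, 10)     # dark background
image = [
    [B, G,  G2, B],
    [B, G2, B,  B],
    [B, B,  B,  G],
]
region = grow_region(image, (0, 1))
```

Here the two related shades are merged into one region of three pixels, while the isolated stained pixel in the bottom-right corner is left out because it is not connected to the seed.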

Dr. David G. Stork: The Open Mind Initiative: Contributing data over the internet for training "intelligent" systems

The Open Mind Initiative is a framework for collecting data contributed over the internet by large numbers of non-expert web users. The Initiative's creation was based on the realization that for many problem areas (e.g., optical character recognition, speech recognition, ...) there is highly developed and adequate theory; progress is held back by a lack of sufficiently large datasets of 'informal' knowledge. In fact, sufficiently large datasets can "swamp the prior information" inherent in the choice of classifier model, and thus research and development in pattern recognition is turning away from minor tweaking of algorithms and toward new methods of collecting and "truthing" large datasets. We contrast the Open Mind Initiative with traditional data mining, and discuss recent research in the theory of data acquisition, sampling and interactive learning as employed by the Initiative. We conclude with a number of unsolved problems in data collection and new research directions in pattern classification.

Dr. Sarat Dass: Markov Face Models

Nan Zhang: IPCA Network: A Neural Vision Computational Model

This talk presents a new method of incremental principal component analysis (IPCA), used as a tool to develop orientation-selective filters for early sensory processing in the visual cortex. The method also attempts to show an implementation with properties similar to those found in the human visual pathway. To keep the method computationally efficient, the lateral inhibition of sensory neurons is modeled through the use of a residual image. The use of staggered receptive fields models the random positioning of those found in the retina. Furthermore, a multi-layered neural network using population coding and IPCA develops complex filters with varying receptive field sizes, with properties somewhat similar to those of simple and complex cells found in the human visual cortex.
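
The abstract does not specify the IPCA update rule, so the sketch below substitutes Oja's rule, a classic incremental estimator of the first principal component, purely to illustrate how a component can be learned one sample at a time without storing a covariance matrix. The synthetic 2-D data, the learning rate, the sample count, and the name `oja_update` are assumptions made for illustration.

```python
import math
import random

def oja_update(w, x, lr=0.01):
    """One step of Oja's rule: nudge w toward the leading principal direction."""
    y = sum(wi * xi for wi, xi in zip(w, x))  # projection of the sample onto w
    # x - y*w is the residual left after removing the current component.
    return [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]

random.seed(0)
u = (1 / math.sqrt(2), 1 / math.sqrt(2))  # dominant direction of the synthetic data
v = (-u[1], u[0])                         # orthogonal, low-variance direction
w = [1.0, 0.0]                            # arbitrary starting estimate
for _ in range(3000):
    a, b = random.gauss(0, 1.0), random.gauss(0, 0.1)
    x = (a * u[0] + b * v[0], a * u[1] + b * v[1])
    w = oja_update(w, x)

norm = math.hypot(w[0], w[1])
alignment = abs(w[0] * u[0] + w[1] * u[1]) / norm  # |cosine| between w and u
```

After the loop, `w` should point close to the dominant direction `u` (up to sign) with length near one. The residual term inside the update is loosely analogous to the residual image mentioned in the abstract, although the talk's actual formulation may differ.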