The problem of segmenting document pages into homogeneous regions
containing unique semantic entities is of prime importance in
automatic document understanding systems. Several algorithms exist
that recognize printed or handwritten text. However, most of these
algorithms assume that the input is plain text and that the text lines
and words in the text have been properly identified and segmented
by a preprocessor. A typical handwritten document page may contain
several regions of interest such as underlined keywords, different
types of tables, diagrams, sketches and text. In this talk, I will
present a hierarchical approach for extracting homogeneous regions
in on-line documents. The problem of identifying and processing
ruled and unruled tables, plain text and drawings is addressed.

Reinforcement Learning (RL) is a general computational framework for
solving sequential decision-making tasks under uncertainty in action
and perception. It encompasses a broad range of methods for determining
optimal ways of behaving in complex stochastic environments. Hierarchical
approaches study how to scale RL to large domains by reusing learned
subtasks or behaviors as primitive actions at the next level. Abstract
actions violate the Markov property, and require the use of semi-Markov
decision processes. Several alternative frameworks for hierarchical RL
have been proposed, including "options", "HAMs" and "MAXQ". We focus on
the "MAXQ" framework in this talk.
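To illustrate why temporally extended actions lead to a semi-Markov formulation, here is a minimal sketch of a tabular SMDP Q-learning update. It is not the MAXQ algorithm discussed in this talk; the function name, parameters, and the toy states and actions are all hypothetical and chosen only for illustration. The key difference from the one-step case is that the bootstrap term is discounted by the duration of the abstract action.

```python
from collections import defaultdict


def smdp_q_update(q, state, action, rewards, next_state, next_actions,
                  alpha=0.1, gamma=0.9):
    """One tabular SMDP Q-learning update for a temporally extended action.

    `rewards` lists the primitive rewards collected while the abstract
    action ran, so its length is the action's duration in time steps.
    (Hypothetical helper, not part of any MAXQ implementation.)
    """
    tau = len(rewards)
    # Discounted return accumulated while the abstract action was executing.
    ret = sum((gamma ** k) * r for k, r in enumerate(rewards))
    # Value of the best action available once the abstract action terminates.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    # The bootstrap term is discounted by gamma**tau rather than gamma.
    target = ret + (gamma ** tau) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Toy usage: an abstract action that ran for three primitive steps.
q = defaultdict(float)
q[("hall", "left")] = 1.0
new_value = smdp_q_update(q, "room", "go-to-hall", [0.0, 0.0, 1.0],
                          "hall", ["left", "right"], alpha=0.5, gamma=0.5)
```

With a one-step action (`len(rewards) == 1`) the update reduces to ordinary Q-learning, which is one way to see the semi-Markov case as a generalization.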
We extend the MAXQ framework to cooperative multi-agent tasks. Learning
is decentralized, with each agent learning three interrelated skills:
how to perform subtasks, which order to do them in, and how to coordinate
with other agents. Coordination skills among agents are learned by using
joint actions at the highest level(s) of the hierarchy.
We also describe how to generalize the MAXQ framework to continuous-time
and average reward approaches. We propose two continuous-time hierarchical
reinforcement learning algorithms based on MAXQ, "continuous-time discounted
reward MAXQ" and "continuous-time average reward MAXQ". We also show the
performance and speed of these continuous-time algorithms in a complex
multi-agent AGV factory scheduling problem, and compare them with the
discrete-time MAXQ algorithms as well as several well-known AGV heuristics.

In our everyday life, our brain is constantly planning and executing
concurrent (parallel) behaviors. For example, when we are driving, in
parallel we visually search for road signs, while we may be talking to
a passenger. Or when we are walking toward our car in a parking lot or
towards our office, we may simultaneously reach for our car keys,
while talking on the cell phone and avoiding obstacles. Parallel
execution of behaviors is sometimes useful in speeding up a task; in
other situations, the nature of the task requires that multiple
behaviors run concurrently and cooperatively in order to perform the
task. In this talk, we propose and investigate a new model for
learning and planning with concurrent behaviors. We adopt the
theoretical framework of "options" to model temporally extended
courses of action. We illustrate the options framework with an example
of the Acrobot problem, a difficult nonlinear robot arm swing-up
task. We also present a navigation task involving moving through rooms
using keys to open locked doors to illustrate how the concurrent
behavior model is used for planning. Our experiments show that the
concurrent behavior model improves the performance when compared to
the case when only one behavior at a time can be executed.

Fingerprint-based personal identification has been used in forensic labs
and identification units around the world for a very long time and has
been accepted in the court of law for nearly a century in the United
States. Fingerprint identification relies on the premise that the
fingerprints of every finger of a person are unique. This premise is
based on the observation of millions of fingerprints but the underlying
statistical basis of the individuality of fingerprints has not been
rigorously studied. To strengthen the scientific foundation of fingerprint
identification, the fingerprint individuality problem has been formulated
in the forensic science literature as follows: (i) measure the amount of
detail in a single fingerprint that is available for comparison, and (ii)
measure the amount of detail in correspondence between two fingerprints.
Fingerprint individuality is an important problem because a scientific
basis (reliable statistical estimate of the error) for fingerprint
comparison can not only determine the admissibility of fingerprint
identification in the court of law as evidence of identity, but it can
also establish an upper bound on the performance of automatic fingerprint
verification systems. As a result of recent court challenges that question
the scientific foundation of "friction ridge identification", there is a
growing interest in a statistical validation of the individuality of
fingerprints. In this talk, I will discuss the fingerprint
individuality problem and how it relates to an upper bound on the
performance of automatic fingerprint verification systems.

In this talk I will outline a novel approach to the analysis and matching of
shapes. Along with features such as color and texture, shape is essential to
the task of visual object recognition. In our approach, we achieve flexible
shape representation by stochastic sampling of contours and by attaching a
particularly rich descriptor, the "shape context," to each point. The shape
context captures the distribution of shape points relative to the reference
point and thus offers a globally discriminative characterization for each
shape point.
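As a rough illustration of the descriptor described above, the following sketch computes a log-polar histogram of the other sample points relative to one reference point. The bin counts, the radius limits, the normalization by mean pairwise distance, and the clamping of out-of-range radii are illustrative assumptions, not necessarily the settings used in the work presented here; the function name is hypothetical.

```python
import math


def shape_context(points, index, n_radial=5, n_angular=12,
                  r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of all other points relative to points[index].

    Returns an n_radial x n_angular grid of counts. Requires at least
    two points. (Illustrative sketch with assumed parameter values.)
    """
    n = len(points)
    # Mean pairwise distance, used to make the descriptor scale invariant.
    total = sum(math.dist(points[i], points[j])
                for i in range(n) for j in range(i + 1, n))
    mean_dist = total / (n * (n - 1) / 2)

    log_lo, log_hi = math.log(r_inner), math.log(r_outer)
    hist = [[0] * n_angular for _ in range(n_radial)]
    px, py = points[index]
    for i, (x, y) in enumerate(points):
        if i == index:
            continue
        dx, dy = x - px, y - py
        r = math.hypot(dx, dy) / mean_dist
        # Log-spaced radial bin; radii outside the range are clamped
        # into the innermost or outermost bin (a simplification).
        frac = (math.log(max(r, r_inner)) - log_lo) / (log_hi - log_lo)
        r_bin = min(int(frac * n_radial), n_radial - 1)
        # Uniform angular bin over [0, 2*pi).
        theta = math.atan2(dy, dx) % (2 * math.pi)
        a_bin = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        hist[r_bin][a_bin] += 1
    return hist


# Toy usage: descriptor of one corner of a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
hist = shape_context(square, 0)
```

Because the histogram depends only on relative positions and normalized distances, translating or uniformly scaling the point set leaves it unchanged in this sketch.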
The proposed shape descriptor allows for a highly effective procedure that
recovers shape correspondences by employing a weighted bipartite matching
procedure. An established point correspondence then allows us to recover the
optimal transformation between shapes. Regularized thin-plate splines
provide us with a flexible class of transformation maps for this purpose.
Finally, we treat shape similarity and shape recognition in some detail.
Results are presented for silhouettes, trademarks, handwritten digits, and
household objects.

Microbes stained with special fluorescein and chemicals exhibit different
colors. The color indicates the state of the microbes during an important
activity like metabolism or fission. The morphotype, biovolume, and
aggregation state of microbes showing particular colors are important
for further biological analysis. The color image itself has a noisy
background and a complex foreground. Because the number of microbes in an
image is large, extracting information from the image manually is
time-consuming. The goal of this work is to identify the microbes present
in the noisy image and to extract useful information about them. The
proposed system identifies the microbes in a digitized color image using
an interactive user interface. There is no fixed color range for a
specific color stain; microbes with the same reaction to the chemical
might show different but related colors. Color segmentation (with local
thresholding and region growing) is applied to eliminate the background
noise and select the region of interest. The microbes present in the
region of interest might be connected or even overlap with each other. The
connection and overlap problems distort the extracted information: the
microbe count will be incorrect and the biovolume of the microbes will be
underestimated. The morphotype and color information are used to split the
connected microbes. To distinguish between overlapping microbes and to
count the number of microbes correctly, further morphotype analysis is
required.

The Open Mind Initiative is a framework
for collecting data contributed over the internet by large numbers of
non-expert web users. The Initiative's creation was based on the
realization that for many problem areas (e.g., optical character
recognition, speech recognition, ...) there is a highly developed and
adequate theory, but progress is held back by a lack of sufficiently large
datasets of 'informal' knowledge. In fact, sufficiently large datasets can
"swamp the prior information" inherent in the choice of classifier model,
and thus research and development in pattern recognition is turning away
from minor tweaking of algorithms and toward new methods of collecting and
"truthing" large datasets. We contrast the Open Mind Initiative with
traditional data mining, and discuss recent research in the theory of data
acquisition, sampling and interactive learning as employed by the
Initiative. We conclude with a number of unsolved problems in data
collection and new research directions in pattern classification.

This presentation describes a new method of incremental principal
component analysis (IPCA) used as a tool to develop orientation selection
filters for early sensory processing in the visual cortex. The method also
attempts to show an implementation with properties similar to those found in
the human visual pathway. In order to keep the method computationally
efficient, the lateral inhibition of sensory neurons is modeled through the
use of a residual image. The use of staggered receptive fields models the
random positioning of those found in the retina. Furthermore, a multi-layered
neural network using population coding and IPCA develops complex filters
with varying receptive field size, with properties somewhat similar to those of
simple and complex cells found in the human visual cortex.

Anoop M Namboodiri: Structure in On-line Documents
Khashayar Rohanimanesh: Learning and Planning with Concurrent Behaviors
Salil Prabhakar: Fingerprint Individuality
Dr. Serge Belongie: Matching Shapes
Feng-I Liu: Assisted Identification in Digital Color Microbial Images
Dr. Sarat Dass: Markov Face Models
Nan Zhang: IPCA Network: A Neural Vision Computational Model