PRIP Weekly Seminar Abstracts - Fall 1999

paper discussion: Veggie Vision: A Produce Recognition System

A copy of the paper is available in the PRIP lab, or contact Dr. Stockman.

paper discussion: Sensar Technologies Iris Recognition System

A copy of the paper is available in the PRIP lab, or contact Dr. Stockman.

Erin Scott Mcgarrity: A Method for Calibrating See-through Head-mounted Displays for AR

In order to have a working AR system, the see-through system must be calibrated such that the internal models of objects match their physical counterparts. By match, we mean they should have the same position, orientation, and size information as well as any intrinsic parameters (such as focal lengths in the case of cameras) that their physical counterparts have. To this end, a procedure must be developed which estimates the parameters of these internal models. This calibration method must be both accurate and simple to use. This paper reports on our efforts to implement a calibration method for a see-through head-mounted display. We use a dynamic system in which a user interactively modifies the camera parameters until the image of a calibration object matches the image of a corresponding physical object. The calibration method is dynamic in the sense that we do not require the user's head to be immobilized.
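
The matching criterion described above can be illustrated with a minimal sketch. It assumes a simple pinhole camera whose axes are aligned with the world (orientation is left out for brevity) and a root-mean-square pixel error as the mismatch measure; the names `project`, `rms_error`, and the cube dimensions are hypothetical and are not taken from the paper.

```python
import math

def project(point, focal, translation):
    """Pinhole projection of a 3-D point (camera axes assumed aligned with the world)."""
    x, y, z = (p + t for p, t in zip(point, translation))
    return (focal * x / z, focal * y / z)

def rms_error(observed, model_points, focal, translation):
    """Root-mean-square distance between observed and predicted image points."""
    total = 0.0
    for (u, v), point in zip(observed, model_points):
        pu, pv = project(point, focal, translation)
        total += (u - pu) ** 2 + (v - pv) ** 2
    return math.sqrt(total / len(model_points))

# Corners of a hypothetical calibration cube placed 5 units in front of the camera.
cube = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
true_focal, true_translation = 800.0, (0.0, 0.0, 5.0)
observed = [project(p, true_focal, true_translation) for p in cube]

# Each candidate focal length stands in for one interactive adjustment by the user.
errors = {f: rms_error(observed, cube, f, true_translation) for f in (600.0, 700.0, 800.0)}
```

In this toy setup the mismatch shrinks as the candidate focal length approaches the true value and vanishes when the two agree, which is the visual cue a user would rely on when aligning the rendered calibration object with the physical one.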

Salil Prabhakar: Introduction to Support Vector Machines

Over the summer, I had the opportunity to work with the much-talked-about Support Vector Machines (SVMs) at IBM Almaden, and I will talk about some of the work I did there. Because of intellectual property issues, I will give only an overview of SVMs and say a little about the databases I worked on, avoiding detailed results and numbers. The work was geared towards improving the accuracy of document and handwritten character classification. While SVMs could perform better than previous approaches, their time and storage complexity is prohibitive for large databases. SVMs have received a lot of attention in the last few years and have been shown to outperform several other classifiers in a number of domains. They are based on the idea of finding a linear decision boundary that maximizes the margin between samples from two classes. The mathematical formulation is elegant and the classifier is theoretically optimal. However, training an SVM is equivalent to solving a quadratic programming problem, which has prohibitive storage and time requirements for a large number of samples. I will discuss a new algorithm for training SVMs, Sequential Minimal Optimization (SMO), developed by John Platt at Microsoft Research. The overall optimization problem is divided into many small optimization problems, and the smallest possible subproblem (two patterns) is solved analytically. I will also touch upon how SVMs can be generalized to handle multi-class problems and how kernel functions are used for finding nonlinear boundaries. I encourage you to take a look at C. J. C. Burges, "A Tutorial on Support Vector Machines for Pattern Recognition", Bell Laboratories, Lucent Technologies, Data Mining and Knowledge Discovery, Vol. 2, No. 2, pp. 121-167, 1998.
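
As a rough illustration of the two-pattern analytic step, here is a simplified SMO-style trainer. It is a sketch under stated assumptions, not Platt's full algorithm: the partner-selection rule (largest error difference first), the function names, the toy data, and the parameter defaults are all choices made for this example. The `kernel` argument shows where a nonlinear kernel such as the hypothetical `rbf_kernel` below could be swapped in.

```python
import math

def linear_kernel(u, v):
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian kernel; pass it as `kernel` to obtain a nonlinear boundary."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def train_smo(X, y, C=1.0, tol=1e-3, max_sweeps=100, kernel=linear_kernel):
    """Simplified SMO: repeatedly optimize two multipliers analytically."""
    n = len(X)
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha, b = [0.0] * n, 0.0

    def error(i):
        # Current decision value on sample i minus its label.
        return sum(alpha[k] * y[k] * K[k][i] for k in range(n)) + b - y[i]

    for _ in range(max_sweeps):
        changed = 0
        for i in range(n):
            E_i = error(i)
            violates = (y[i] * E_i < -tol and alpha[i] < C) or \
                       (y[i] * E_i > tol and alpha[i] > 0.0)
            if not violates:
                continue
            # Assumed heuristic: try partners with the largest error gap first.
            partners = sorted((j for j in range(n) if j != i),
                              key=lambda j: -abs(E_i - error(j)))
            for j in partners:
                E_j = error(j)
                # Box constraints keep both multipliers in [0, C] on the line
                # y_i * alpha_i + y_j * alpha_j = constant.
                if y[i] != y[j]:
                    low = max(0.0, alpha[j] - alpha[i])
                    high = min(C, C + alpha[j] - alpha[i])
                else:
                    low = max(0.0, alpha[i] + alpha[j] - C)
                    high = min(C, alpha[i] + alpha[j])
                eta = K[i][i] + K[j][j] - 2.0 * K[i][j]
                if low >= high or eta <= 0.0:
                    continue
                # Closed-form optimum for alpha_j, clipped to [low, high].
                a_j = min(high, max(low, alpha[j] + y[j] * (E_i - E_j) / eta))
                if abs(a_j - alpha[j]) < 1e-8:
                    continue
                a_i = alpha[i] + y[i] * y[j] * (alpha[j] - a_j)
                b1 = (b - E_i - y[i] * (a_i - alpha[i]) * K[i][i]
                      - y[j] * (a_j - alpha[j]) * K[i][j])
                b2 = (b - E_j - y[i] * (a_i - alpha[i]) * K[i][j]
                      - y[j] * (a_j - alpha[j]) * K[j][j])
                if 0.0 < a_i < C:
                    b = b1
                elif 0.0 < a_j < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                alpha[i], alpha[j] = a_i, a_j
                changed += 1
                break
        if changed == 0:
            break
    return alpha, b

def decision(x, X, y, alpha, b, kernel=linear_kernel):
    return sum(a * t * kernel(p, x) for a, t, p in zip(alpha, y, X)) + b

# Toy, linearly separable data (hypothetical).
X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -2.0), (-3.0, -3.0)]
y = [1, 1, -1, -1]
alpha, b = train_smo(X, y)
predictions = [1 if decision(x, X, y, alpha, b) > 0 else -1 for x in X]
```

On this toy data only the two points closest to the boundary end up with nonzero multipliers, which is what makes them the support vectors. Because each step touches just two multipliers, no large quadratic programming solver or full-size working matrix is needed during the update itself.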

Rune Fisker: Training, Initialization and Optimization of Deformable Template Models

The deformable model literature has in general been very focused on the formulation and development of new models or the solution of a specific application. Training of conditional parameters, including weight parameters, and the final and crucial steps of initialization and optimization of the deformable model, needed for making inference, have received very little attention. During the talk I will review previously used methods and present the work I have been doing on these subjects. As a part of the talk I will present the deformable model proposed by Grenander et al. A copy of a paper describing the model can be found in the PRIP lab.

Dr. Jiang Yu Zheng: Virtual Recovery of Excavated Relics

This work aims at a virtual recovery of excavated archaeological finds in cyberspace for ancient relic preservation, archaeology research, and multimedia content generation. First, we develop an imaging device to digitize damaged pieces in the form of 3-D shape and surface texture. Then we build an interface for connecting broken fragments in a virtual space so that the original model can be visually recovered. The idea of virtual recovery provides a new opportunity and flexibility for archaeologists to examine complex damaged relics. Moreover, the virtually recovered objects can be directly displayed in a multimedia format. Experiments have been carried out at a UNESCO World Heritage site in Xi'an, China.

Rein-Lien Hsu: Multiresolution Model Compression Using 3-D Wavelets

Three-dimensional (3-D) objects are often represented by geometric models in applications dealing with virtual reality, augmented reality, and cyberspace. Surface representations can provide an effective visualization of these objects. Polygonal models are the most prevalent type among surface representations. Recently, multiresolution representation (surface simplification) of polygonal models has been proposed to meet the requirements of easy manipulation, progressive transmission, effective visualization, and economical storage. In this talk, I will briefly introduce our proposed framework for multiresolution modeling, which is based on 3-D wavelet transforms. In this framework, we utilize a volumetric surface model that can be compressed simultaneously at multiple levels of detail (LODs), and a surface in 3-D space is treated as an extension of an edge in 2-D space. The presentation also addresses two techniques that further improve the compression efficiency of this framework: lattice vector quantization and arithmetic coding, both of which are applied to the compact wavelet coefficients.
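
To give a feel for why a wavelet transform of a volumetric model compresses well, the sketch below applies one level of a separable 3-D Haar transform to a small voxel grid. The Haar filter, the averaging normalization, the `volume[z][y][x]` layout, and the toy slab are assumptions made for this example; the abstract does not say which wavelet or data layout the framework uses.

```python
def haar_1d(values):
    """One Haar level: pairwise averages first, then pairwise half-differences."""
    pairs = list(zip(values[0::2], values[1::2]))
    return [(a + b) / 2 for a, b in pairs] + [(a - b) / 2 for a, b in pairs]

def haar_3d(volume):
    """One level of a separable 3-D Haar transform on volume[z][y][x] (even sizes)."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    out = [[row[:] for row in plane] for plane in volume]
    for z in range(nz):                      # filter along x
        for y in range(ny):
            out[z][y] = haar_1d(out[z][y])
    for z in range(nz):                      # filter along y
        for x in range(nx):
            column = haar_1d([out[z][y][x] for y in range(ny)])
            for y in range(ny):
                out[z][y][x] = column[y]
    for y in range(ny):                      # filter along z
        for x in range(nx):
            column = haar_1d([out[z][y][x] for z in range(nz)])
            for z in range(nz):
                out[z][y][x] = column[z]
    return out

# Hypothetical 4x4x4 occupancy grid: the lower half (z < 2) is solid.
slab = [[[1.0 if z < 2 else 0.0 for x in range(4)] for y in range(4)] for z in range(4)]
coefficients = haar_3d(slab)
nonzero = sum(1 for plane in coefficients for row in plane for c in row if c != 0.0)
```

For this smooth toy volume most coefficients are zero after one level, so only a few values need to be stored or transmitted. Repeating the transform on the low-pass block (the first half of each axis) would give coarser levels of detail.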

Scott D. Connell: Online Handwriting Recognition Using Multiple Class Models

A writer-independent handwriting recognition system must be able to recognize a wide variety of handwriting styles, while attempting to obtain a high degree of accuracy when recognizing data from any one of those styles. As the number of writing styles increases, so does the variability of the data's distribution. We then have an optimization problem: how to best model the data, while keeping the representation as simple as possible? If we can identify N different styles of writing individual characters (referred to as lexemes), these can then be modeled as N relatively simple independent distributions. In this talk, I discuss a method of automatically identifying lexemes, and present results using both non-parametric and parametric lexeme modeling methods. In addition, a new method of adapting models to better fit a single writer (writer adaptation) is described.

Yilu Zhang: Context-based decision-making for Living Machine

"Living machine" is a computational model that emulates the learning process of biological systems. One of the capabilities it should possess is making decisions based on temporal context instead of a snapshot of the real, dynamic world. In this talk, I am going to discuss the framework and the mechanism that enable this capability. Some preliminary experiments on synthetic and real speech data will show its effectiveness.

Arun Ross: Towards building a system that describes saccades over constrained images

In this talk we will examine the various factors controlling eye movement behavior during scene viewing. In particular, we will look into the low-level visual information that affects the first few fixations of the eye as it regards a scene. Understanding eye movements is essential for gaining insight into how humans acquire, represent, and store information. Although it is clear that saccades are driven by the visual task presented to the subject, it has been suggested that the first few fixations are driven predominantly by visual features present in the image. Later fixations are driven by other cognitive features. We will also describe the saliency map framework, a model that attempts to mimic human eye movement during scene viewing.

Reference: Henderson, John M. and Andrew Hollingworth, "Eye Movements During Scene Viewing: An Overview", in "Eye Guidance in Reading and Scene Perception", G. Underwood (Ed), 1998.