Organizational meeting (call for speakers)

Date: September 5, 1997
Time: 2:00-3:00
Location: EB1535


Automatic Personal Identification by Integrating Faces and Fingerprints.

Speaker:
Lin Hong
Date: September 12, 1997
Time: 2:00-3:00
Location: EB1535

SHOSLIF with States for Vision-Based Navigation.

Speaker:
Shaoyun Chen
Date: September 19, 1997
Time: 2:00-3:00
Location: EB1535
Abstract

An appearance-based navigation scheme with state feedback is presented. A prediction tree, the stochastic recursive partition tree (SRPT), maps the preprocessed raw input image and the current state to the next state and the output control signal. A finite state machine represents the different navigation states, and a simple visual attention mechanism is embedded in the system. The SRPT learns incrementally: an input pair is rejected or learned ``on-the-fly'' depending on whether the prediction tree already performs well on the current learning sample. A brief comparison with other state-of-the-art ANN-based navigation systems will be presented. The proposed scheme has been successfully applied to indoor navigation. Using a set of 420 learning samples, each a preprocessed raw input image with its associated state, our robot was able to navigate a corridor loop with six turns of various types, several straight segments of different widths, and hallway doors.
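
The reject-or-learn rule described in this abstract can be illustrated with a minimal Python sketch. It is not the SRPT: a plain nearest-neighbor lookup stands in for the prediction tree, and the class name, the `tolerance` threshold, and the use of Euclidean distance are assumptions made only for illustration.

```python
import math


class IncrementalLearner:
    """Nearest-neighbor stand-in for a prediction tree (not the SRPT itself)."""

    def __init__(self, tolerance):
        # Hypothetical threshold: largest output error still counted as "performing well".
        self.tolerance = tolerance
        self.samples = []  # stored (input_vector, output_vector) pairs

    def predict(self, x):
        """Return the output stored with the closest input, or None if nothing is stored."""
        if not self.samples:
            return None
        _, y = min(self.samples, key=lambda s: math.dist(s[0], x))
        return y

    def observe(self, x, y):
        """Learn (x, y) only if the current prediction is poor. Return True if learned."""
        predicted = self.predict(x)
        if predicted is not None and math.dist(predicted, y) <= self.tolerance:
            return False  # prediction is already good enough: reject the sample
        self.samples.append((tuple(x), tuple(y)))
        return True
```

Here `x` would play the role of the preprocessed image together with the current state, and `y` the next state together with the control signal. A sample that the learner already predicts within `tolerance` is discarded, so the stored set grows only where the current model is wrong.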

An Improved Active Shape Model: Handling Occlusion and Outliers.

Speaker:
Nicolae Duta
Date: September 26, 1997
Time: 2:00-3:00
Location: EB1535
Abstract

During the past several years, there have been numerous attempts to build models describing the shape and appearance of non-rigid objects and to employ them for automated identification of such objects in images. Among them, the Point Distribution Model designed by Cootes and Taylor, which represents the variation of a set of shapes around their average, has many favorable shape representation properties. The goal of the work reported here is to improve the Active Shape procedure introduced by Cootes and Taylor, which uses point distribution models to find new examples of previously learned shapes. This approach is particularly useful when variations in shape and appearance are difficult to model, as is often the case with non-rigid objects or when point-of-view perspective is involved. The method is applicable to most tasks involving deformable shape analysis.
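
For readers unfamiliar with the Point Distribution Model, its usual formulation can be summarized as below. This is a sketch of the general model only, not of the improvements discussed in the talk. The symbols (n landmarks, N training shapes, t retained modes) are introduced here for illustration, and both the normalization of the covariance and the limit of three standard deviations are common choices rather than fixed parts of the model.

```latex
% Each training shape is a vector of n aligned landmark coordinates.
\mathbf{x}_i = (x_{i1}, y_{i1}, \ldots, x_{in}, y_{in})^{T}, \qquad i = 1, \ldots, N

% Mean shape and covariance of the training set.
\bar{\mathbf{x}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{x}_i, \qquad
\mathbf{S} = \frac{1}{N-1} \sum_{i=1}^{N}
    (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^{T}

% Eigenvectors p_k of S with the t largest eigenvalues lambda_k span the allowed variation.
\mathbf{S}\,\mathbf{p}_k = \lambda_k \mathbf{p}_k, \qquad
\mathbf{P} = (\mathbf{p}_1, \ldots, \mathbf{p}_t)

% A shape is the mean plus a weighted sum of these modes, with bounded weights.
\mathbf{x} \approx \bar{\mathbf{x}} + \mathbf{P}\mathbf{b}, \qquad
|b_k| \le 3\sqrt{\lambda_k}
```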

Face tracking in real-time.

Speaker:
Vera Bakic
Date: October 3, 1997
Time: 2:00-3:00
Location: EB1535
Abstract

A system for real-time tracking of the face of a workstation user is described. The long-term goal of the project is to provide better man-machine interaction, teleconferencing, and face modeling and recognition. The program detects the subject's eyes and nose. Using three feature points defined from the eyes and nose, the pose of the known subject can be computed with an existing P3P algorithm. In future work, continuous updating over time will enable the subject's pose to be tracked, which in turn will enable the use of head pose to position the mouse pointer at the desired location on the screen, or to condition the interpretation of facial gestures. Effort has been made to ensure that the system can operate under normal lighting conditions with simple eye-cameras that can be attached to most workstations.

Three-Dimensional Sound Localization Using a Compact Free-standing Non-Coplanar Array of Microphones.

Speaker:
Kamen Guentchev
Date: October 10, 1997
Time: 2:00-3:00
Location: EB1535
Note: Cancelled. Speaker is ill.
Abstract

Research in sound localization has largely concentrated on developing stationary solutions: massive microphone arrays or groups of arrays placed on room walls. Portability and compactness of the devices have seldom been goals. This approach permits approximations that simplify the problem under the assumptions made. It also allows limited computing power to be made up for by using powerful stand-alone dedicated devices or by adding additional cues, e.g. visual aids. In the presented project the problem is approached with no restrictions on the domain of recognition: the sensors occupy a compact region of space, can be placed arbitrarily (free-standing) with three degrees of freedom, and will detect sound of any type (of reasonable duration) coming from any location. Background noise is assumed to be present at reasonable levels. The traditional two-part approach is used for recognition: first a number of cues (the features) are extracted, then the 3D location of the source is estimated. In the first part, in addition to the commonly used Interaural Time Difference (ITD), another acoustic cue is used: the Interaural Level Difference (ILD), which is difficult to include explicitly in the computations but which humans are known to use to compensate for the lack of information in the ITD at some frequencies. This is made possible by the approach used for the second part, the 3D localization: learning. The method requires training that covers the whole probed area with sufficient density and 3D precision. The preliminary experimental data shows angular precision higher than the training density, which approached 15 angular degrees, and precision in distance of around 44%. The implementation uses a low-end A/D board with preamplifiers on the 4 condenser microphones. The software is object-oriented and written in Microsoft Visual C++.
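
The two cues named in this abstract can be illustrated for a single pair of microphones with the following Python sketch. It is only a sketch under stated assumptions: the function names, the brute-force cross-correlation, and the RMS-based level ratio are chosen here for illustration and are not taken from the speaker's Visual C++ implementation, which uses four microphones and a learned mapping from cues to 3D location.

```python
import math


def estimate_delay(first, second, max_lag):
    """Return the lag in samples at which `second` best matches `first`.

    A positive result means `second` is delayed relative to `first`.
    Uses brute-force cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(first):
            j = i + lag
            if 0 <= j < len(second):
                score += value * second[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def time_difference(first, second, sample_rate, max_lag):
    """Time-difference cue in seconds for one microphone pair."""
    return estimate_delay(first, second, max_lag) / sample_rate


def level_difference_db(first, second):
    """Level-difference cue in decibels, from the ratio of RMS amplitudes."""
    rms_first = math.sqrt(sum(v * v for v in first) / len(first))
    rms_second = math.sqrt(sum(v * v for v in second) / len(second))
    return 20.0 * math.log10(rms_first / rms_second)
```

With four microphones, each of the six microphone pairs would yield one time difference and one level difference, and the resulting vector of cues would be the input to whatever learned localizer is used.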

Computer Vision Via Spatial Spectroscopy

Speaker: James M. Coggins (Ph.D. 1983, MSU CPS)
Date: October 16, 1997
Time: 2:00-3:30
Location: EB3400

Note: This is a special speaker engagement, not at the normal seminar time. Please make special note of the time and room.

Recent advances in medical image analysis have yielded a new vision of what computer vision is about. Spatial Spectroscopy will be introduced and related to several threads in image analysis research. A consequence of this research has been to revive interest in, and the importance of, statistical pattern recognition methods as essential tools for investigating deep image structure in both mathematically motivated and application-motivated ways. The development of new, more flexible, and more efficient density estimation techniques will be an essential short-term prerequisite for progress in this area.

About the speaker:
James Coggins is an Associate Professor and the Associate Chairman for Academic Affairs in the Department of Computer Science at the University of North Carolina at Chapel Hill. He received his Ph.D. in Computer Science from MSU in 1983 (Dr. Anil Jain was his dissertation advisor). His research is in image analysis, mostly for medical applications, and especially in statistical methods for extracting multiscale geometric descriptions of images.


Silversmith: An expert system for Norwegian Silver.

Speaker:
Karissa Miller
Date: October 17, 1997
Time: 2:00-3:00
Location: EB1535
Abstract

Today, the resources available from the personal workspace and the office are expanding to include a host of resources on the Internet; digital libraries are bringing large collections to the desktop, and network bandwidth and computer power continue to push the boundaries of our ability to communicate in real time and to educate with computers. As technology has improved, so has the feasibility of automated methods that can aid in the dating and archiving of works of art. The photograph or hand-drawn painting, once the standard medium for art historians, is being replaced by its more permanent, accurate, and increasingly affordable counterpart: the high-resolution digital image. Simultaneously, the ability of computerized techniques to recognize textures, shapes, and even complex objects has improved immensely. The computer is becoming a powerful tool for both recording and analyzing the world around us. Art history is just one of the fields that can benefit from these resources.

Kroepelien has investigated the possibility and demonstrated the feasibility of creating an expert system that can aid in the analysis and dating of works of art [1]. Silversmith [2] is the embodiment of such a system, combining simple computer vision techniques with an expert system to analyze and classify old Norwegian silver tankards from the 15th and 16th centuries. Both rare and precious, these tankards are among the finest examples of craftsmanship to come out of Norway during this time. Expertise regarding these tankards is equally rare, requiring a lifetime of study to develop the skills and keen eye necessary for their analysis. Therefore, this domain is an excellent testing ground for creating an expert system that can readily classify unknown works of art -- in this case, silver tankards.

This talk will provide a brief overview of Silversmith as well as some of the techniques and approaches I have used in creating the initial prototype expert system based on the knowledge acquired and recorded by Kroepelien. This includes discussion about the construction of a hierarchical classification system following the Generic Task approach of Chandrasekaran and others [3], the use of tools created by the Intelligent Systems Lab here at MSU, and simple computer vision techniques employed in analyzing images of the tankards to automatically extract several of the attributes used by the expert system in classifying unknown silver tankards.

References
[1] Britt Kroepelien. ``From Style to Algorithm.'' PhD thesis, University of Bergen, Bergen, 1994.
[2] Britt Kroepelien. ``Norwegian silver, a computer-based expert system for silver tankards 1580-1740.'' Technical report, University of Bergen, Bergen, Norway, 1996.
[3] B. Chandrasekaran. ``Generic tasks in knowledge-based reasoning: high-level building blocks for expert system design.'' IEEE Expert, 1:23-30, Fall 1986.


Handprinted Chinese Character Recognition by Structure Verification.

Speaker:
Jinhui Liu
Date: October 24, 1997
Time: 2:00-3:00
Location: EB1535
Abstract

Structural pattern recognition for Handprinted Chinese Character Recognition (HCCR) has important practical and theoretical value. Working in this direction, Jinhui Liu has conducted the following research in his thesis:

SHOSLIF, past, present, and future.

Speaker:
Hamid Alavi
Date: October 31, 1997
Time: 2:00-3:00
Location: EB1535
Abstract

The Self-Organizing Hierarchical Optimal Subspace Learning and Inference Framework (SHOSLIF) has been used in the PRIP lab for a number of different projects. The self-organization capability of SHOSLIF makes it a well-suited framework for autonomous learning. The Self-organizing Autonomous Incremental Learner (SAIL), an ongoing project in the PRIP lab, is an attempt to develop intelligent behavior and is the successor of SHOSLIF. In this talk an overview and summary of SHOSLIF will be presented and the SAIL algorithm will be discussed.

Query by Video Clip.

Speaker:
Aditya Vailaya
Date: November 7, 1997
Time: 2:00-3:00
Location: EB1535
Abstract

Digital video libraries are generating tremendous interest in the pattern recognition, computer vision, and multimedia research communities. Various systems have been proposed in the recent literature for content-based video retrieval based on automatically identified shots in the video. While there has been substantial progress in video database retrieval based on shot detection, an emphasis on retrieval based on a video clip is lacking. We propose two novel schemes for query by video clip. Our first approach, (i) retrieval based on key frames, follows the representation used in traditional systems: identifying shots, computing key frames from the video, and then extracting image features around the key frames. For each key frame in the query, a similarity value (based on color, texture, and motion) is associated with the shots in the database video clips. The similarity values are then used to identify shots that are similar to the query key frames and lie in close proximity to each other. Boundaries marking these regions are then used to generate the set of retrieved video clips. In our second approach, (ii) retrieval using sub-sampled frames, we diverge from the traditional approach of representing video via key frames. Instead of identifying shot boundaries and extracting key frames, we uniformly sub-sample the query clip as well as the database video clips. Color and texture features are then extracted from the sub-sampled frames representing the clip. Retrieval is based on matching the combined color and texture features of the sub-sampled frames.
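
The second approach, retrieval using sub-sampled frames, can be illustrated with a minimal Python sketch. Several simplifications are assumed here and are not taken from the talk: each frame is a flat list of already quantized color indices, only a color histogram is used (no texture feature), histogram intersection is the similarity measure, and the query is matched by sliding it along the database clip. All function names are hypothetical.

```python
def subsample(frames, step):
    """Keep every `step`-th frame of a clip."""
    return frames[::step]


def histogram(frame, bins):
    """Normalized color histogram of a non-empty frame given as a list of bin indices."""
    counts = [0] * bins
    for value in frame:
        counts[value] += 1
    return [c / len(frame) for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical normalized histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def best_match(query, video, step, bins):
    """Slide the sub-sampled query along the sub-sampled video.

    Returns (start_frame, score), where start_frame indexes the original
    video and score is the mean histogram intersection over aligned frames.
    Returns (None, -1.0) if the query is longer than the video.
    """
    q = [histogram(f, bins) for f in subsample(query, step)]
    v = [histogram(f, bins) for f in subsample(video, step)]
    best = (None, -1.0)
    for start in range(len(v) - len(q) + 1):
        window = v[start:start + len(q)]
        score = sum(intersection(a, b) for a, b in zip(q, window)) / len(q)
        if score > best[1]:
            best = (start * step, score)
    return best
```

Because no shot boundaries or key frames are computed, the cost of matching depends only on the sub-sampling `step` and the clip lengths.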

Vision and Motion Planning for a Mobile Robot under Uncertainty.

Speaker:
Dr. Jun Miura
Date: November 14, 1997
Time: 2:00-3:00
Location: EB1535

Contact information:
Dr. Jun Miura is from the Dept. of Computer-Controlled Mechanical Systems, Osaka University, Japan.

Abstract

There has been increasing interest in autonomous mobile robots that recognize their environment with vision and move without the guidance of human operators. A key to realizing such a robot is the ability to generate a plan of vision and motion operations so that the robot can efficiently reach its destination. One important issue in designing a planning algorithm for such a robot is uncertainty in visual information and robot motion. Planning algorithms that do not consider these uncertainties will likely fail or be inefficient in real-world applications.

In this talk, I would like to present two examples of planning for a vision-based mobile robot that take these uncertainties into account.

One is concerned with the generation of observation points in a 2-D workspace for selecting routes. In addition to the uncertainty of visual information, the cost of visual recognition is considered, and the optimal set of observation points is generated which minimizes the expectation of the total cost for reaching the destination.

The other is concerned with controlling the speed of visual feedback movement of a robot in following a given route. Based on both the uncertainty in self-localization of the robot and the location and the shape of nearby obstacles, the speed is adaptively controlled so that the robot can follow the route safely and efficiently.

In addition to the above topics, I would like to briefly present some of the other research projects of our group (Prof. Shirai's group) at Osaka University.


Speaker: Scott Connell
Date: November 21, 1997
Time: 2:00-3:00
Location: EB1535

Classification of Text Documents

Speaker:
Yonghong Li
Date: December 5, 1997
Time: 2:00-3:00
Location: EB1535

Abstract

The WWW is a huge information gallery that is widely distributed and dynamic in nature. However, the rapid growth of the Internet has also made it increasingly difficult for users to locate relevant information quickly on the Web. This has led to a great deal of interest in developing useful and efficient tools and software to assist users in searching the Web. Document retrieval, categorization, routing, and filtering systems (agents) are often based on text classification. Document classification often requires working with a high-dimensional feature space and sparse data. We apply different learning algorithms (naive Bayes, nearest neighbor, decision tree, and the subspace method) to document classification (Yahoo news groups). We also study the effect of feature dimension reduction and of combining different classifiers.
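
Of the learning algorithms listed in this abstract, naive Bayes is the simplest to illustrate. The Python sketch below is a generic multinomial naive Bayes classifier with add-one smoothing. It is not the speaker's implementation: the whitespace tokenization, the smoothing, and the class and method names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over whitespace-separated, lowercased words."""

    def __init__(self):
        self.class_docs = Counter()              # number of documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()                       # all words seen in training

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            words = text.lower().split()
            self.class_docs[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        """Return the class with the highest log posterior for `text`."""
        words = text.lower().split()
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            # Add-one (Laplace) smoothing keeps unseen words from zeroing the product.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

The feature space here is the full vocabulary, which shows why the abstract stresses high dimensionality and sparsity: every document uses only a handful of the words that the model must track.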