Current Projects Abstracts

MRF Models of Faces

Dr Sarat Dass
Xiaoguang Lu
Dr. Anil Jain

The spatial distribution of gray level intensities in an image can be naturally modeled using Markov Random Field (MRF) models. We develop and investigate the performance of face detection algorithms derived from MRF considerations. For enhanced detection, the MRF models are defined for every permutation of site indices in the image. We find the optimal permutation that provides maximum discriminatory power to distinguish faces from nonfaces. The MRF models successfully detect faces in a number of test images in real time.

  • Sarat C. Dass and A. K. Jain, "Markov Face Models", The Eighth IEEE International Conference on Computer Vision (ICCV), pp. 680-687, Vancouver, Canada, July 9-12, 2001.

    Clustering and Feature Selection

    Martin Law
    Dr. Anil Jain
    Dr. Mario Figueiredo

    This work proposes an unsupervised algorithm for learning a finite mixture model from multivariate data. The adjective "unsupervised" is justified by two properties of the algorithm: (i) it is capable of selecting the number of components, and (ii) unlike the standard expectation-maximization (EM) algorithm, it does not require careful initialization. The proposed method also avoids another drawback of EM for mixture fitting: the possibility of convergence towards a singular estimate at the boundary of the parameter space. The novelty of our approach is that we do not use a model selection criterion to choose one among a set of pre-estimated candidate models; instead, we seamlessly integrate estimation and model selection in a single algorithm. Our technique can be applied to any type of parametric mixture model for which it is possible to write an EM algorithm. In our first paper, we illustrate it with experiments involving Gaussian mixtures. These experiments testify to the good performance of our approach. This approach is extended to perform feature selection -- the selection of "good" variables -- for learning a mixture model in an unsupervised setting. Feature selection in unsupervised learning is much more difficult than its counterpart in supervised learning because of the lack of class labels. By treating the relevance of each feature as a Bernoulli random variable, we obtain an EM algorithm that estimates both the number of components and the importance of the features simultaneously. A complementary approach based on a "wrapper" around the standard EM mixture learning algorithm is also proposed for feature selection in unsupervised learning. The optimal feature subset size is determined automatically by the entropy of assignment rather than being adjusted manually. Our experimental results show that the proposed methods can be useful for many real-world data sets.

  • Martin Law, Mario Figueiredo and Anil Jain, "Feature Selection in Mixture-Based Clustering", in Advances in Neural Information Processing Systems 15 (NIPS 2002), Vancouver, Dec 2002.
  • Mario Figueiredo and Anil Jain, "Unsupervised Learning of Finite Mixture Models", IEEE Transactions on PAMI, Vol. 24, No. 3, March 2002, pp. 381-396.

    Dr. Anil Jain
    Dr. Ana Fred
    Alexander Topchy

    We explore the idea of evidence accumulation for combining the results of multiple clusterings. Initially, a set of n d-dimensional patterns is decomposed into a large number of compact clusters; the K-means algorithm performs this decomposition, with several clusterings obtained by N random initializations of K-means. Taking the co-occurrences of pairs of patterns in the same cluster as votes for their association, the data partitions are mapped into a co-association matrix of patterns. This n x n matrix represents a new similarity measure between patterns. The final clusters are obtained by applying an MST-based clustering algorithm to this matrix. Results on both synthetic and real data show the ability of the method to identify arbitrarily shaped clusters in multidimensional data.
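
    The voting scheme described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the published implementation: the function names, the fixed number of K-means iterations, and the 0.5 cut threshold are all assumptions. Thresholding the co-association matrix and taking connected components yields the same clusters as cutting the weak edges of a spanning tree built on that matrix (single-link clustering), which is used here as a stand-in for the MST-based step.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def co_association(points, k, n_runs, seed=0):
    """Fraction of the n_runs clusterings in which each pair shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        labels = kmeans(points, k, rng)
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    return [[1.0 if i == j else counts[i][j] / n_runs for j in range(n)]
            for i in range(n)]


def final_clusters(votes, threshold=0.5):
    """Link pairs whose co-association exceeds the (assumed) threshold and
    return the connected components as cluster labels."""
    n = len(votes)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if votes[i][j] > threshold:
            parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    order = sorted(set(roots))
    return [order.index(r) for r in roots]
```

    Calling co_association(points, k, n_runs) followed by final_clusters on the result gives one label per pattern; the number of final clusters is determined by the threshold, not by k.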

  • Alexander Topchy, Anil K. Jain, and William Punch, "Combining Multiple Weak Clusterings", To appear in Proc. IEEE International Conf. Data Mining, Melbourne, Florida, USA, November 2003.
  • Ana L.N. Fred and Anil K. Jain, "Evidence Accumulation Clustering based on the K-means algorithm", Proc. Structural and Syntactic Pattern Recognition (SSPR), Windsor, Canada, August 2002.
  • Ana L.N. Fred and Anil K. Jain, "Data Clustering Using Evidence Accumulation", Proc. International Conference on Pattern Recognition (ICPR), Quebec City, August 2002.
  • A. K. Jain, M. N. Murty and P. J. Flynn, "Data Clustering: A Review", ACM Computing Surveys, Nov 1999.
  • A. K. Jain and R. C. Dubes. Algorithms for clustering data. Prentice Hall, 1988.

    A more detailed clustering page can be found here.

    Face Modeling for Recognition

    Vincent Hsu
    Xiaoguang Lu
    Dirk Colbry
    Dr. Anil Jain

    Project WWW page

    3D human face models have been widely used in applications such as face recognition, facial expression recognition, human action recognition, head tracking, facial animation, video compression/coding, and augmented reality. Modeling human faces provides a potential solution to the variations encountered in human face images. We propose a method of modeling human faces based on a generic face model (a triangular mesh model) and individual facial measurements containing both shape and texture information. The modeling method adapts a generic face model to the given facial features, extracted from registered range and color images, in a global-to-local fashion. It iteratively moves the vertices of the mesh model to smooth the non-feature areas, and uses 2.5D active contours to refine feature boundaries. The resultant face model has been shown to be visually similar to the true face. Initial results show that the constructed model is quite useful for recognizing profile views.

  • X. Lu, Y. Wang and A. K. Jain, "Combining Classifiers for Face Recognition", Proc. ICME 2003, IEEE International Conference on Multimedia & Expo, vol. III, pp. 13-16, Baltimore, MD, July 6-9, 2003.
  • X. Lu and A. K. Jain, "Resampling for Face Recognition", Proc. of 4th Int'l Conf. on Audio- and Video-Based Biometric Person Authentication (AVBPA), pp. 869-877, Guildford, UK, June 9-11, 2003.

    Face Detection in Color Images

    Vincent Hsu
    Dr. Anil Jain

    Project WWW page

    Human face detection is often the first step in applications such as video surveillance, human-computer interfaces, face recognition, and image database management. We propose a face detection algorithm for color images in the presence of varying lighting conditions as well as complex backgrounds. Our method detects skin regions over the entire image, and then generates face candidates based on the spatial arrangement of these skin patches. The algorithm constructs eye, mouth, and boundary maps for verifying each face candidate. Experimental results demonstrate successful detection over a wide variety of facial variations in color, position, scale, rotation, pose, and expression from several photo collections.

    Fingerprint Mosaicking

    Arun Ross
    Dr. Anil Jain

    Project WWW page

    It has been observed that the reduced contact area offered by solid-state fingerprint sensors does not provide sufficient information (e.g., minutiae) for high accuracy user verification. Further, multiple impressions of the same finger acquired by these sensors may have only a small region of overlap, thereby affecting the matching performance of the verification system. To deal with this problem, we suggest a fingerprint mosaicking scheme that constructs a composite fingerprint image using multiple impressions. In the proposed algorithm, two impressions of a finger are initially aligned using the corresponding minutiae points. This alignment is used by the well-known iterative closest point (ICP) algorithm to compute a transformation matrix that defines the spatial relationship between the two impressions. The transformation matrix is used in two ways: (a) the two impressions are stitched together to generate a composite image, and minutiae points are then detected in this composite image; (b) the minutia maps obtained from each of the individual impressions are integrated to create a larger minutia map. The availability of a composite template improves the performance of the fingerprint matching system, as is demonstrated in our experiments.
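
    As a rough illustration of the alignment step, the sketch below estimates a 2D rotation and translation between two point sets and refines it with a basic ICP loop. It is a simplification under stated assumptions: it works on 2D point lists only (for example, minutiae coordinates), uses brute-force nearest-neighbor matching, and all function names are hypothetical. The data that ICP actually operates on in the project is not specified here.

```python
import math


def rigid_transform_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    s_cross = s_dot = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # centered source point
        bx, by = qx - dx, qy - dy  # centered destination point
        s_cross += ax * by - ay * bx
        s_dot += ax * bx + ay * by
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the destination centroid.
    return theta, (dx - (c * sx - s * sy), dy - (s * sx + c * sy))


def apply_transform(points, theta, t):
    """Rotate points by theta about the origin, then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]


def icp(src, dst, iters=10):
    """Basic ICP: alternate nearest-neighbor matching with a rigid re-fit.
    Returns the accumulated rotation angle and translation."""
    theta_total, t_total = 0.0, (0.0, 0.0)
    current = list(src)
    for _ in range(iters):
        matched = [
            min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in current
        ]
        theta, t = rigid_transform_2d(current, matched)
        current = apply_transform(current, theta, t)
        # Compose: x -> R(theta) * (R_total * x + t_total) + t
        c, s = math.cos(theta), math.sin(theta)
        t_total = (c * t_total[0] - s * t_total[1] + t[0],
                   s * t_total[0] + c * t_total[1] + t[1])
        theta_total += theta
    return theta_total, t_total
```

    Like any ICP variant, this loop only converges to the right answer when the starting alignment is already close, which is why an initial minutiae-based alignment matters.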

  • A. K. Jain and A. Ross, "Fingerprint Mosaicking", Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Orlando, Florida, May 13-17, 2002.

    Hybrid Fingerprint Matcher

    Arun Ross
    Dr. Anil Jain

    Project WWW page

    Most fingerprint matching systems rely on the distribution of minutiae on the fingertip to represent and match fingerprints. While the ridge flow pattern is generally used for classifying fingerprints, it is seldom used for matching. This work describes a hybrid fingerprint matching scheme that uses both minutiae and ridge flow information to represent and match fingerprints. A set of 8 Gabor filters, whose spatial frequencies correspond to the average inter-ridge spacing in fingerprints, is used to capture the ridge strength at equally spaced orientations. A square tessellation of the filtered images is then used to construct an eight-dimensional feature map, called the ridge feature map. The ridge feature map along with the minutiae set of a fingerprint image is used for matching purposes. The proposed technique has the following features: (i) the entire image is taken into account in constructing the ridge feature map, and every tessellated cell is equally weighted; (ii) minutiae matching is used to determine the affine transformation parameters relating the query and the template images for ridge feature map extraction; (iii) filtering and ridge feature map extraction are implemented in the frequency domain thereby speeding up the matching process; (iv) filtered query images are cached to greatly increase the one-to-many matching speed. The hybrid matcher performs better than a minutiae-based fingerprint matching system. The genuine accept rate of the hybrid matcher is observed to be ~10% higher than that of a minutiae-based system at low FAR values. Fingerprint verification (one-to-one matching) using the hybrid matcher on a Pentium III, 800 MHz system takes ~1.4 seconds, while fingerprint identification (one-to-many matching) involving 1,000 templates takes ~0.2 seconds per match.
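
    The ridge feature map idea can be illustrated with a small spatial-domain sketch. Several details are assumptions made for illustration only: the kernel size and sigma, the spatial frequency of 0.1 cycles per pixel, the 8-pixel cell size, and the use of mean absolute filter response as the per-cell feature. The project itself filters in the frequency domain for speed; the direct filtering below is used only because it is easier to read.

```python
import math


def gabor_kernel(theta, freq, sigma=4.0, half=8):
    """Even-symmetric Gabor kernel whose wave varies along direction theta
    (that is, theta is normal to the ridges it responds to)."""
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * c + y * s  # coordinate along the wave direction
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * math.cos(2.0 * math.pi * freq * xr))
        kernel.append(row)
    # Remove the mean so that flat regions give (almost) no response.
    mean = sum(map(sum, kernel)) / (2 * half + 1) ** 2
    return [[v - mean for v in row] for row in kernel]


def filter_valid(image, kernel):
    """Correlate image with kernel, keeping only fully covered positions."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for r in range(h - k + 1):
        row = []
        for c in range(w - k + 1):
            acc = 0.0
            for i in range(k):
                img_row, ker_row = image[r + i], kernel[i]
                for j in range(k):
                    acc += ker_row[j] * img_row[c + j]
            row.append(acc)
        out.append(row)
    return out


def ridge_feature_map(image, freq=0.1, n_orient=8, cell=8):
    """Return fmap[row][col] = list of n_orient mean absolute responses,
    one per equally spaced orientation in [0, pi)."""
    maps = []
    for o in range(n_orient):
        resp = filter_valid(image, gabor_kernel(math.pi * o / n_orient, freq))
        rows, cols = len(resp) // cell, len(resp[0]) // cell
        maps.append([
            [sum(abs(resp[r * cell + i][c * cell + j])
                 for i in range(cell) for j in range(cell)) / cell ** 2
             for c in range(cols)]
            for r in range(rows)
        ])
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[[maps[o][r][c] for o in range(n_orient)] for c in range(cols)]
            for r in range(rows)]
```

    Two such maps could then be compared cell by cell, for example with a Euclidean distance, once the query has been aligned to the template.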

  • A. Ross, A. K. Jain, and J. Reisman, "A Hybrid Fingerprint Matcher", Pattern Recognition, Vol. 36, No. 7, pp. 1661-1673, July 2003.
  • A. Ross, J. Reisman and A. K. Jain, "Fingerprint Matching Using Feature Space Correlation", Proc. of Post-ECCV Workshop in Biometric Authentication, LNCS 2359, pp. 48-57, Copenhagen, Denmark, June 1, 2002.
  • A. K. Jain, A. Ross and S. Prabhakar, "Fingerprint Matching Using Minutiae and Texture Features", Proc. of Int'l Conference on Image Processing (ICIP), pp.282-285, Thessaloniki, Greece, Oct 7 - 10, 2001.

    Multimodal Biometrics

    Arun Ross
    Dr. Anil Jain

    Project WWW page

    A simple biometric system has a sensor module, a feature extraction module and a matching module. The performance of a biometric system is largely affected by the reliability of the sensor used and the degrees of freedom offered by the features extracted. Also, if the biometric trait being sensed or measured is noisy (a fingerprint with a scar or a voice altered by a cold, for example), the resultant confidence score (or matching score) computed by the matching module may not be reliable. Simply put, the matching score generated by a noisy input has a large confidence interval. This problem can be addressed by installing multiple sensors that capture different biometrics. Such systems, known as multimodal biometric systems, are expected to be more reliable due to the presence of multiple pieces of evidence. However, an intelligent scheme is required to fuse the decisions produced by the individual sensors.

    In this work we attempt to deal with the problem of decision fusion by first building a bimodal biometric system and then devising various schemes to integrate the outputs of the two sensors. The proposed system uses the fingerprint and voice features of an individual for verification purposes.
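
    One simple way to integrate two matchers, shown here purely as an illustration, is a weighted sum of normalized matching scores. The score ranges, the equal weights, and the 0.5 decision threshold below are hypothetical values, and the sum rule is only one of many possible fusion schemes; it is not claimed to be the scheme adopted in this project.

```python
def min_max_normalize(score, lo, hi):
    """Map a raw matcher score from [lo, hi] to [0, 1], clipping outliers."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (score - lo) / (hi - lo)))


def fuse_scores(fingerprint, voice, w_fingerprint=0.5):
    """Weighted sum rule on two normalized scores."""
    return w_fingerprint * fingerprint + (1.0 - w_fingerprint) * voice


def verify(fp_raw, voice_raw, fp_range, voice_range,
           threshold=0.5, w_fingerprint=0.5):
    """Accept the claimed identity if the fused score reaches the threshold."""
    fp = min_max_normalize(fp_raw, *fp_range)
    vo = min_max_normalize(voice_raw, *voice_range)
    return fuse_scores(fp, vo, w_fingerprint) >= threshold
```

    Normalization matters because the two matchers generally report scores on different scales; without it, the matcher with the larger numeric range would dominate the sum.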

  • A. Ross and A. K. Jain, "Information Fusion in Biometrics", Pattern Recognition Letters, Vol. 24, Issue 13, pp. 2115-2125, September, 2003.
  • A. K. Jain and A. Ross, "Learning User-specific Parameters in a Multibiometric System", Proc. International Conference on Image Processing (ICIP), Rochester, New York, September 22-25, 2002.

    Online Script Recognition

    Anoop Namboodiri
    Dr. Anil Jain

    Project WWW Page

    Automatic identification of handwritten script facilitates many important applications such as automatic transcription of multi-lingual documents and search for documents on the web containing a particular script. The increase in usage of handheld devices that accept handwritten input is creating a huge volume of handwritten data. This work proposes a method to classify words and lines in an on-line handwritten document into one of six scripts: Arabic, Cyrillic, Devanagari, Han, Hebrew or Roman. The classification is based on 11 different spatial and temporal features extracted from the strokes of the words. The proposed system attained an overall classification accuracy of 85% at the word level with 5-fold cross validation. The classification accuracy improves to 96% as the number of words in the test sample is increased to four and to 96.5% for complete text lines, consisting of an average of seven words.

    We present a hierarchical approach for extracting homogeneous regions in on-line documents. The problem of identifying and processing ruled and unruled tables, text and drawings is addressed. The on-line document is first segmented into regions with only text strokes and regions with both text and non-text strokes. Each text region is further classified as unruled table or plain text. Stroke clustering is used to segment the non-text regions. Each non-text segment is then classified as drawing, ruled table or underlined keyword using stroke properties. The individual regions are processed and the results are assembled to identify the structure of the on-line document.

    Digital Watermarking of Fingerprint Images

    Umut Uludag
    Dr. Anil Jain

    Watermarking of digital media has gained considerable attention in recent years as a means of copyright protection and content verification. Watermarking of fingerprint images aims to embed watermark information into the fingerprint image without decreasing fingerprint identification and verification performance. In this project, we are working on such watermarking methods to increase the security of fingerprints.

  • A. K. Jain and U. Uludag, "Hiding Fingerprint Minutiae in Images", Proc. AutoID 2002, 3rd Workshop on Automatic Identification Advanced Technologies, pp. 97-102, Tarrytown, New York, USA, March 14-15, 2002.

    Dental Biometrics

    Hong Chen
    Anil K. Jain

    The main purpose of forensic dentistry is to identify deceased individuals for whom other means of identification (e.g., fingerprint, face, etc.) are not available. We try to identify people using their post-mortem (PM) and ante-mortem (AM) radiographs. In other words, given a PM radiograph, we search the database to locate a matching AM radiograph. The varying quality of the radiographs requires us to perform preprocessing procedures such as image enhancement and restoration. The matching is based on the contours of the teeth and their image intensity. Currently, we are attempting to use the striate and trabecular patterns present in the teeth. A combination method is expected to further improve the matching accuracy.

  • A. K. Jain, H. Chen and S. Minut, "Dental Biometrics: Human Identification Using Dental Radiographs", Proc. of 4th Int'l Conf. on Audio- and Video-Based Biometric Person Authentication (AVBPA), pp. 429-437, Guildford, UK, June 9-11, 2003.

    Recent Projects Abstracts

    Assisted Identification in Digital Color Microbial Images

    Feng-I Liu
    Dr. George Stockman

    Microbes stained with special fluorescein and chemicals exhibit different colors. The color indicates the state of the microbes during an important activity such as metabolism or fission. The morphotype, biovolume, and aggregation state of microbes showing particular colors are important for further biological analysis. The color image itself has a noisy background and a complex foreground. Because the number of microbes in an image is large, a lot of time is needed to manually extract information from the image. The goal of this work is to identify the microbes present in the noisy image and to extract useful information about them. The proposed system identifies the microbes in a digitized color image using an interactive user interface. There is no fixed color range for a specific color stain; microbes with the same reaction to the chemical might show different but related colors. Color segmentation (with local thresholding and region growing) is applied to eliminate the background noise and select the region of interest. The microbes present in the region of interest might be connected or even overlap with each other. The connection and overlapping problems distort the extracted information: the microbe count will be incorrect and the biovolume of microbes will be underestimated. The morphotype and color information are used to split the connected microbes. To distinguish between overlapping microbes and to count the number of microbes correctly, further morphotype analysis is required.

    Minutiae Verification and Classification for Fingerprint Matching

    Salil Prabhakar
    Dr. Anil Jain

    Project WWW page

    Raw image data offer a rich source of information for matching and classification. For simplicity of pattern recognition system design, a sequential approach consisting of sensing, feature extraction and matching is conventionally adopted, where each stage transforms a particular component of information relatively independently. The interaction between these modules is limited. Some of the errors in the end-to-end sequential processing, especially those in the feature extraction stage, can be easily eliminated by revisiting the original image data. We propose a feedback path for the feature extraction stage, followed by a feature refinement stage for improving the matching performance. This performance improvement is illustrated in the context of a minutiae-based fingerprint verification system. We show that a minutia verification stage based on reexamining the gray-scale profile in a detected minutia's spatial neighborhood in the sensed image can improve the matching performance by ~4% on our database. Further, we show that a feature refinement stage which assigns a class label to each detected minutia (ridge ending or ridge bifurcation) before matching can also improve the matching performance by ~3%. A combination of feedback (minutia verification) in the feature extraction phase and feature refinement (minutia classification) improves the overall performance of the fingerprint verification system by ~8%.

    Automatic Surveillance Using Omnidirectional and Active Cameras

    Dan Gutchess
    Dr. Anil Jain

    We are developing a real-time automated surveillance system which uses an omnidirectional video camera in combination with multiple active cameras. Tracking of multiple subjects in an indoor environment is performed using omnidirectional video as input. The world coordinates of each subject in the room are estimated in order to direct the attention of one or more pan-tilt-zoom cameras. The system automatically controls these cameras for the purpose of obtaining high resolution images and video sequences of subjects. In particular, we demonstrate that high-quality facial images may be extracted from the images captured by the system. The automatic acquisition of such images makes this system useful for experiments involving face and action recognition.
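
    Once a subject's world coordinates are estimated, steering a pan-tilt-zoom camera toward that position reduces to simple geometry. The sketch below assumes a world frame with the z axis pointing up, a camera whose zero-pan direction is the +x axis and whose zero-tilt direction is horizontal; these conventions and the function name are assumptions for illustration, and a real camera would also need calibration offsets.

```python
import math


def pan_tilt_angles(camera_pos, target_pos):
    """Pan and tilt angles (degrees) that point a camera at a world position.

    Both positions are (x, y, z) tuples in the same world frame, z up.
    Pan is measured counterclockwise from +x; positive tilt looks upward.
    """
    dx = target_pos[0] - camera_pos[0]
    dy = target_pos[1] - camera_pos[1]
    dz = target_pos[2] - camera_pos[2]
    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return pan, tilt
```

    For a ceiling-mounted camera looking down at a subject's head, the tilt returned by this convention is negative.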

    Detecting, Tracking and Interpreting Faces

    Vera Bakic
    Dr. George Stockman

    Project WWW page

    Real-time Tracking of Face Features

    A non-intrusive real-time program is developed which detects the eyes and nose of a moving workstation user at a rate of between 10 and 30 Hertz. The program creates a base facility for other capabilities such as detecting gaze direction and facial gestures, creating face models, and normalizing for face recognition. A skin color model is used along with geometric knowledge about the face and weak assumptions about the lighting. Good results are reported over various conditions, including facial hair, 3D motion, clothing color, and use of eyeglasses. Good performance has been demonstrated with dozens of subjects on a low end SGI workstation with an eye-camera to acquire images.

    Our work is directed toward a general capability to detect and track a human face as it moves in a 3D workspace. Once achieved, this capability can be used to enable others. For example, 3D pose can be used directly for Human-Computer Interface (HCI) or for evaluation of how humans explore computer displays or virtual environments. An extension of the system that determines which region of the workstation screen the user is looking at is under development. The new system will enable the user to issue commands to the computer using head movements and gaze direction.

    Visual Learning

    Nicolae Duta
    Dr. Anil Jain

    Project WWW page

    We are building a trainable object detection/segmentation/matching system with applications in automatic medical diagnosis and personal identity verification.

    Image and Video Databases

    Aditya Vailaya
    Dr. Anil Jain

    Project WWW page

    Query By Video Clip

    Typical digital video search is based on queries involving a single shot. We generalize this problem by allowing queries that involve a video clip (say, a 10-second video segment). We propose two schemes for query by video clip. (i) Retrieval based on key frames follows the traditional approach of identifying shots, computing key frames from the video, and then extracting image features around the key frames. Based on each key frame in the query, a similarity value (using color, texture, and motion) is associated with the key frames in the database video. Consecutive key frames in the database video that are highly similar to the query key frames are then used to generate the set of retrieved video clips. (ii) In retrieval using sub-sampled frames, we uniformly sub-sample the query clip as well as the database video. Retrieval is based on matching color and texture features of the sub-sampled frames. Initial experiments on two video databases (a basketball video with approximately 16,000 frames and a CNN news video with approximately 20,000 frames) show promising results. Experiments using segments from one basketball video as query and a different basketball video as the database show that the feature representation and matching schemes are robust. We are currently investigating methods for improving the performance of the system using semantic knowledge of the given domain, object segmentation and tracking, detection of text and faces, and combining the various matching schemes.
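
    Scheme (ii) can be sketched as a sliding-window comparison of per-frame color histograms. The code below is a simplified illustration under stated assumptions: frames are lists of (r, g, b) pixels, only color (not texture) is used, the histogram has 4 bins per channel, the sub-sampling step is 5, and the L1 distance is the dissimilarity measure. None of these choices is taken from the project itself.

```python
def color_histogram(frame, bins=4):
    """Normalized joint RGB histogram of a frame given as (r, g, b) pixels in 0..255."""
    hist = [0.0] * bins ** 3
    weight = 1.0 / len(frame)
    for r, g, b in frame:
        idx = ((r * bins // 256) * bins + (g * bins // 256)) * bins + (b * bins // 256)
        hist[idx] += weight
    return hist


def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def best_match(query, database, step=5, bins=4):
    """Slide the sub-sampled query over the sub-sampled database video.

    Returns (frame_offset, distance) of the best alignment, where
    frame_offset is an index into the original database video.
    """
    q = [color_histogram(f, bins) for f in query[::step]]
    d = [color_histogram(f, bins) for f in database[::step]]
    best_offset, best_dist = None, float("inf")
    for start in range(len(d) - len(q) + 1):
        dist = sum(l1_distance(qh, d[start + i]) for i, qh in enumerate(q))
        if dist < best_dist:
            best_offset, best_dist = start * step, dist
    return best_offset, best_dist
```

    Because both streams are sampled on a fixed grid, a query that starts between two sampled database frames is compared against slightly shifted frames; this is tolerable when consecutive frames change slowly.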

    Semantic Classification in Image Databases

    Due to the huge number of potentially interesting documents available over the Internet, searching for relevant information has become very difficult. Since images and video are a major source of these data, grouping images into (semantically) meaningful categories using low-level visual features is an important (and challenging) problem in content-based image retrieval. Using Bayesian classifiers, we attempt to capture high-level concepts from low-level image features. Specifically, we have developed Bayesian classifiers for semantic image classification (indoor vs. outdoor, city vs. landscape, and sunset vs. forest vs. mountain), image orientation detection, and object detection (detecting regions of sky and vegetation in outdoor images). We demonstrate that a small codebook (the optimal codebook size is selected using a modified MDL criterion) extracted from a learning vector quantizer can be used to estimate the class-conditional densities of the observed features needed for image classification. We have developed an incremental learning paradigm, a feature selection scheme, a rejection scheme, and a classifier combination strategy using bagging to improve classifier performance. Empirical results on a large database (24,000 images) show that semantic categorization and organization of the database using the proposed classification schemes improves both retrieval accuracy and efficiency.

  • A. Vailaya, H.-J. Zhang, C.-J. Yang, F.-I. Liu, and A. K. Jain, "Automatic Image Orientation Detection", IEEE Transactions on Image Processing, vol. 11, no. 7, pp. 746-755, July, 2002.
  • A. Vailaya, M. Figueiredo, A. K. Jain, and H.-J. Zhang, "Image Classification for Content-Based Indexing", IEEE Transactions on Image Processing, vol. 10, no. 1, pp. 117-130, January, 2001.

    3D Object Recognition and Registration

    Chitra Dorai
    Dr. Anil Jain

  • C. Dorai and A. K. Jain, "COSMOS - A Representation Scheme for 3D Free-Form Objects", IEEE Trans. on PAMI, Vol. 19, No. 10, pp. 1115-1130, Oct 1997.
  • C. Dorai and A. K. Jain, "Shape Spectrum Based View Grouping and Matching of 3D Free-Form Objects", IEEE Trans. on PAMI, Vol. 19, No. 10, pp. 1139-1146, Oct 1997.
  • C. Dorai, Gang Wang, A. K. Jain and C. Mercer, "Registration and Integration of Multiple Object Views for 3D Model Construction", IEEE Trans. on PAMI, Vol. 20, No. 1, pp. 83-89, Jan 1998.

    Deformable Models For Object Matching

    Yu Zhong
    Dr. Anil Jain

    We propose a general object localization and retrieval scheme based on object shape using deformable templates. Prior knowledge of an object shape is described by a prototype template, which consists of the representative contour/edges, and a set of probabilistic deformation transformations on the template. A Bayesian scheme, which is based on this prior knowledge and the edge information in the input image, is employed to find a match between the deformed template and objects in the image. Computational efficiency is achieved via a coarse-to-fine implementation of the matching algorithm. Our method has been applied to retrieve objects with a variety of shapes from images with complex backgrounds. The proposed scheme is invariant to location, rotation, and moderate scale changes of the template.

  • Yu Zhong, Anil K. Jain and M.-P. Dubuisson-Jolly, "Object Tracking Using Deformable Templates", IEEE Transactions on PAMI, Vol. 22, No. 5, 2000, pp. 544-549.
  • Anil Jain, Yu Zhong and Sridhar Lakshmanan, "Object Matching Using Deformable Templates", IEEE Transactions on PAMI, Vol. 18, No. 3, 1996, pp. 267-278.