Michigan State University
Spring 2012
CSE 802 - Pattern Recognition and Analysis, 3 credits

Tu, Th: 10:20 am - 11:40 pm, 3400 Engineering Building

Instructor Information

Instructor: Dr. Anil K. Jain
Office: 3143 EB
Office Hours: T, Th: 1- 2 pm or by appointment
Phone: 355-9282
Email: jain@cse.msu.edu

Teaching Assistant Information

Serhat Selcuk Bucak (bucakser@msu.edu)
Office: 3208 EB.
Office hours: Tue-Thu: 2 pm-3 pm


Course Information




Introduction

Pattern recognition techniques are used to automatically classify physical objects (handwritten characters, tissue samples) or abstract multidimensional patterns (n points in d dimensions) into known or possibly unknown categories. A number of commercial pattern recognition systems are available for character recognition, handwriting recognition, document classification, fingerprint classification, speech and speaker recognition, white blood cell (leukocyte) classification, military target recognition, etc. Most machine vision systems employ pattern recognition techniques to identify objects for sorting, inspection, and assembly. The design of a pattern recognition system requires the following modules: (i) sensing, (ii) feature extraction and selection, (iii) decision making and (iv) performance evaluation. The availability of low cost and high resolution sensors (e.g., digital cameras, microphones and scanners) and data sharing over the Internet have resulted in huge repositories of digitized documents (text, speech, image and video). Need for efficient archiving and retrieval of this data has fostered the development of pattern recognition algorithms in new application domains (e.g., text, image and video retrieval, bioinformatics, and face recognition).

Design of a pattern recognition system typically follows one of the following approaches: (i) template matching, (ii) statistical methods, (iii) syntactic methods and (iv) neural networks. This course will introduce the fundamentals of statistical pattern recognition with examples from several application areas. Techniques for analyzing multidimensional data of various types and scales along with algorithms for projection, dimensionality reduction, clustering and classification of data will be explained. The course will present various approaches to exploratory data analysis and classifier design so students can make judicious choices when confronted with real pattern recognition problems. It is important to emphasize that the design of a complete pattern recognition system for a specific application domain (e.g., remote sensing) requires domain knowledge, which is beyond the scope of this course. Students will use available MATLAB software library and implement some algorithms using their choice of a programming language.

Prerequisites

CSE 232, MTH 314, and STT 441, or equivalent courses.

Text Book

Duda, Hart and Stork, Pattern Classification, Second Edition, Wiley, 2001.

You may find the errata list useful.

A number of books on pattern recognition have been put on the Assigned Reading in the Engineering Library. In addition, a number of journals, including Pattern Recognition, Pattern Recognition Letters, IEEE Trans. Pattern Analysis & Machine Intelligence (PAMI), IEEE Trans. Geoscience & Remote Sensing, IEEE Trans. Image Processing, and IEEE Trans. Speech, Audio, and Language Processing routinely publish papers on pattern recognition theory and applications.


Assigned Reading

Following books are on hold in the Engineering library for assigned reading for CSE 802.
  • Theodoridis and Koutroumbas
  • Pattern Recognition
  • Christopher Bishop
  • Pattern Recognition and Machine Learning
  • Fukunaga
  • Introduction to Statistical Pattern Recognition
  • Devijver and Kittler
  • Pattern Recognition: A Statistical Approach
  • Tou and Gonzalez
  • Pattern Recognition Principles
  • Young and Calvert
  • Classification, Estimation and Pattern Recognition
  • Pavlidis
  • Structural Pattern Recognition
  • Gonzalez and Wintz
  • Syntactic Pattern Recognition
  • Oja
  • Subspace Methods of Pattern Recognition
  • Watanabe
  • Pattern Recognition: Human and Mechanical
  • Jain and Dubes
  • Algorithms for Clustering Data (Download the book)
  • Schalkoff
  • Pattern Recognition: Statistic, Structural and Neural Approaches

    Course Schedule

    Jan 10 Introduction to Pattern Recognition (Ch 1)

    Statistical Pattern Recognition: A Review

    Lecture slides: Pattern Recognition

    HW1 assigned

    Jan 12, 17, 19, 24 Statistical Decision Theory (Ch 2)

    Jan 17: HW2 assigned; HW1 due

    Lecture slides: Chapter 2

    An Introduction to Matlab.

    Jan 26 Statistical Decision Theory (Ch 2)
    Notes on Neyman-Pearson decision rule
    Notes on error rate of a linear discriminant function
    Jan 31, Feb 2 Parameter Estimation (Ch 3)
    Bayes Estimator for multivariate Gaussian density with unknown covariance matrices
    Bayes Estimator under quadratic loss

    Jan 31: HW3 assigned; HW2due

    Lecture slides: Chapter 3

    Lecture slides: Expectation Maximization

    Feb 7, 9 Parameter Estimation (Ch 3)

    Coin Tossing Example

    Feb 14 Curse of Dimensionality (Ch 3)

    A Problem of Dimensionality: A Simple Example

    Lecture slides: Curse of Dimensionality

    Feb 14: HW4 assigned; HW3 due

    Feb 16 Component analysis and Discriminants (Ch 3)
    Principle Component Analysis (PCA)
    Principal component analysis for face recognition.

    Lecture slides: Component Analysis & Discriminants

    Feb 21, Feb 23 Nonparametric Techniques (Ch 4)

    Lecture slides: Nonparametric Techniques

    Feb 28 Mid Term Exam
    Mar 1 Non-parametric Techniques (Ch 4)

    A Branch and Bound Algorithm for Computing k-Nearest Neighbors

    Decision Trees (Ch 8)

    lecture slides

    Hierarchical Classifier Design Using Mutual Information Sethi and Sarvarayudu

    Mar 1: HW5 assigned; HW4due

    Mar 1: Project Proposal Due (2 pages)

    Mar 6, 8 SPRING BREAK
    Mar 13, 15, 20 Linear Discriminant functions (Ch 5)

    Lecture slides: Linear discriminant functions

    Support Vector Machines

    Mar 15: HW6assigned; HW5due

    Mar 22, 27 Neural Networks (Ch 6)

    Lecture slides

    Lecture slides - 2

    audio file - 1 for Lecture slides - 2

    audio file - 2 for Lecture slides - 2

    audio file - 3 for Lecture slides - 2

    A note on comparing classifiers

    A Tutorial on Artificial Neural Networks

    Performance evaluation of pattern classifiers for handwritten character recognition

    Mar 27: HW7assigned, HW6due

    Mar 29, Apr 3, Apr 5 Error Rate Estimation, Bagging, Boosting (Ch 9)

    Classifier Combination (Ch 9)

    Lecture slides on classifier combination

    Combination of Multiple Classifiers Using Local Accuracy Estimates by Woods, Kegelmeyer and Bowyer

    Handwriting digits recognition by combining classifiers by van Breukelen, Duin, Tax and den Hartog

    Apr 10 Feature Selection

    Lecture slides on feature selection

    Branch and Bound Algorithm for Feature Subset Selection by Narendra and Fukunaga

    Feature Selection : Evaluation, Application, and Small Sample Performance by Jain and Zongker

    Apr 12, 17, 19 Unsupervised Learning, Clustering, and Multidimensional Scaling (Ch 10)

    April 12: HW7 due

    Lecture Slides: Introduction to clustering

    Lecture Slides: EM Algorithm

    Lecture Slides: Large scale clustering


    Data Clustering : 50 Years Beyond K-means (Download Presentation Slides here)

    Graph Theoretical Methods for Detecting and Describing Gestalt Clusters by C. Zahn
    A Nonlinear Mapping for Data Structure Analysis by J. Sammon
    Representation and Recognition of Handwritten Digits Using Deformable Templates by Jain and Zongker

    Apr 24 Semi-supervised learning

    Semi-supervised learning by Xiaojin Zhu

    BoostCluster by Liu, Jin and Jain

    Constrained K-means Clustering with Background Knowledge by Wagstaff et al.

    Semi-supervised clustering by seeding by Basu et al.

    Apr 26 Final Project Presentation

    Final Project Report Due

    April 30 FINAL EXAM, 7:45 a.m. - 9:45 a.m., 3400 EB

    Grading

    Course grade will be assigned based on scores on six homework assignments, two exams and one project. Weights for these three components are as follows: HW (25%), MID TERM EXAM (25%), FINAL EXAM (25%), PROJECT (25%). The cumulative score will be mapped to the letter grade as follows: 90% or higher: 4.0; 85% to 90%: 3.5; 80% to 85%: 3.0 and so on.

    Both the exams will be closed book. Makeup exams will be given ONLY if properly justified. Homework solutions must be turned in the class on the date they are due. Late homework solutions will not be accepted. Homework solutions should be either typed or neatly printed.

    Please refer to MSU's policy on the Integrity of Scholarship. All homework solutions must reflect your own work. Failure to do so will result in a grade of 0 in the course.


    Course Project

    The purpose of the project is to enable the students to get some hands-on experience in the design, implementation and evaluation of pattern recognition algorithms. To facilitate the completion of the project in a semester, it is advised that students work in teams of two. You are expected to evaluate different preprocessing, feature extraction, and classification (including bagging and boosting) approaches to achieve as high accuracy as possible on the selected classification task.

    The project will involve solving the following task:

    The project report should clearly explain the objective of the study, some background work on this problem, difficulty of the classification task, choice of representation, choice of classifiers, classifier combination strategies, error rate estimation, etc. For most of the classifiers, e.g., support vector machines and neural networks, software packages are available in the public domain. Feel free to use them. Emphasis of the project is to solve a practical and interesting pattern recognition problem using the tools that you have learnt in this course. It would be instructive to see how close you can come to the state-of-the-art accuracy on this database. Use the projection algorithms to display 2- and 3-dimensional representations of the multidimensional patterns.

    Some tips for your project

    Some guidelines for writing the report:

    Related Publications (MSU Access is provided for the non-public domain articles; click on the links to be routed through magic.msu.edu)

    Compliation of some of the existing results on MNIST database:


    CSE 802 Home Page