Michigan State University
Spring 2013
CSE 802 - Pattern Recognition and Analysis, 3 credits

Tu, Th: 10:20 am - 11:40 am, 3400 Engineering Building

Instructor Information

Instructor: Dr. Anil K. Jain
Office: 3143 EB
Office Hours: Tu, Th: 1 - 2 pm or by appointment
Phone: 355-9282
Email: jain@cse.msu.edu

Teaching Assistant Information

Serhat Selcuk Bucak (bucakser@msu.edu)
Office: 3208 EB.
Office hours: Tue-Thu: 1:30 pm-3 pm

Radha Chitta (chittara@msu.edu)
Office: 3208 EB.
Office hours: Tue-Thu: 1:30 pm-3 pm


Course Information




Introduction

Pattern recognition techniques are used to automatically classify physical objects (handwritten characters, tissue samples) or abstract multidimensional patterns (n points in d dimensions) into known or possibly unknown categories. A number of commercial pattern recognition systems are available for character recognition, handwriting recognition, document classification, fingerprint classification, speech and speaker recognition, white blood cell (leukocyte) classification, military target recognition, etc. Most machine vision systems employ pattern recognition techniques to identify objects for sorting, inspection, and assembly. The design of a pattern recognition system requires the following modules: (i) sensing, (ii) feature extraction and selection, (iii) decision making and (iv) performance evaluation. The availability of low-cost, high-resolution sensors (e.g., digital cameras, microphones and scanners) and data sharing over the Internet have resulted in huge repositories of digitized documents (text, speech, image and video). The need for efficient archiving and retrieval of this data has fostered the development of pattern recognition algorithms in new application domains (e.g., text, image and video retrieval, bioinformatics, and face recognition).
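As a rough illustration of how the four modules listed above fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy measurements, the two features, and the nearest-mean decision rule are not taken from the course materials.

```python
from statistics import mean

def sense():
    # (i) Sensing: hypothetical raw measurements, each paired with a class label.
    training = [
        ([1.0, 1.2, 0.8], "small"),
        ([0.9, 1.1, 1.0], "small"),
        ([5.0, 5.5, 4.5], "large"),
        ([4.8, 5.2, 5.0], "large"),
    ]
    testing = [
        ([1.1, 0.9, 1.0], "small"),
        ([5.1, 4.9, 5.3], "large"),
    ]
    return training, testing

def extract_features(raw):
    # (ii) Feature extraction: reduce a raw signal to a 2-dimensional
    # feature vector (average level, range). The feature choice is illustrative.
    return (mean(raw), max(raw) - min(raw))

def train(samples):
    # (iii) Decision making, training phase: store one mean feature vector per class.
    by_class = {}
    for raw, label in samples:
        by_class.setdefault(label, []).append(extract_features(raw))
    return {label: tuple(mean(col) for col in zip(*feats))
            for label, feats in by_class.items()}

def classify(model, raw):
    # (iii) Decision making, test phase: assign the class with the nearest mean.
    x = extract_features(raw)
    def squared_distance(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(model, key=lambda label: squared_distance(model[label]))

def error_rate(model, samples):
    # (iv) Performance evaluation: fraction of held-out samples misclassified.
    wrong = sum(1 for raw, label in samples if classify(model, raw) != label)
    return wrong / len(samples)

training, testing = sense()
model = train(training)
test_error = error_rate(model, testing)
print("test error rate:", test_error)
```

The error rate is measured on samples that were not used for training, which is the point of keeping performance evaluation as a separate module.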

The design of a pattern recognition system typically follows one of the following approaches: (i) template matching, (ii) statistical methods, (iii) syntactic methods and (iv) neural networks. This course will introduce the fundamentals of statistical pattern recognition with examples from several application areas. Techniques for analyzing multidimensional data of various types and scales along with algorithms for projection, dimensionality reduction, clustering and classification of data will be explained. The course will present various approaches to exploratory data analysis and classifier design so students can make judicious choices when confronted with real pattern recognition problems. It is important to emphasize that the design of a complete pattern recognition system for a specific application domain (e.g., remote sensing) requires domain knowledge, which is beyond the scope of this course. Students will use an available MATLAB software library and implement some algorithms in a programming language of their choice.

Prerequisites

CSE 232, MTH 314, and STT 441, or equivalent courses.

Textbook

Duda, Hart and Stork, Pattern Classification, Second Edition, Wiley, 2001.

You may find the errata list useful.

A number of books on pattern recognition have been placed on Assigned Reading in the Engineering Library. In addition, a number of journals, including Pattern Recognition, Pattern Recognition Letters, IEEE Trans. Pattern Analysis & Machine Intelligence (PAMI), IEEE Trans. Geoscience & Remote Sensing, IEEE Trans. Image Processing, and IEEE Trans. Audio, Speech, and Language Processing, routinely publish papers on pattern recognition theory and applications.


Assigned Reading

The following books are on reserve in the Engineering Library as assigned reading for CSE 802.
  • Theodoridis and Koutroumbas, Pattern Recognition
  • Christopher Bishop, Pattern Recognition and Machine Learning
  • Fukunaga, Introduction to Statistical Pattern Recognition
  • Devijver and Kittler, Pattern Recognition: A Statistical Approach
  • Tou and Gonzalez, Pattern Recognition Principles
  • Young and Calvert, Classification, Estimation and Pattern Recognition
  • Pavlidis, Structural Pattern Recognition
  • Gonzalez and Wintz, Syntactic Pattern Recognition
  • Oja, Subspace Methods of Pattern Recognition
  • Watanabe, Pattern Recognition: Human and Mechanical
  • Jain and Dubes, Algorithms for Clustering Data (Download the book)
  • Schalkoff, Pattern Recognition: Statistical, Structural and Neural Approaches

    Course Schedule

    Jan 8 Introduction to Pattern Recognition (Ch 1)

    Statistical Pattern Recognition: A Review

    Lecture slides: Pattern Recognition

    HW1 assigned

    HW1 Solutions

    Jan 10, 15, 17 Statistical Decision Theory (Ch 2)

    Jan 15: HW2 assigned (Textbook problems); HW1 due

    HW2 Solutions

    Lecture slides: Chapter 2

    Notes on Bayes Classification

    An Introduction to Matlab.

    Jan 22 Statistical Decision Theory (Ch 2)

    Lecture slides:

    Neyman-Pearson Rule

    Linear Discriminant Functions

    Jan 24, 29 Parameter Estimation (Ch 3)
    Bayes Estimator for multivariate Gaussian density with unknown covariance matrices
    Bayes Estimator under quadratic loss

    Jan 24: HW3 assigned (Textbook problems); HW2 due

    HW3 Solutions

    Lecture slides: Chapter 3

    Jan 31 Parameter Estimation (Ch 3)

    Curse of Dimensionality (Ch 3)

    Coin Tossing Example

    A Problem of Dimensionality: A Simple Example

    Lecture slides: Curse of Dimensionality

    Feb 5, 7 Component Analysis and Discriminants (Ch 3)
    Principal Component Analysis (PCA)
    Principal component analysis for face recognition.

    Lecture slides: Component Analysis & Discriminants

    Feb 5: HW4 assigned; HW3 due

    Feb 12, 14, 19 Nonparametric Techniques (Ch 4)

    Lecture slides: Nonparametric Techniques

    A Branch and Bound Algorithm for Computing k-Nearest Neighbors

    Feb 19: HW5 assigned (Textbook problems); HW4 due

    HW5 Solutions

    Feb 21

    Decision Trees (Ch 8)

    Lecture slides

    Hierarchical Classifier Design Using Mutual Information by Sethi and Sarvarayudu

    Feb 26 Mid Term Exam
    Feb 28

    Project Discussion

    Slides: Image Categorization

    Mar 5, 7 SPRING BREAK
    Mar 12 Linear Discriminant functions (Ch 5)

    Lecture slides: Linear discriminant functions

    Mar 14, 19

    Linear Discriminant functions (Ch 5)

    Support Vector Machines

    Lecture slides: Part1   Part2

    Mar 14: HW5 due

    Mar 19: HW6 assigned (Textbook problems); Project Proposal Due (2 pages)

    Mar 21, 26 Neural Networks (Ch 6)

    Lecture slides

    Lecture slides - 2

    audio file - 1 for Lecture slides - 2

    audio file - 2 for Lecture slides - 2

    audio file - 3 for Lecture slides - 2

    A note on comparing classifiers

    A Tutorial on Artificial Neural Networks

    Performance evaluation of pattern classifiers for handwritten character recognition

    Mar 28, Apr 2 Error Rate Estimation, Bagging, Boosting (Ch 9)

    Apr 2: HW6 due

    Apr 4 Classifier Combination (Ch 9)

    Lecture slides on classifier combination

    Combination of Multiple Classifiers Using Local Accuracy Estimates by Woods, Kegelmeyer and Bowyer

    Handwritten digit recognition by combined classifiers by van Breukelen, Duin, Tax and den Hartog

    Apr 9 Feature Selection

    Lecture slides on feature selection

    Branch and Bound Algorithm for Feature Subset Selection by Narendra and Fukunaga

    Feature Selection: Evaluation, Application, and Small Sample Performance by Jain and Zongker

    Apr 11, 16, 18 Unsupervised Learning, Clustering, and Multidimensional Scaling (Ch 10)

    Apr 11: Test data for project released

    Lecture Slides: Introduction to clustering

    Lecture Slides: EM Algorithm

    Lecture Slides: Large scale clustering

    Talk on Large Scale Clustering


    Data Clustering: 50 Years Beyond K-means (Download Presentation Slides here)

    Graph Theoretical Methods for Detecting and Describing Gestalt Clusters by C. Zahn
    A Nonlinear Mapping for Data Structure Analysis by J. Sammon
    Representation and Recognition of Handwritten Digits Using Deformable Templates by Jain and Zongker

    Apr 23 Semi-supervised learning

    Semi-supervised learning by Xiaojin Zhu

    BoostCluster by Liu, Jin and Jain

    Constrained K-means Clustering with Background Knowledge by Wagstaff et al.

    Semi-supervised clustering by seeding by Basu et al.

    Apr 25 Final Project Presentation

    Final Project Report Due

    May 1 FINAL EXAM, 7:45 a.m. - 9:45 a.m., 3400 EB

    Grading

    The course grade will be based on scores from six homework assignments, two exams and one project. The four components are weighted as follows: HW (25%), MID TERM EXAM (25%), FINAL EXAM (25%), PROJECT (25%). The cumulative score will be mapped to the course grade as follows: 90% or higher: 4.0; 85% to 90%: 3.5; 80% to 85%: 3.0 and so on.
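    To make the arithmetic concrete, the following Python sketch computes a cumulative score from the stated weights and maps it to a grade. The example scores are hypothetical. Two points are assumptions rather than policy: each band is treated as including its lower bound, and only the three bands spelled out above are implemented, so scores below 80% return None rather than a guessed grade.

```python
# Weights as stated in the syllabus: four components at 25% each.
WEIGHTS = {"homework": 0.25, "midterm": 0.25, "final": 0.25, "project": 0.25}

def cumulative_score(scores):
    """Weighted sum of component scores, each given as a percentage (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)

def grade_point(score):
    """Map a cumulative percentage to a grade.

    Assumption: each band includes its lower bound. Bands below 80% are not
    spelled out in the syllabus, so they are not guessed here.
    """
    if score >= 90:
        return 4.0
    if score >= 85:
        return 3.5
    if score >= 80:
        return 3.0
    return None

# Hypothetical student scores, for illustration only.
example = {"homework": 92, "midterm": 84, "final": 88, "project": 96}
total = cumulative_score(example)
print(total, grade_point(total))
```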

    Both exams will be closed book. Makeup exams will be given ONLY if properly justified. Homework solutions must be turned in during class on the date they are due. Late homework solutions will not be accepted. Homework solutions should be either typed or neatly printed.

    Please refer to MSU's policy on the Integrity of Scholarship. All homework solutions must reflect your own work. Failure to do so will result in a grade of 0 in the course.


    Course Project

    The purpose of the project is to give students hands-on experience in the design, implementation and evaluation of pattern recognition algorithms. To make it feasible to complete the project within a semester, students are advised to work in teams of two. You are expected to evaluate different preprocessing, feature extraction, and classification (including bagging and boosting) approaches to achieve the highest possible accuracy on the selected classification task. The task for the project is described here.

    The project report should clearly explain the objective of the study, some background work on this problem, difficulty of the classification task, choice of representation, choice of classifiers, classifier combination strategies, error rate estimation, etc. For most of the classifiers, e.g., support vector machines and neural networks, software packages are available in the public domain. Feel free to use them. The emphasis of the project is on solving a practical and interesting pattern recognition problem using the tools that you have learnt in this course. It would be instructive to see how close you can come to the state-of-the-art accuracy on this database. Use the projection algorithms to display 2- and 3-dimensional representations of the multidimensional patterns.
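    As one possible way to obtain such a low-dimensional representation, the sketch below projects patterns onto their top principal components using only the Python standard library. PCA is used here as an illustrative choice, not as the required projection method. The six toy patterns are made up, and the power-iteration eigen-solver is a simplification that can fail when eigenvalues are nearly equal or when the fixed starting vector is orthogonal to the dominant direction; a library eigen-decomposition would be the safer choice in practice.

```python
import math

def center(points):
    """Subtract the mean pattern from every pattern."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    return [[p[j] - mean[j] for j in range(d)] for p in points]

def covariance(centered):
    """Sample covariance matrix (d x d) of mean-centered patterns."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def dominant_eigenpair(m, iterations=500):
    """Power iteration from a fixed start vector (a sketch, not robust)."""
    d = len(m)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigenvalue, v

def pca_project(points, k=2):
    """Project patterns onto the top-k principal components."""
    centered = center(points)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        eigenvalue, v = dominant_eigenpair(cov)
        components.append(v)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(a * b for a, b in zip(row, c)) for c in components]
            for row in centered]

# Hypothetical 3-dimensional patterns with most variance along the first axis.
patterns = [(2, 0, 0), (-2, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 0.1), (0, 0, -0.1)]
projected = pca_project(patterns, k=2)
for original, low_dim in zip(patterns, projected):
    print(original, "->", [round(x, 3) for x in low_dim])
```

    The resulting 2-dimensional coordinates can be passed to any plotting tool; setting k=3 gives a 3-dimensional representation in the same way.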

    Some tips for your project

