CSE842: Natural Language Processing

Spring 2011

Location:  EB3400
Time:  Monday and Wednesday, 12:40-2:00pm
Professor: Joyce Chai, 2138 Engineering Building, (517)432-9239, jchai AT cse DOT msu DOT edu
Office Hours: Monday: 2:30-4:30pm, or by appointment

Course Description:

The field of Natural Language Processing (NLP) is primarily concerned with getting computers to perform useful and interesting tasks with human languages, for example, automatically interpreting, generating, and learning natural language. In the past twenty years, the rise of the World Wide Web, together with new advances in language technologies, has created tremendous opportunities for exciting NLP applications. This course provides an introduction to the state of the art in modern NLP technologies. In particular, the topics to be discussed include syntax, semantics, and discourse, and their applications in information extraction, machine translation, and dialogue systems. These topics will be examined through reading, discussion, and hands-on experience with NLP systems.

Textbook:

Required: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, second edition, by Daniel Jurafsky and James Martin, Prentice Hall. ISBN-13: 978-0131873216. It should be available from the MSU bookstore, as well as from Amazon and other online providers.

Optional:  

1. Foundations of Statistical Natural Language Processing, by Christopher D. Manning and Hinrich Schütze. MIT Press. ISBN 0-262-13360-1

2. Semisupervised Learning for Computational Linguistics, by Steven Abney. Chapman & Hall/CRC. ISBN 1-58488-559-9

All three books are on reserve in the Engineering Library.

 

Course Grades:

 

Class Participation and Quizzes 5%
Three Homework Assignments 60%
Final Project 35%

Homework and Final Project:

The work in this course consists of three homework assignments (including both a written part and a programming part) and a final project. The written part must be turned in at the beginning of the lecture on the day it is due. The programming part is due at 11:59pm on the due date and must be submitted through the handin facility. No late homework will be accepted. A set of topics will be provided for the final project.

Item Assigned date Due date
Homework 1 Jan. 19 Feb. 14
Homework 2 Feb. 14 Mar. 2
Homework 3 Mar. 2 Mar. 23
Project Proposal - Mar. 28
Project Report - May 8

Tentative Schedule of Topics

Week Class Date Topic Readings
1 Jan 10 Introduction 

Chapter 1, 2

  Jan 12 Morphological Parsing Chapter  3
2 Jan 17 Martin Luther King Day, no class  
  Jan 19 POS Tagging 

Chapter 5

Homework 1 Assigned

3 Jan 24 Hidden Markov Model

Chapter 6

  Jan 26 Hidden Markov Model (II)

Lawrence R. Rabiner, 1989. A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77(2), pp. 257-286.

4 Jan 31 Language Model

Chapter 4

  Feb 2 Class cancelled (snow day)  
5 Feb 7 Language Model (II)

Chapter 4

S. Chen and J. Goodman, An Empirical Study of Smoothing Techniques for Language Modeling, 1998

S. Katz, Estimation of probabilities from sparse data for the language model component of a speech recognizer

  Feb 9 Context Free Grammar Chapter 12
6 Feb 14 Parsing

Chapter 13

Homework 1 Due, Homework 2 Assigned

  Feb 16 Parsing (II)
7 Feb 21 Probabilistic Parsing (I)

Chapter 14

Michael Collins, Three Generative, Lexicalised Models for Statistical Parsing 

  Feb 23 Probabilistic Parsing (II) 

Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. ACL 2003, pp. 423-430.

8 Feb 28 Features and Unification

Chapter 15

  Mar 2 Meaning Representation 

Chapter 17

Homework 2 Due, Homework 3 Assigned

9 Mar 7 Spring Break, no class  
  Mar 9 Spring Break, no class  
10 Mar 14 Semantic Analysis

Chapter 18

  Mar 16 Lexical Semantics

Chapter 19, 20

 

11 Mar 21 Computational Lexical Semantics I

Chapter 2: Entropy, Relative Entropy and Mutual Information, in Elements of Information Theory, T. M. Cover and J. A. Thomas, 1991, John Wiley & Sons.

  Mar 23 Computational Lexical Semantics II

Daniel Gildea and Daniel Jurafsky. Automatic Labeling of Semantic Roles. Computational Linguistics, 2002.

M. Gerber and J. Chai. A Study of Implicit Arguments for Nominal Predicates.  ACL 2010.

Lluis Marquez. SRL Tutorial at ACL 2009

Homework 3 Due

12 Mar 28 Maximum Entropy Model

 

Chapter 6

Adam Berger et al. A Maximum Entropy Approach to Natural Language Processing

Final Project Proposal Due

  Mar 30 Computational Discourse I

Chapter 21

13 Apr 4 Computational Discourse II Vincent Ng. Supervised Noun Phrase Coreference Research: The First Fifteen Years. ACL 2010.
  Apr 6 Class Cancelled (instructor away)
14 Apr 11 Information Extraction and Question Answering Chapter 22, 23
  Apr 13 Expectation Maximization Algorithms  
15 Apr 18 Machine Translation I

Chapter 25

Papineni, K. et al., BLEU: A Method for Automatic Evaluation of Machine Translation, ACL 2002

  Apr 20 Machine Translation II

Chapter 25

P. Brown et al., The Mathematics of Statistical Machine Translation: Parameter Estimation, Computational Linguistics, 19(2), 1993

16 Apr 25 Final Project Presentation  
  Apr 27 Final Project Presentation Final Project Report Due (May 6, 11:59pm)

Acknowledgments:

Course materials made available by Dan Jurafsky, Michael Collins, and Jason Eisner helped in the preparation of this course.

Academic Honesty:

It is your responsibility to follow MSU's policy on academic integrity. Copying or paraphrasing someone else's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed and will result in an automatic grade of 0 for the entire assignment in which the copying or paraphrasing occurred. Violation of the academic integrity policy will result in a grade of F in the course.

Alternative Testing:

Alternative testing is available to those with a documented disability affecting performance on tests. Students with documented disabilities requiring some form of accommodation receive a Verified Individualized Services and Accommodations (VISA) document, which lists verified testing accommodations when appropriate. If this applies to you, please consult the Alternative Testing Guidelines.

Notes: The instructor reserves the right to modify course policies and the course calendar according to the progress and needs of the class.