CSE842: Natural Language Processing
Spring 2011
| Location: | EB3400 |
| Time: | Monday and Wednesday, 12:40-2:00pm |
| Professor: | Joyce Chai, 2138 Engineering Building, (517)432-9239, jchai AT cse DOT msu DOT edu |
| Office Hours: | Monday: 2:30-4:30pm, or by appointment |
Course Description:
The field of Natural Language Processing (NLP) is primarily concerned with getting computers to perform useful and interesting tasks with human languages, for example, automatically interpreting, generating, and learning natural language. In the past twenty years, the rise of the World Wide Web, together with new advances in language technologies, has created tremendous opportunities for exciting NLP applications. This course provides an introduction to the state of the art in modern NLP technologies. In particular, the topics to be discussed include syntax, semantics, discourse, and their applications in information extraction, machine translation, and dialogue systems. These topics will be examined through reading, discussion, and hands-on experience with NLP systems.
Required: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, second edition, by Daniel Jurafsky and James Martin, Prentice Hall. ISBN-13: 978-0131873216. It should be available from the MSU bookstore, as well as from Amazon and other online providers.
Optional:
1. Foundations of Statistical Natural Language Processing, by Christopher D. Manning and Hinrich Schütze. MIT Press. ISBN 0-262-13360-1
2. Semisupervised Learning for Computational Linguistics, by Steven Abney. Chapman & Hall/CRC. ISBN 1-58488-559-9
All three books are reserved in the Engineering Library.
Course Grades:
| Class Participation and Quizzes | 5% |
| Three Homework Assignments | 60% |
| Final Project | 35% |
Homework and Final Project:
The work in this course consists of three homework assignments, each with a written part and a programming part, and a final project. The written part must be turned in at the beginning of the lecture on the day it is due. The programming part is due at 11:59pm on the due date and is submitted through the handin facility. No late homework will be accepted. A set of topics will be provided for the final projects.
| | Assigned date | Due date |
| Homework 1 | Jan. 19 | Feb. 14 |
| Homework 2 | Feb. 14 | Mar. 2 |
| Homework 3 | Mar. 2 | Mar. 23 |
| Project Proposal | - | Mar. 28 |
| Project Report | - | May 8 |
Tentative Schedule of Topics
| Week | Class Date | Topic | Readings |
| 1 | Jan 10 | Introduction | Chapters 1, 2 |
| | Jan 12 | Morphological Parsing | Chapter 3 |
| 2 | Jan 17 | Martin Luther King Day, no class | |
| | Jan 19 | POS Tagging | Chapter 5. Homework 1 Assigned |
| 3 | Jan 24 | Hidden Markov Model (I) | Chapter 6 |
| | Jan 26 | Hidden Markov Model (II) | Lawrence R. Rabiner, 1989. A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77(2), pp. 257-286. |
| 4 | Jan 31 | Language Model (I) | Chapter 4 |
| | Feb 2 | Class cancelled (snow day) | |
| 5 | Feb 7 | Language Model (II) | Chapter 4; S. Chen and J. Goodman, An Empirical Study of Smoothing Techniques for Language Modeling, 1998; S. Katz, Estimation of probabilities from sparse data for the language model component of a speech recogniser. |
| | Feb 9 | Context Free Grammar | Chapter 12 |
| 6 | Feb 14 | Parsing (I) | Chapter 13. Homework 1 Due, Homework 2 Assigned |
| | Feb 16 | Parsing (II) | |
| 7 | Feb 21 | Probabilistic Parsing (I) | Chapter 14; Michael Collins, Three Generative, Lexicalised Models for Statistical Parsing |
| | Feb 23 | Probabilistic Parsing (II) | Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. ACL 2003, pp. 423-430. |
| 8 | Feb 28 | Features and Unification | Chapter 15 |
| | Mar 2 | Meaning Representation | Chapter 17. Homework 2 Due, Homework 3 Assigned |
| 9 | Mar 7 | Spring Break, no class | |
| | Mar 9 | Spring Break, no class | |
| 10 | Mar 14 | Semantic Analysis | Chapter 18 |
| | Mar 16 | Lexical Semantics | Chapters 19, 20 |
| 11 | Mar 21 | Computational Lexical Semantics I | Chapter 2: Entropy, Relative Entropy and Mutual Information, Elements of Information Theory, T. M. Cover and J. A. Thomas, 1991, John Wiley & Sons. |
| | Mar 23 | Computational Lexical Semantics II | Daniel Gildea and Daniel Jurafsky, Automatic Labeling of Semantic Roles. Computational Linguistics, 2002; M. Gerber and J. Chai, A Study of Implicit Arguments for Nominal Predicates. ACL 2010; Lluis Marquez, SRL Tutorial at ACL 2009. Homework 3 Due |
| 12 | Mar 28 | Maximum Entropy Model | Chapter 6; Adam Berger et al., A Maximum Entropy Approach to Natural Language Processing. Final Project Proposal Due |
| | Mar 30 | Computational Discourse I | Chapter 21 |
| 13 | Apr 4 | Computational Discourse II | Vincent Ng. Supervised Noun Phrase Coreference Research: The First Fifteen Years. ACL 2010 |
| | Apr 6 | Class cancelled (instructor away) | |
| 14 | Apr 11 | Information Extraction and Question Answering | Chapters 22, 23 |
| | Apr 13 | Expectation Maximization Algorithms | |
| 15 | Apr 18 | Machine Translation I | Chapter 25; Papineni, K. et al., BLEU: A Method for Automatic Evaluation of Machine Translation, ACL 2002 |
| | Apr 20 | Machine Translation II | Chapter 25; P. Brown et al., The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2), 1993 |
| 16 | Apr 25 | Final Project Presentation | |
| | Apr 27 | Final Project Presentation | Final Project Report Due (May 6, 11:59pm) |
Acknowledgments:
Course materials made available by Dan Jurafsky, Michael Collins, and Jason Eisner helped in the preparation of this course.
Academic Honesty:
It is your responsibility to follow MSU's policy on academic integrity. Copying or paraphrasing someone else's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment in which the copying or paraphrasing was done. Violation of the academic integrity policy will result in a grade of F in the course.
Alternative Testing:
Alternative testing is available to those with a documented disability affecting performance on tests. Students with documented disabilities requiring some form of accommodation receive a Verified Individualized Services and Accommodations (VISA) document, which lists verified testing accommodations when appropriate. Please see the Alternative Testing Guidelines if this applies to you.
Notes: The instructor reserves the right to modify course policies and the course calendar according to the progress and needs of the class.