CSE842: Natural Language Processing
|Location:||1260 Anthony Hall|
|Time:||Monday and Wednesday, 3:00-4:20pm|
|Professor:||Joyce Chai, 2138 Engineering Building, (517)432-9239, jchai AT cse DOT msu DOT edu|
|Office Hours:||Tuesday: 2:00-4:00pm, or by appointment|
The field of Natural Language Processing (NLP) is primarily concerned with computational models and algorithms for processing human languages, for example, automatically interpreting, generating, and learning natural language. In the past twenty years, the rise of the World Wide Web, mobile devices, and social media has created tremendous opportunities for exciting NLP applications. This course provides an introduction to the state of the art in modern NLP technologies. In particular, the topics to be discussed include syntax, semantics, discourse, and their applications in information extraction, machine translation, and sentiment analysis. These topics will be examined through reading, discussion, and hands-on experience with NLP systems.
This course focuses mainly on text-based language processing. Topics related to spoken language processing, dialogue, and language-based human-agent communication are covered in CSE843: Language and Interaction.
Required: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, second edition, by Daniel Jurafsky and James H. Martin, Prentice Hall. ISBN-13: 978-0131873216.
We will also use draft chapters for the third edition: http://web.stanford.edu/~jurafsky/slp3/
Optional: Foundations of Statistical Natural Language Processing, by Christopher D. Manning and Hinrich Schütze. MIT Press. ISBN 0-262-13360-1.
Paper discussion and presentation
Homework and Final Project:
The work in this course consists of three homework assignments (including both a written part and a programming part). The written part must be turned in at the beginning of the lecture on the day it is due. The programming part is due at 11:59pm on the due date (submitted through the handin facility). No late homework will be accepted. A set of topics will be provided for the final projects.
|Assignment||Assigned date||Due date|
|Homework 1||Jan. 18||Feb. 8|
|Homework 2||Feb. 8||Feb. 27|
|Homework 3||Feb. 27||March 15|
|Project Proposal||-||March 22|
|Project Report||-||May 8|
Tentative Schedule of Topics
|1||Jan 9||Introduction and Basic Text Processing||
|Jan 11||Morphological Parsing||Chapters 2 & 3|
|2||Jan 16||Martin Luther King Day, no class|
Homework 1 Assigned
|3||Jan 23||Classification and Sentiment Analysis||
|Jan 25||Linear Regression, Logistic Regression, and Neural Networks||
|4||Jan 30||NN and Neural Probabilistic Language Models||
Y. Bengio, R. Ducharme, P. Vincent, and C. Jauvin. A Neural Probabilistic Language Model. Journal of Machine Learning Research 3 (2003) 1137-1155.
|Feb 1||Hidden Markov Model||
Lawrence R. Rabiner, 1989. A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE 77(2), pp. 257-286.
|5||Feb 6||POS Tagging||
|Feb 8||Context Free Grammar||Homework 1 Due, Homework 2 Assigned |
Probabilistic Parsing &
Michael Collins. Three Generative, Lexicalised Models for Statistical Parsing.
Dan Klein and Christopher D. Manning. 2003. Accurate Unlexicalized Parsing. ACL 2003, pp. 423-430.
|7||Feb 20||Expectation Maximization||
|Feb 22||Meaning Representation||
|8||Feb 27||Semantic Analysis||
Homework 2 Due, Homework 3 Assigned
|Mar 1||Lexical Semantics||http://web.stanford.edu/~jurafsky/slp3/15.pdf|
|9||Mar 6||Spring Break, no class|
|Mar 8||Spring Break, no class|
|10||Mar 13||Computational Lexical Semantics||
Elements of Information Theory, Chapter 2, by Cover and Thomas
|Mar 15||Distributional Semantics
Dense Vectors, Skip-grams
T. Mikolov, K. Chen, G. Corrado, J. Dean. Efficient Estimation of Word Representations in Vector Space.
Homework 3 Due
|11||Mar 20||Semantic Role Labeling||Daniel Gildea and Daniel Jurafsky. Automatic Labeling of Semantic Roles.|
|Mar 22||Discourse Processing (1)||
Final Project Proposal Due
|12||Mar 27||Discourse Processing (2)|
|Mar 29||Information Extraction||
|13||Apr 3||Summarization and QA||
|Apr 5||Machine Translation I||
Papineni, K. et al., BLEU: A Method for Automatic Evaluation of Machine Translation, ACL 2002
|14||Apr 10||Machine Translation II||
P. Brown et al., The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19(2), 1993.
|Apr 12||Recent Advances in NLP||Paper presentation and discussion (Session 1)|
|15||Apr 17||Recent Advances in NLP||Paper presentation and discussion (Session 2)|
|Apr 19||Recent Advances in NLP||
Paper presentation and discussion (Session 3)
|16||Apr 24||Final Project Presentation|
|Apr 26||Final Project Presentation|
||Final Project Report Due (May 5, 11:59pm)|
It is your responsibility to follow MSU's policy on academic integrity. Copying or paraphrasing someone else's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed and will result in an automatic grade of 0 for the entire assignment in which the copying or paraphrasing was done. Violation of the academic integrity policy will result in a grade of F in the course.
Alternative testing is available to those with a documented disability affecting performance on tests. Students with documented disabilities requiring some form of accommodation receive a Verified Individualized Services and Accommodations (VISA) document, which displays verified testing accommodations when appropriate. Please visit Alternative Testing Guidelines if applicable.
Note: The instructor reserves the right to modify course policies and the course calendar according to the progress and needs of the class.