601.465/665 - Fall 2019
This course is an in-depth overview of techniques for processing human language. How should linguistic structure and meaning be represented? What algorithms can recover them from text? And crucially, how can we build statistical models to choose among the many legal answers? The course covers methods for trees (parsing and semantic interpretation), sequences (finite-state transduction such as morphology), and words (sense and phrase induction). There are a number of structured but challenging programming assignments. (Prerequisite: 600/601.226 data structures)
Textbook: Jurafsky & Martin, Speech and Language Processing, 2nd ed. (P98.J87 2009 in Science Ref section on C-Level)
The course is organized into 11 thematic modules and 5 research talks. The thematic modules focus on core concepts and methods needed to become well-rounded in NLP: Modeling Grammaticality, Language Models, Text Classification, Linguistics, Tree Parsing, Neural Networks, Sequence Tagging, Topic Models, Finite State Transducers, Semantics, Structured Prediction. The research talks are guest lectures that are intended to give students a taste of the exciting world of NLP research.
Natural Language Processing (NLP) is an exciting field! This course is designed to introduce you to some of the problems and solutions of NLP, and their relation to linguistics and statistics. You need to know how to program and use common data structures. By the end, we hope that you will have acquired a fascination with the intricacies of human language, and will feel ownership over some of NLP’s core formal and statistical techniques, to the extent that you can begin to understand research papers in the field.
|Grading||6 homework assignments 60%, participation 5%, midterm 15%, final 20%|
|Late homework policy||It’s important to get homeworks done on time so that you can follow the subsequent lectures. We understand that emergencies do occur, so you are allowed up to 10 late days throughout the term. They are only intended to cover situations where you would ordinarily ask for an extension. Rather than ask me, just use a late day: I don’t want to be in the position of deciding which excuses are worthy and which aren’t. If you run out of late days, we’ll have to give you zeroes. But it is still to your advantage to turn in all homeworks to get feedback.|
|Readings||Readings are announced on the course web site for each module. Students are expected to read the material before or after class.|
|Honesty||CS integrity code, JHU undergraduate policies, JHU graduate policies|
|Disabilities||If you need accommodations for a disability, obtain a letter from Student Disability Services, 385 Garland, (410) 516-4720.|
This course is modeled after Jason Eisner’s NLP course and borrows heavily from its materials.