Computer Models of Language Representation and Processing
Prof. Daniel R. Schlegel, 395 Shineman Center, firstname.lastname@example.org
Office/Lab hours: Thursday 9:30-11:30am; Friday 12:30-1:30pm; by appointment
Section 800: MWF 1:50-2:45pm, Shineman 175
This course seeks to establish a foundational framework for discussion of computational natural language processing. The topics that will be treated here are grounded in theories of knowledge representation and reasoning with particular reference to computational semantics and pragmatics. Emphasis will be placed primarily on symbolic systems, with some brief attention to connectionist and statistical approaches. Finally, some attention will be paid to criticism of approaches to natural language processing.
Upon successful completion of this course, students will:
have an understanding of computational approaches to working with natural language text;
be able to make use of modern tools and techniques to perform basic natural language text processing tasks;
have the foundation upon which they can build to solve more sophisticated problems involving natural language text.
Jurafsky, D. and Martin, J.H., Speech and Language Processing, 2e. Prentice Hall, 2008.
Speech and Language Processing, 3rd edition draft chapters
Online Regular Expression Debugger – Regex101
Stanford CoreNLP Demo
Festival Speech Synthesis System
General Architecture for Text Engineering (GATE)
Unified Verb Index
Attendance Policy and Classroom Etiquette:
As per college policy, attendance in all sessions is obligatory. If you cannot attend a class meeting due to a religious, athletic, or health-related circumstance, or a circumstance of particular hardship, please notify me in advance via email. Please be ready to present proof, if necessary. Cell phones and headphones should not be out or used during lecture, and laptops should only be used for taking notes (I don’t recommend this). If use of any electronics becomes distracting to other students, I reserve the right to discontinue the allowance of their use.
There will be 3-4 assignments and a final project. The assignments may (but do not have to) be completed with a partner of your choosing. It is a good idea for a partnership to have two people of different specialties or interests, for example a linguist and a computer scientist. The final project will be completed alone and will explore a topic of your choosing within the scope of this course. Further details about all assignments will be made available as they are assigned.
Assignments will be submitted on Blackboard and graded according to the quality of the solution, including completeness and correctness. Written assignments will additionally be graded according to their quality as communicative artifacts. For assignments that are presented in class, quality of presentation will be incorporated into the grade.
It is expected that each person participate during each class. As discussed above, attendance is required.
Each exam question will be assigned a point value (generally some multiple of 3, depending on difficulty), and the following scheme will be used to grade it:
0 – Did not attempt / No serious attempt
1 – Mostly incorrect solution
2 – Somewhat incorrect solution
3 – Perfect solution
If the problem is worth a larger multiple of 3 (for example, 6 or 9 points), intermediate scores will be given as appropriate. The total points received on all questions will then be summed, divided by the points possible, and scaled as appropriate according to the percentages given below.
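As an illustration of the arithmetic described above, here is a minimal sketch in Python. The function name `exam_percentage` and the sample scores are hypothetical, and the sketch covers only the "sum and divide" step; any further scaling by course component is not shown.

```python
def exam_percentage(scores):
    """Return the exam score as a percentage.

    `scores` is a list of (points_received, points_possible) pairs,
    one per question. This helper is illustrative only.
    """
    if not scores:
        raise ValueError("at least one question is required")
    for received, possible in scores:
        # Point values are assumed to be positive multiples of 3.
        if possible <= 0 or possible % 3 != 0:
            raise ValueError("point value must be a positive multiple of 3")
        if not 0 <= received <= possible:
            raise ValueError("received points must lie between 0 and the point value")
    total_received = sum(received for received, _ in scores)
    total_possible = sum(possible for _, possible in scores)
    return 100 * total_received / total_possible


# Hypothetical exam: four questions worth 3, 6, 3, and 9 points.
sample = [(3, 3), (4, 6), (0, 3), (7, 9)]
print(round(exam_percentage(sample), 1))
```

In this made-up example the student earns 14 of 21 possible points, so the exam score is two thirds of the total.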
The default grading for the course will be along the university’s standard grading curve:
|A: 93-100||C+: 77-79|
|A-: 90-92||C: 73-76|
|B+: 87-89||C-: 70-72|
|B: 83-86||D+: 67-69|
|B-: 80-82||D: 60-66|
A more generous curve may be used, but should not be expected.
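For reference, the default curve in the table above can be read as a simple lookup. The sketch below is hypothetical (the names `GRADE_FLOORS` and `letter_grade` are not part of any course tool), and it assumes each letter's lower bound acts as a threshold, so a fractional score such as 92.5 falls into the A- band. Scores below 60 are not covered by the table, so the function returns `None` for them.

```python
# Lower bound of each band, taken from the table above, highest first.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
]


def letter_grade(percent):
    """Map a course percentage to a letter using the default curve.

    Assumption: a score earns the first letter whose lower bound it meets.
    Returns None for scores below 60, which the table does not list.
    """
    for floor, letter in GRADE_FLOORS:
        if percent >= floor:
            return letter
    return None


print(letter_grade(88))  # falls in the 87-89 band
```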
During the semester we aim to cover the following topics:
This syllabus and the course schedule are subject to change by the instructor. All changes and related justifications will be announced in class, and updates will be reflected in this web version.
Lecture slides will be maintained on Blackboard, but many lectures will include use of the whiteboard, which may not be reflected in the slides or in notes elsewhere.
|1||Monday||1/22||First day of class
Syllabus; Course Overview
Working with Text Intro
Readings: SLP Chapter 1
|Wednesday||1/24||Finish Working with Text Intro
Research Overview (for context)
Readings: Begin looking at SLP Section 2.1
Optional Readings: This Is Watson (on Blackboard)
Readings: Finish SLP Section 2.1; play with sample regular expressions on regex101.
Optional Readings: Weizenbaum's ELIZA Paper
|2||Monday||1/29||Regular Expressions (concluded)
Readings: SLP 3rd Edition Draft Sections 2.2-2.3
Writing your own chatbot
Sample Python Chatbot
Tokenization and Sentence Splitting
Assignment 1 due 2/11, 11:59pm on Blackboard, demoed in class 2/12
Readings: SLP Chapter 3 through the end of section 3.1; Section 3.8
Optional Readings: Christiansen & Amon, More Than Words: The Role of Multiword Sequences in Language Learning and Use
Reading: SLP 3rd ed, Ch 4 through end of 4.1 (don't worry too much about the math!)
Optional Reading: M.F. Porter, An Algorithm for Suffix Stripping
|3||Monday||2/5||Language Models; N-grams
Readings: SLP 3rd Ed, Chapter 8 (again, don't get bogged down by the math!)
Optional Readings: Johns and Jamieson, A Large-scale Analysis of Variance in Written Language, 2018; Dye, M., et al. Alternative Solutions to a Language Design Problem: The Role of Adjectives and Gender Marking in Efficient Communication. Topics in Cognitive Science, 2017 (on Blackboard)
N-grams continued; Neural Networks
|4||Monday||2/12||Assignment 1 in-class demos|
|Wednesday||2/14||Neural Language Models Concluded
Finite Automata Introduction
Assignment 2 due
Readings: Sections 2.2-2.4
|Friday||2/16||Finite State Transducers
Readings: Section 3.2 through the end of 3.6
Readings: Chapter 5
|Wednesday||2/21||Part of Speech Tagging
Hidden Markov Models
|6||Monday||2/26||Guest lecture: Dr. Jonathan Bona|
Readings: SLP 3rd Edition, Chapter 17 through the end of 17.4
|Friday||3/2||POS Tagging Concluded;
|7||Monday||3/5||Word Senses; WordNet
Readings: SLP 3rd Edition, Chapter 12 through end of 12.1; Chapter 14 through end of 14.3
Optional Readings: Chomsky, N. "Three Models for the Description of Language" 1956 (on Blackboard)
|Wednesday||3/7||Phrase Structure and Parsing Text|
|Friday||3/9||Parsing and Semantic Role Labeling
Assignment 3 due
|8||Monday||3/12||No Class - Spring Break|
|Wednesday||3/14||No Class - Spring Break|
|Friday||3/16||No Class - Spring Break|
|9||Monday||3/19||Tools for Natural Language Processing - GATE
Readings: Read and follow along with Chapter 1 of the NLTK Book using Thonny or repl.it
Final Project Description
|Wednesday||3/21||Tools for Natural Language Processing - NLTK
Trace of repl.it from class
|Friday||3/23||Training Machine Learning Models in NLTK
Readings: Peter Suber's Translation Tips (propositional logic section)
Translation of English sentences to Logic
Project Proposals Due on Blackboard, 11:59pm
|Friday||3/30||No Class - Easter Weekend|
Readings: Peter Suber's Translation Tips (predicate logic sections)
|Wednesday||4/4||No Class - Quest Day|
|Wednesday||4/11||Predicate Logic, concluded
|Friday||4/13||Non-Classical Logics, concluded
Readings: Frames and Case Grammar sections from Shapiro, S.C., ed. Encyclopedia of Artificial Intelligence, 2nd ed. (on Blackboard)
|13||Monday||4/16||Frames & Case Grammar|
|Wednesday||4/18||Project Progress Discussion|
|Friday||4/20||Graphs for Language Understanding
Readings: SLP 3rd Edition, Section 29.2, Chapter 30
|14||Monday||4/23||A Return to Dialog and Discourse
Extra Credit Assignment due 5/11, 11:59pm on Blackboard
|Friday||4/27||Speech Recognition / Synthesis|
|Friday||5/4||Last day of class
Final Exam Study Guide
Final Project Papers/Code Due
|Finals Week||Monday||5/7||Final Exam 2-4pm, 175 Shineman|
While it is acceptable to discuss general approaches with your fellow students, the work you turn in must be your own. You may not turn in code found on the internet. If you have any problems doing the assignments, consult the instructor. Please be sure to read the webpage “Academic Integrity,” which spells out the details of this and related policies. See my page on plagiarism for an explanation of what I consider cheating.
If you have a disabling condition that may interfere with your ability to successfully complete this course, please contact the Office of Disability Services at email@example.com or x3358.