Computer Models of Language Representation and Processing
Lecturer:
Prof. Daniel R. Schlegel, 395 Shineman Center, daniel.schlegel@oswego.edu
Office/Lab hours: T: 8-9am, W: 8-9am, Th: 12:15-1:15pm, F: 9-10am, and by appointment
Section 800: TTh 9:35-10:55am, Shineman 176
Course Description:
This course seeks to establish a foundational framework for discussion of computational natural language processing. The topics that will be treated here are grounded in theories of knowledge representation and reasoning with particular reference to computational semantics and pragmatics. Emphasis will be placed primarily on symbolic systems, with some brief attention to connectionist and statistical approaches. Finally, some attention will be paid to criticism of approaches to natural language processing.
Course Objectives:
Upon successful completion of this course, students will:
have an understanding of computational approaches to working with natural language text;
be able to make use of modern tools and techniques to perform basic natural language text processing tasks;
have the foundation upon which they can build to solve more sophisticated problems involving natural language text.
Textbooks:
Jurafsky, D. and Martin, J.H., Speech and Language Processing, 2e. Prentice Hall, 2008.
Jurafsky, D. and Martin, J.H., Speech and Language Processing, 3e Draft Chapters, 2019.
Useful Resources:
Python Cheatsheets: 1, 2
Online Regular Expression Debugger – Regex101
Stanford CoreNLP Demo
NLTK Book
Festival Speech Synthesis System
General Architecture for Text Engineering (GATE)
Universal Dependencies
Unified Verb Index
Attendance and Participation:
As per college policy, attendance at all sessions is obligatory. If you cannot attend a class meeting due to a religious, athletic, or health-related circumstance, or a circumstance of particular hardship, please notify me in advance via email. Please be ready to present proof, if necessary. Each person is expected to engage actively in each class session.
Classroom Etiquette:
A positive learning environment relies upon creating an atmosphere where all students feel welcome. Classroom discussion is meant to allow us to hear a variety of viewpoints. This can only happen if we respect each other and our differences. Hostility and disrespectful behavior are not acceptable.
Cell phones and headphones should not be out or used during lecture, and laptops should only be used for taking notes (though I don’t recommend this). If the use of any electronics becomes distracting to other students, I reserve the right to disallow them.
Grading:
Grades will be calculated based on a reading response journal, assignments, projects, and exams as described below.
Assignments | 20% |
Reading Response Journal | 10% |
Final Project | 30% |
Midterm Exam | 15% |
Final Exam | 25% |
The default grading for the course will be along the university’s standard grading curve:
A: 93-100 | C+: 77-79 |
A-: 90-92 | C: 73-76 |
B+: 87-89 | C-: 70-72 |
B: 83-86 | D+: 67-69 |
B-: 80-82 | D: 60-66 |
E: 0-59 |
A more generous curve may be used, but should not be expected.
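As a rough illustration of how the weights and default grade ranges above combine, here is a minimal Python sketch. The function names and the sample scores are hypothetical, and treating fractional percentages as not rounded up (for example, 92.9 stays an A-) is an assumption, not stated course policy.

```python
# Weights from the Grading table (percent of the course grade).
WEIGHTS = {
    "Assignments": 20,
    "Reading Response Journal": 10,
    "Final Project": 30,
    "Midterm Exam": 15,
    "Final Exam": 25,
}

# Lower bounds of the default letter-grade ranges, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (60, "D"), (0, "E"),
]


def course_percentage(scores):
    """Weighted average of per-component percentages (each 0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a percentage to a letter grade.

    Assumption: no rounding up, so a fractional percentage falls in the
    range whose lower bound it has reached.
    """
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "E"


# Hypothetical component scores, for illustration only.
example = {
    "Assignments": 90,
    "Reading Response Journal": 100,
    "Final Project": 85,
    "Midterm Exam": 80,
    "Final Exam": 88,
}
total = course_percentage(example)
print(total, letter_grade(total))  # prints: 87.5 B+
```

A more generous curve, if one is used, would amount to lowering the cutoffs in this sketch.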
Assignments:
There will be responses to readings, 3-4 assignments, and a final project. The assignments may (but do not have to) be completed with a partner of your choosing. It is a good idea for a partnership to have two people of different specialties or interests, for example a linguist and a computer scientist. The responses should be done alone. The final project will be completed alone and will explore a topic of your choosing within the scope of this course. Further details about all assignments will be made available as they are assigned.
Assignments, reading responses, and projects will be posted on the class Scalar site and graded according to the quality of solution, including completeness and correctness. Written assignments will additionally be graded according to their quality as communicative artifacts. Quality of presentation will be incorporated into assignment grades for those which are presented in class.
Exams:
There will be two exams during the semester: the first during week 7, and the second during finals week.
Each exam question will be assigned a point value (generally some multiple of 3, depending on difficulty) and graded using the following scheme:
0 – Did not attempt / No serious attempt
1 – Mostly incorrect solution
2 – Somewhat incorrect solution
3 – Perfect solution
If a problem is worth a multiple of 3 points greater than 3, intermediate scores will be given as appropriate. The points received on all questions will then be summed, divided by the points possible, and weighted according to the percentages given above.
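The exam arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample scores are hypothetical, and how intermediate scores are awarded on larger questions remains at the instructor's discretion.

```python
def exam_percentage(questions):
    """Return the exam score as a percentage.

    `questions` is a list of (points_earned, points_possible) pairs,
    one pair per exam question.
    """
    earned = sum(points for points, _ in questions)
    possible = sum(points for _, points in questions)
    return 100 * earned / possible


# Hypothetical exam: two 3-point questions, one 6-point, one 9-point.
sample = [(3, 3), (2, 3), (4, 6), (9, 9)]
print(round(exam_percentage(sample), 1))  # prints: 85.7
```

The resulting percentage would then count toward the course grade at the weight listed for that exam (15% for the midterm, 25% for the final).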
Schedule/Outline:
This syllabus and the course schedule are subject to change by the instructor. All changes and related justifications will be announced in class, and updates will be reflected in this web version.
Lecture slides will be maintained on Blackboard, but many lectures will include use of the whiteboard, which may not be reflected in notes elsewhere.
During the semester we aim to cover the following topics:
Week | Day | Date | Topics, Readings, and Assignments |
---|---|---|---|
1 | Tuesday | 1/29 | First day of class. Syllabus; Course Overview. Working with Text Intro. Readings: SLP Chapter 1. Create a Scalar account. |
1 | Thursday | 1/31 | Snow Day! |
2 | Tuesday | 2/5 | Research Overview. Introduction to Python using PythonTurtle. On your own: Experiment with PythonTurtle. Do Levels 1 and 2 from the Help menu. For a challenge, try to draw a house or a star. |
2 | Wednesday | 2/6 | Add deadline |
2 | Thursday | 2/7 | Python, continued: Loops, functions, selection, and lists. On your own: Do Levels 3 and 4 from the PythonTurtle Help menu. For a challenge, try to draw the American flag. |
3 | Tuesday | 2/12 | Introduction to Regular Expressions. On your own: Practice Regular Expressions. Guide to Writing and Assessing Journal Reflections. Journal Response 1, due 2/26, 11:59pm, on Weizenbaum's 1966 ELIZA paper. |
3 | Thursday | 2/14 | Regular Expressions, concluded. A Python Chatbot Example. Assignment 1, to be demoed in class 2/28. Please post your code on your Scalar page. Reading: SLP Sections 3.1, 3.8. Optional Reading: Christiansen & Arnon, More Than Words: The Role of Multiword Sequences in Language Learning and Use. |
3 | Friday | 2/15 | Drop deadline |
4 | Tuesday | 2/19 | n-gram Language Models. Reading: SLP3, Chapter 7. On your own: Read and follow along with Chapter 1 of the NLTK Book. |
4 | Thursday | 2/21 | Word Meaning, Vector Semantics |
5 | Tuesday | 2/26 | Vector Semantics, concluded. Neural Networks and Neural Language Models. Journal Reflection 2 due 3/15, 11:59pm on Scalar. |
5 | Thursday | 2/28 | Assignment 1 Demos. Assignment 2 due 3/13, 11:59pm on Scalar; to be discussed in class 3/14. |
6 | Tuesday | 3/5 | Dan Sick 🙁 |
6 | Thursday | 3/7 | Neural Language Models, concluded. Normalization Tools for NLP. |
7 | Tuesday | 3/12 | Midterm Exam |
7 | Thursday | 3/14 | Assignment 2 Discussion. Text Processing Tools. Final Project Description: proposals due 3/31; papers/projects due 5/5; presentations 5/7 and 5/9. Over break: Look through the NLTK book, the NLTK HOWTOs, and the GATE training materials as you consider project ideas. |
8 | Tuesday | 3/19 | No Class - Spring Break |
8 | Thursday | 3/21 | No Class - Spring Break |
9 | Tuesday | 3/26 | Introduction to NLP Toolkits. Journal Reflection 3 due 3/31 on Scalar. |
9 | Thursday | 3/28 | Guest Lecture: Dr. John K. Lindstedt (Rice University) - "Modeling a “mind”: Understanding computational cognitive modeling with the ACT-R cognitive architecture" |
10 | Tuesday | 4/2 | Training Models in NLTK and Sentiment Analysis |
10 | Thursday | 4/4 | In-Class Project Planning! |
10 | Friday | 4/5 | Withdraw deadline |
11 | Tuesday | 4/9 | Grammars and Parsing. Readings: SLP 3rd Edition, Chapter 11 through end of 11.1; Chapter 14 through end of 14.3. Optional Reading: Chomsky, N. "Three Models for the Description of Language" 1956. |
11 | Thursday | 4/11 | Logic and Semantics. Relevant Talk: Dr. Stuart C. Shapiro (my PhD advisor!) - Science Today - 4pm in 176 - "The Design of Cognitive Agents". Reading: Peter Suber's Translation Tips (Propositional Logic Section). |
12 | Tuesday | 4/16 | Propositional Logic Model Finding. Reading: Peter Suber's Translation Tips (Predicate Logic Section). |
12 | Thursday | 4/18 | Predicate Logic |
13 | Tuesday | 4/23 | Logic, concluded. Readings: Frames and Case Grammar sections from Shapiro, S.C., ed. Encyclopedia of Artificial Intelligence, 2nd ed. (on Blackboard). Optional Readings: Barwise, J. & Cooper, R. Generalized Quantifiers and Natural Language (on Blackboard). |
13 | Thursday | 4/25 | Frames & Case Grammar. In-Class Project Progress Report. Reading: SLP 3rd edition 24.2 (frame-based dialog systems); Chapter 25 (advanced dialog systems). |
14 | Tuesday | 4/30 | Graphs for NLU. Optional Reading: Kamp, Hans "A Theory of Truth and Semantic Representation" (on Blackboard). |
14 | Thursday | 5/2 | Discourse+Dialog. Have final project paper on Scalar by Monday morning; email me the link to the page. Presentations and code can be submitted on Blackboard. Submit before you present! Plan on 5-7 minute presentations. If we have time for questions we'll do that, but we may not. Extra Credit Assignment due 5/17, 4:30pm on Blackboard. |
15 | Tuesday | 5/7 | Project Presentations |
15 | Thursday | 5/9 | Last day of class. Project Presentations. Final Exam Study Guide. |
Finals Week | Thursday | 5/16 | Final Exam 8-10am |
Academic Integrity:
While it is acceptable to discuss general approaches with your fellow students, the work you turn in must be your own. You may not turn in code found on the internet. If you have any problems doing the assignments, consult the instructor. Please be sure to read the webpage “Academic Integrity”, which spells out the details of this and related policies. See my page on plagiarism for an explanation of what I consider cheating.
Accessibility:
If you have a disabling condition which may interfere with your ability to successfully complete this course, please contact Accessibility Resources, located at 155 Marano Campus Center (phone: 315.312.3358; email: access@oswego.edu).