Hi There!

I'm Dan Schlegel, an Associate Professor in the Computer Science Department at SUNY Oswego.

COG376 Spring 2018

Computer Models of Language Representation and Processing

Lecturer:

Prof. Daniel R. Schlegel, 395 Shineman Center, daniel.schlegel@oswego.edu
Office/Lab hours: Thursday 9:30-11:30am; Friday 12:30-1:30pm; by appointment
Section 800: MWF 1:50-2:45pm, Shineman 175

Course Description:

This course seeks to establish a foundational framework for discussion of computational natural language processing. The topics that will be treated here are grounded in theories of knowledge representation and reasoning with particular reference to computational semantics and pragmatics. Emphasis will be placed primarily on symbolic systems, with some brief attention to connectionist and statistical approaches. Finally, some attention will be paid to criticism of approaches to natural language processing.

Course Objectives:

Upon successful completion of this course, students will:

have an understanding of computational approaches to working with natural language text;
be able to make use of modern tools and techniques to perform basic natural language text processing tasks;
have the foundation upon which they can build to solve more sophisticated problems involving natural language text.

Textbooks:

Jurafsky, D. and Martin, J.H., Speech and Language Processing, 2e. Prentice Hall, 2008.

Useful Resources:

Speech and Language Processing, 3rd edition draft chapters
Online Regular Expression Debugger – Regex101
Stanford CoreNLP Demo
Festival Speech Synthesis System
General Architecture for Text Engineering (GATE)
Universal Dependencies
Unified Verb Index

Attendance Policy and Classroom Etiquette:

As per college policy, attendance at all sessions is obligatory. If you cannot attend a class meeting due to a religious, athletic, or health-related circumstance, or a circumstance of particular hardship, please notify me in advance via email. Please be ready to present proof, if necessary. Cell phones and headphones should not be out or used during lecture, and laptops should only be used for taking notes (I don’t recommend this). If use of any electronics becomes distracting to other students, I reserve the right to discontinue the allowance of their use.

Assignments:

There will be 3-4 assignments and a final project. The assignments may (but do not have to) be completed with a partner of your choosing. It is a good idea for a partnership to have two people of different specialties or interests, for example a linguist and a computer scientist. The final project will be completed alone and will explore a topic of your choosing within the scope of this course. Further details about all assignments will be made available as they are assigned.

Grading:

Assignments will be submitted on Blackboard and graded according to the quality of the solution, including completeness and correctness. Written assignments will additionally be graded according to their quality as communicative artifacts. For assignments that are presented in class, quality of presentation will be incorporated into the grade.

It is expected that each person participate during each class. As discussed above, attendance is required.

Each exam question will be assigned a point value (generally some multiple of 3, depending on difficulty) and graded using the following scheme:

0 – Did not attempt / No serious attempt
1 – Mostly incorrect solution
2 – Somewhat incorrect solution
3 – Perfect solution

If a problem is worth a higher multiple of 3 points, then intermediate scores will be given as appropriate. The total points received on all questions will then be summed, divided by the points possible, and scaled as appropriate according to the percentages given below.

Assignments: 20%
Final Project: 30%
Exam 1: 15%
Exam 2: 15%
Final Exam: 20%
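
As a rough illustration of the arithmetic described above, the following Python sketch computes an exam percentage from per-question scores and then combines component percentages using the listed weights. The function names, the dictionary keys, and the sample scores are hypothetical, and the sketch assumes no additional scaling or curve is applied; the actual calculation used in the course may differ.

```python
# Weights from the list above, expressed as fractions of the course grade.
WEIGHTS = {
    "assignments": 0.20,
    "final_project": 0.30,
    "exam_1": 0.15,
    "exam_2": 0.15,
    "final_exam": 0.20,
}


def exam_percentage(questions):
    """Return the percentage earned on one exam.

    `questions` is a list of (points_earned, points_possible) pairs,
    one pair per question (a hypothetical representation).
    """
    earned = sum(e for e, _ in questions)
    possible = sum(p for _, p in questions)
    return 100.0 * earned / possible


def course_percentage(components):
    """Combine component percentages (each 0-100) using WEIGHTS."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Made-up example: questions worth 3, 6, and 3 points (9 of 12 earned).
exam_1 = exam_percentage([(3, 3), (4, 6), (2, 3)])
print(exam_1)  # 75.0
```

Passing a dictionary with one percentage per key in WEIGHTS to course_percentage gives the overall course percentage before any curve.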

The default grading for the course will be along the university’s standard grading curve:

A: 93-100
A-: 90-92
B+: 87-89
B: 83-86
B-: 80-82
C+: 77-79
C: 73-76
C-: 70-72
D+: 67-69
D: 60-66
E: 0-59

A more generous curve may be used, but should not be expected.
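
For reference, the default curve can be read as a simple lookup, sketched below in Python. The names GRADE_FLOORS and letter_grade are hypothetical, and the sketch assumes that fractional percentages are not rounded up (the syllabus does not say how they are handled) and that no more generous curve is in effect.

```python
# Lower bound of each letter grade on the default curve, highest first.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
    (0, "E"),
]


def letter_grade(percentage):
    """Map a course percentage (0-100) to a letter on the default curve.

    Assumption: a score must reach a grade's lower bound without
    rounding, so 92.9 maps to A-, not A.
    """
    for floor, letter in GRADE_FLOORS:
        if percentage >= floor:
            return letter
    raise ValueError("percentage must be non-negative")


print(letter_grade(85))  # B
```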

Schedule/Outline:

During the semester we aim to cover the following topics:

This syllabus and the course schedule are subject to change by the instructor. All changes and related justifications will be announced in class, and updates will be reflected in this web version.

Lecture slides will be maintained on Blackboard, but many lectures will include use of the whiteboard which may not be reflected in notes elsewhere.

Week 1
  Monday 1/22
    First day of class
    Syllabus; Course Overview
    Working with Text Intro
    Readings: SLP Chapter 1
    Be sure to answer office hours survey!
  Wednesday 1/24
    Finish Working with Text Intro
    Research Overview (for context)
    Readings: Begin looking at SLP Section 2.1
    Optional Readings: This Is Watson (on Blackboard)
  Friday 1/26
    Regular Expressions
    Readings: Finish SLP Section 2.1; play with sample regular expressions on regex101.
    Optional Readings: Weizenbaum's ELIZA Paper

Week 2
  Monday 1/29
    Regular Expressions (concluded)
    Readings: SLP 3rd Edition Draft Sections 2.2-2.3
  Wednesday 1/31
    Add deadline
    Writing your own chatbot
    Sample Python Chatbot
    Tokenization and Sentence Splitting
    Assignment 1 due 2/11, 11:59pm on Blackboard, demoed in class 2/12
    Readings: SLP Chapter 3 through the end of section 3.1; Section 3.8
    Optional Readings: Christiansen & Amon, More Than Words: The Role of Multiword Sequences in Language Learning and Use
  Friday 2/2
    Text Normalization
    Reading: SLP 3rd ed, Ch 4 through end of 4.1 (don't worry too much about the math!)
    Optional Reading: M.F. Porter, An Algorithm for Suffix Stripping

Week 3
  Monday 2/5
    Language Models; N-grams
    Readings: SLP 3rd Ed, Chapter 8 (again, don't get bogged down by the math!)
    Optional Readings: Johns and Jamieson, A Large-scale Analysis of Variance in Written Language, 2018; Dye, M., et al. Alternative Solutions to a Language Design Problem: The Role of Adjectives and Gender Marking in Efficient Communication. Topics in Cognitive Science, 2017 (on Blackboard)
  Wednesday 2/7
    Snow Day!
  Friday 2/9
    Drop deadline
    N-grams continued; Neural Networks

Week 4
  Monday 2/12
    Assignment 1 in-class demos
  Wednesday 2/14
    Neural Language Models Concluded
    Finite Automata Introduction
    Assignment 2 due 3/1 (originally 2/26, then 2/28), 11:59pm on Blackboard
    Readings: Sections 2.2-2.4
  Friday 2/16
    Finite State Transducers
    Morphological Analysis
    Readings: Section 3.2 through the end of 3.6

Week 5
  Monday 2/19
    Class Cancelled
    Readings: Chapter 5
  Wednesday 2/21
    Part of Speech Tagging
    Hidden Markov Models
  Friday 2/23
    Class Cancelled

Week 6
  Monday 2/26
    Guest lecture: Dr. Jonathan Bona
  Wednesday 2/28
    Exam 1
    Readings: SLP 3rd Edition, Chapter 17 through the end of 17.4
  Friday 3/2
    POS Tagging Concluded; Word Senses

Week 7
  Monday 3/5
    Word Senses; WordNet
    Readings: SLP 3rd Edition, Chapter 12 through end of 12.1; Chapter 14 through end of 14.3
    Optional Readings: Chomsky, N. "Three Models for the Description of Language" 1956 (on Blackboard)
  Wednesday 3/7
    Phrase Structure and Parsing Text
  Friday 3/9
    Parsing and Semantic Role Labeling
    Assignment 3 due 3/26 (originally 3/23), at the beginning of class

Week 8
  Monday 3/12
    No Class - Spring Break
  Wednesday 3/14
    No Class - Spring Break
  Friday 3/16
    No Class - Spring Break

Week 9
  Monday 3/19
    Tools for Natural Language Processing - GATE
    Readings: Read and follow along with Chapter 1 of the NLTK Book using Thonny or repl.it
    Final Project Description
  Wednesday 3/21
    Tools for Natural Language Processing - NLTK
    Trace of repl.it from class
  Friday 3/23
    Training Machine Learning Models in NLTK
    Sentiment Analysis

Week 10
  Monday 3/26
    Logic Introduction
    Readings: Peter Suber's Translation Tips (propositional logic section)
  Wednesday 3/28
    Logic, continued
    Translation of English sentences to Logic
    Project Proposals Due on Blackboard, 11:59pm
  Friday 3/30
    No Class - Easter Weekend

Week 11
  Monday 4/2
    Model Finding
    Readings: Peter Suber's Translation Tips (predicate logic sections)
    Withdraw Deadline
  Wednesday 4/4
    No Class - Quest Day
  Friday 4/6
    Exam 2

Week 12
  Monday 4/9
    Predicate Logic
  Wednesday 4/11
    Predicate Logic, concluded
    Non-Classical Logics
  Friday 4/13
    Non-Classical Logics, concluded
    Frames
    Readings: Frames and Case Grammar sections from Shapiro, S.C., ed. Encyclopedia of Artificial Intelligence, 2nd ed. (on Blackboard)

Week 13
  Monday 4/16
    Frames & Case Grammar
  Wednesday 4/18
    Project Progress Discussion
  Friday 4/20
    Graphs for Language Understanding
    Readings: SLP 3rd Edition, Section 29.2, Chapter 30

Week 14
  Monday 4/23
    A Return to Dialog and Discourse
    Extra Credit Assignment due 5/11, 11:59pm on Blackboard
  Wednesday 4/25
    No Class
  Friday 4/27
    Speech Recognition / Synthesis

Week 15
  Monday 4/30
    Project Presentations
  Wednesday 5/2
    Project Presentations
  Friday 5/4
    Last day of class
    Final Exam Study Guide
    Project Presentations
    Final Project Papers/Code Due

Finals Week
  Monday 5/7
    Final Exam 2-4pm, 175 Shineman

Academic Integrity:

While it is acceptable to discuss general approaches with your fellow students, the work you turn in must be your own. You may not turn in code found on the internet. If you have any problems doing the assignments, consult the instructor. Please be sure to read the webpage “Academic Integrity”, which spells out all the details of this and related policies. See my page on plagiarism for an explanation of what I consider cheating.

Disability Statement:

If you have a disabling condition that may interfere with your ability to successfully complete this course, please contact the Office of Disability Services at dss@oswego.edu or x3358.