COMP 4750: Introduction to Natural Language Processing

This course is an elective for the Smart Systems Stream and for the Data-centric Computing Stream.

The growing number of people communicating with computer applications (via stand-alone devices or over the internet) has led to a corresponding demand that this communication be carried out in natural human languages, as either written text or speech. This course introduces Natural Language Processing (NLP), including an integrated, systematic examination of the full range of rule-based and statistical techniques used in NLP.

Prerequisites: COMP 3600 or the former COMP 3719

Availability: This course is offered occasionally and will not be available every academic year.

Course Objectives

This course covers tasks involving human languages, such as speech recognition, text understanding, and keyword-based information retrieval, which underlie many modern computing applications and their interfaces. To be truly useful, such natural language processing must be both efficient and robust. The course introduces the algorithms and data structures used to solve key NLP tasks, including utterance understanding, utterance generation, and language acquisition, in both of the major algorithmic paradigms used today (rule-based and statistical). The emphasis is primarily on text-based processing, though speech-based processing will be addressed where possible.

Representative Workload
  • Assignments 20%
  • In-class Exams (2) 40%
  • In-class Presentations (2) 10%
  • Course Project 30%

Representative Course Outline
  • Overview of Natural Language Processing (3 hours)
  • Background: Linguistics and Language Processing (7 hours)
    • Overview of classical linguistics
    • Representations of natural language utterances, grammars, and lexicons
    • Implementation of processes on natural language representations
  • Utterance comprehension (6 hours)
  • Utterance production (3 hours)
  • Language acquisition (3 hours)
  • Special applications (4 hours)
    • Such as language-to-language translation, question answering, and text mining
  • Student presentations (5 hours)

Page last updated May 24th 2021