COMP 3550: Introduction to Bioinformatics

This course is an elective for the Data-centric Computing concentration.

This interdisciplinary introductory course in bioinformatics is designed for both Computer Science and Biology students and serves as a bridge between the two disciplines. It focuses on the fundamental concepts and ideas behind existing bioinformatics tools and on their biological applications. Its purpose is to give students hands-on experience with the major computational approaches applied to a wide variety of bioinformatics problems.

Prerequisites: Biology 1001; one of COMP 1001, COMP 1002 or COMP 1510; and 6 credit hours in Computer Science or Biology courses at the 2000 level or above, excluding Biology 2040, 2041 and 2120; or permission of the course instructor.

Availability: This course is usually offered once per year, in Fall or Winter.

Course Objectives

Bioinformatics deals with the development and application of computational methods to address biological problems. The course focuses on the fundamental concepts and ideas behind existing bioinformatics tools and on their biological applications. It provides hands-on experience in applying bioinformatics software tools and online databases to analyze experimental biological data, and it introduces scripting tools commonly used to automate biological data analysis tasks.

Biology students will appreciate the impact of these approaches on addressing biological questions and will gain insight into their strengths and limitations. Computer Science students will appreciate the practical use of concepts they have been taught in other courses and, most importantly, the challenges posed by biological questions and the need for robust algorithms that can handle the very large, noisy datasets typical of biology. Computer scientists and biologists will both recognize the wide diversity of questions addressed by bioinformatics applications. Many industry and research jobs now require cross-disciplinary collaboration. With this course, students will begin to appreciate the interdisciplinary nature of bioinformatics and the contributions of people outside their field of study.

Representative Workload

Assignments and Projects 25%
Lab Work and Quizzes 20%
In-class Exam 25%
Final Exam 30%

Representative Course Outline

Introduction

What is Bioinformatics?
Why is Bioinformatics required?
Importance of interdisciplinary collaboration

Sequences

Why compare sequences?
Sequence similarity
Where to look for information about a sequence
Sequence alignment: Pairwise and multiple
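
As an illustration of the pairwise alignment topic above, the following is a minimal Python sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch approach). The function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration; they are not taken from the course material, and the tools used in the labs apply more elaborate scoring schemes.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of sequences a and b.

    Scoring values are illustrative assumptions, not course-specified.
    """
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap.
            up = prev[j] + gap
            # Align b[j-1] with a gap.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Only two rows of the scoring table are kept in memory at a time, which is enough to obtain the score; recovering the alignment itself would require keeping the full table or traceback pointers.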

Genomics

How are genomes sequenced?
How are genomes annotated?
Genomic variation

Gene expression

How is gene expression measured?
Pre-processing the data: denoising and normalization
Differential analysis

Interpreting a list of genes

Gene functional annotation - Gene Ontology (GO)
Finding over-represented gene functions in gene lists
Other sources of annotation
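
To make the idea of over-represented gene functions concrete, here is a minimal Python sketch of a one-sided hypergeometric test, one common statistic for this kind of analysis. The function and parameter names are hypothetical, and the enrichment tools used in the course may rely on different statistics and on multiple-testing corrections that are not shown here.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Probability of seeing at least `hits` annotated genes in the sample.

    population: number of genes in the background set
    annotated:  number of background genes carrying the annotation
    sample:     number of genes in the list of interest
    hits:       number of genes in the list carrying the annotation
    (All names are illustrative assumptions.)
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    # Sum the hypergeometric probabilities for k = hits, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 40 of 1000 background genes are annotated,
    # and 8 of a 50-gene list carry the annotation.
    print(enrichment_p_value(1000, 40, 50, 8))
```

A small p-value suggests that the annotation appears in the gene list more often than expected by chance under random sampling without replacement.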

Gene function prediction

Proteomics

Protein Interaction Networks
Protein Domains
How are proteins measured and identified?

Transcriptomics

Motif finding
Determining binding preferences
Inferring regulatory networks

Metabolomics

Detection and identification of metabolites
Human metabolome project

Labs

Students are expected to attend a weekly lab session and to submit a lab report or complete a lab quiz at the end of each lab.

Script programming and using bioinformatics libraries (BioPerl)
Sequences

Using BLAST, BLAT
Using alignment tools (such as: ProbCons, M-Coffee)

Working with sequenced genomes

Ensembl, BioMart, UCSC Genome Browser
Linking your own data to a genome browser

Analysis of gene expression data using existing tools (such as: Babelomics, GeneXPress, GenePattern)
Annotating a list of genes with functional annotation
Using over-representation or enrichment analysis tools (such as: GSEA, DAVID, GenMAPP, GoMiner)
Using gene function prediction systems (such as: GeneMANIA, FuncBase, NBrowse, STRING, FunCoup)
Using motif finding tools on a set of sequences (such as: MEME, AlignACE)
Using regulatory network prediction systems (such as: COALESCE, Allegro)

Notes

Credit cannot be obtained for both Computer Science 3550 and Biology 3951.

Page last updated May 24th 2021
