Carnegie Mellon University

Great Ideas in Computational Biology

Course Number: 02-251

This 12-unit course provides an introduction to many of the great ideas that have formed the foundation for the transformation of the life sciences into a fully-fledged computational discipline. This gateway course is intended as a first exposure to computational biology for first-year undergraduates in the School of Computer Science, although it is open to other quantitatively and computationally capable students who are interested in exploring the field. By completing this course, students will encounter a handful of fundamental algorithmic approaches deriving straight from very widely cited primary literature, much of which has been published in recent years. The course also introduces basic concepts in statistics, mathematics, and machine learning necessary to understand these approaches. Many of the ideas central to modern computational biology have resulted in widely used software that is applied to analyze (often very large) biological datasets; an important feature of the course is that students will be exposed to this software in the context of compelling biological problems.

Key Topics

  • Genome assembly
  • Sequence alignment
  • Read mapping
  • Hidden Markov models for sequence alignment
  • Motif finding and expectation maximization
  • Metagenomics
  • Phylogenetics
  • Variant detection and population genetics
  • Machine learning and automated science
  • Protein structure prediction and proteomics
  • Network biology
  • RNA sequencing
  • Neural network theory
  • Evolutionary game theory
  • Algorithms in nature
  • Biological image analysis

Semester(s): Spring
Units: 12
Prerequisite(s): (15112 or 02201) and (21127 or 21128 or 15151)
Location(s): Pittsburgh



Learning Objectives

By completing this course, a student will:
  • Appreciate that computation is inseparable from modern biology and that some ideas arising in computational biology (such as suffix arrays, database search, and local alignment) are so fundamental that they have influenced computer science as well.
  • Transform poorly defined inquiries about biological data analysis into well-defined computational problems that sometimes make simplifying assumptions.
  • Learn and implement fundamental algorithm- and method-design approaches to a wide array of biological problems. These techniques include: dynamic programming, genetic algorithms, local search, statistical significance testing, deep learning, and expectation maximization.
  • Design, implement, and execute a computational biology project that provides insights at the edge of known biology and that exploits computational techniques learned in the course. Communicate one’s scientific findings via both an oral presentation and an essay.

Assessment Structure

Coursework will consist of the following components: 

  • Homework assignments (40% of grade)

Homework assignments will comprise two parts.

  1. Automatically graded programming assignments (20% of grade) will ask you to implement many of the algorithms forming the great ideas for the course. Programming assignments must be completed on your own (unless noted otherwise) and turned in to the autograder by a given deadline.
  2. Theory questions (20% of grade) will be provided before each lecture to help students review the material, build their understanding, and arrive at class prepared.

  • Examinations (30% of grade)

The two midterms and final exam will test knowledge of the material from the class.

Midterm 1 (15% of grade)

Midterm 2 (15% of grade)

The midterms will not be cumulative: midterm 2 will cover material encountered after midterm 1. That said, later material in the class builds upon earlier material, so it remains important to know the earlier material.

The final will be comprehensive, i.e., it will cover all the material from the class.

  • Project (20% of grade)

We want this course to empower you to find your own great ideas in computational biology. Accordingly, you will complete a project analyzing a biological data set. We will provide more details about the project as the course progresses.

The final week of the course will feature in-class project presentations. You will be graded on this presentation as well as on a write-up describing your work.

  • Attendance and participation (10% of grade)

Attendance will be taken, and we will have occasional in-class exercises that reinforce the concepts we have covered. These exercises will not be graded, but participation is expected in order to receive full credit for that day.

You are allowed three “dropped” attendance grades without penalty. These can be used for any purpose.

Last updated: April 2021