BI524
Computational Foundations of Bioinformatics
Tu Th 1:30--3:00 in Higgins 425.

Office hours: Monday 5:00--6:00, Thursday 10:30--11:30 in Higgins 577, or by appointment.
Course description | Text | Grading policy | Academic Integrity Policy | Homework | Class Notes | Demos | Tests

Course description

Biology is increasingly dominated by high-throughput methods, which yield large data sets. Analyzing these data sets requires both existing public domain or commercial software and new algorithms implemented in a programming language. Bioinformatics is an interdisciplinary area concerned with the application of mathematics, statistics and programming to solve mainstream problems in biology.

In this course, you will learn how to write programs in the Python programming language in order to "parse" biological data, such as files of protein and RNA 3-dimensional conformations and annotated genomic data from NCBI (National Center for Biotechnology Information) and EMBL (European Molecular Biology Laboratory). You will also learn how to run compiled code, such as BLAST or the Vienna RNA Package, from within a script and parse its output; how to build a bioinformatics web server; and how to train and test a support vector machine for a bioinformatics classification problem, such as determining RNA polymerase binding sites within a genome.
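To give a rough idea of what "running compiled code from within a script and parsing the output" can look like, here is a minimal sketch. It is not taken from the course materials. It assumes Python 3.7 or later, and, so that it runs without BLAST installed, it uses a small stand-in command that prints made-up, tab-separated "hits". The function name `run_and_parse` and the three-column layout are illustrative assumptions only.

```python
import subprocess
import sys


def run_and_parse(command):
    """Run an external program and return its tab-separated output as rows."""
    # check=True raises an error if the program exits with a failure code.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    rows = []
    for line in result.stdout.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        rows.append(line.split("\t"))
    return rows


# Stand-in for a real tool such as BLAST (an assumption for illustration):
# a Python one-liner that prints two invented hits in the form
# query, subject, percent identity.
fake_tool = [
    sys.executable,
    "-c",
    r"print('query1\tsubjectA\t98.5'); print('query1\tsubjectB\t91.0')",
]

hits = run_and_parse(fake_tool)

# Pick the hit with the highest percent identity (third column).
best = max(hits, key=lambda row: float(row[2]))
print(best[1])
```

With a real program, `fake_tool` would be replaced by that program's own command line, and the column meanings would follow whatever output format the program documents.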

The goal of the course, which assumes no prior experience in computer programming, is to enable you to work on a UNIX platform, the most important operating system for bioinformatics research, and to write interpreted programs, called scripts, for the problems listed in the previous paragraph. We will focus principally on Python, a simple, elegant scripting language, and towards the end of the course will also cover aspects of Perl.

Although this course has no prerequisites, you may find it helpful to have already taken BI420 "Introduction to Bioinformatics", a non-programming introduction to bioinformatics, some databases and public domain tools. BI420 is by no means a requirement -- you can certainly take this course without first having learned about current biological databases and public domain tools. However, any biology major who wants to work with biological data beyond the limits imposed by public domain web servers will want to learn the techniques of "scripting" taught in this course.



Texts

Required Texts
  1. "Starting Out With Python", by Tony Gaddis, Pearson/Addison-Wesley Publishing Company, ISBN-13:978-0-321-53711-9 ISBN-10:0-321-53711-4 (2009).
  2. "Developing Bioinformatics Computer Skills", by Cynthia Gibas and Per Jambeck, O'Reilly & Associates, Inc. (2001), ISBN 1-56592-664-1.
  3. Perl run-time environment, documentation and tutorial: http://www.perl.org/.
  4. Python run-time environment, documentation and tutorial: http://www.python.org/.

Optional Texts
Reference list of good texts if you choose to go on in bioinformatics. (Do NOT purchase. This list is provided for those who get really interested in bioinformatics and would like some suggested texts for future reading.)

  1. "Beginning Perl for Bioinformatics: An Introduction to Perl for Biologists", by J. Tisdall, O'Reilly (2001).
  2. "Python Essential Reference", Second Edition, David M. Beazley, New Riders Publishing (a Prentice-Hall company), ISBN 0-7357-1091-0
    Excellent reference work with good glossary for finding Python syntax. See http://islab.cs.uchicago.edu/python/.
    If you'd really like to program efficiently in Python, then I've found this book to be indispensable (i.e. strongly recommended).
  3. "Bioinformatics: A practical guide to the analysis of genes and proteins", edited by A.D. Baxevanis and B.F.F. Ouellette, second edition, Wiley & Sons, Inc. (2001).
  4. "Learning the UNIX Operating System", Fourth Edition, by Jerry Peek, Grace Todino & John Strang, O'Reilly & Associates, Inc., ISBN: 1-56592-390-1
    Unix is the best platform for efficient work in bioinformatics, and this tutorial will help you learn it. Unix is not required in this introductory course, but since I work on Unix, all class examples will be demonstrated on a Linux platform rather than Macintosh or Windows. We will not spend class time covering Unix; however, if you plan to do research in computational biology, you'll need to learn Unix on your own.



Grading Policy

Homework, class participation 30%
Midterm 30%
Final Exam 40%

The grading policy is subject to change. Any change will be announced clearly and well in advance.

Boston College Academic Integrity Policy

Academic integrity is central to the mission of higher education. Please observe the highest standards of academic integrity in this course. Please review the standards and procedures that are published in the university catalog and on the web at http://www.bc.edu/offices/stserv/academic/resources/policy/#integrity. Make sure that the work you submit is in accordance with university policies. If you have any questions, please consult with me. Violations will be reported to the Deans' Office and reviewed by the College's Committee on Academic Integrity. This could result in failure in the course or even more severe sanctions.
