Predicting the 3D Structure of Proteins from just their Amino-Acid Sequences

Regents' Scholars mini-course

Kevin Karplus
University of California, Santa Cruz
Sept 2000

Bioinformatics is an exciting new field at the intersection of biology, chemistry, statistics, and computer science. The goal is to extract useful information from the enormous amount of data being collected by DNA sequencing projects (such as the Human Genome Project) and by new DNA chip technologies. Challenging tasks include finding the genes in the genome, predicting when genes will be turned on and off, and predicting the proteins that will be created and what their structures and functions are.

This class will focus on just one of these problems---predicting what compact three-dimensional form a protein molecule folds into, given only the sequence of amino acids that make up the protein.

No specific prior knowledge of statistics or biology will be assumed, but comfort with math is expected. We'll do some exercises using structure predictors that are publicly available on the world-wide web.

This page will contain notes and exercises for the course---including some preparation that can be done before the course starts. I have not had time to write up the exercises yet; in the meantime, it might be worth browsing through my home page or the research pages of the bioinformatics group. The transparencies that I'm using TWTh are available as a Postscript file.

Richard Hughey and I are working on a BS in Bioinformatics program, to start in 2000-2001. This has NOT been approved yet. We have a bubble chart for this that is also available in Postscript for printing.

I've been very busy all summer on the CASP4 protein-structure prediction experiment. This is a really fun experiment (it feels more like a contest, sometimes), in which prediction researchers have to predict protein structures and register their predictions---without knowing the true structures. It is extremely rare in science these days to have large-scale experimental tests of theories by requiring many predictions. Almost all the scientific literature includes only predictions that turned out to be correct---those that are not confirmed by experiment rarely get published. So all theories and prediction methods normally look good---the CASP experiment adds an important dose of reality checking to the field, and has helped the field advance significantly.

Kevin Karplus is an Associate Professor of Computer Engineering. He received his PhD in Computer Science from Stanford in 1983, though his earlier studies were in mathematics (MS Stanford 1976, BS Michigan State 1974). His research at UCSC was initially in CAD tools for VLSI design (mainly logic minimization), but since 1993 he has focussed on analyzing sequence data from the massive genome sequencing projects. In particular he has worked on finding distant relationships between protein sequences and predicting protein structure from sequence data. In addition to teaching Bioinformatics (CMPS 243), he currently teaches Technical Writing for Computer Scientists and Engineers (CMPE 185) and Digital Logic Design (CMPE 100).

Questions about page content should be directed to Kevin Karplus
Biomolecular Engineering
University of California, Santa Cruz
Santa Cruz, CA 95064
USA
karplus@soe.ucsc.edu
1-831-459-4250
318 Physical Sciences Building