Abstract title: Using hidden Markov models to recognize protein folds
Abstract author: Kevin Karplus
Presenting author: Kevin Karplus
Institution: University of California, Santa Cruz
e-mail: karplus@soe.ucsc.edu
Invited? Yes (by Andrew Torda)
Symposium title: Protein Structure Prediction
Student poster? No

I don't have access at the moment to RTF or Word, so here is plain text:

Using hidden Markov models to recognize protein folds
Kevin Karplus

The protein-folding problem, in its purest form, is too difficult for us to solve in the next several years, but we need structure predictions now. One solution is to try to recognize the similarity between a target protein and one of the thousands of proteins whose structure has been determined experimentally. For very similar proteins, the relationships are easy to find and good models can be built by copying the backbone (and even some sidechains) from the homologous protein of known structure. For less similar proteins (in the ``twilight zone''), the fold-recognition problem is more challenging, but it is often possible to find useful similarities. Using evolutionary information helps enormously in recognizing remote relationships, and one convenient way to summarize a family of homologs is with a hidden Markov model (HMM). Homologs can be found and an HMM built by an iterated search, starting from a single target sequence. The resulting target HMM can be used to score the sequences of all proteins of known structure. Similarly, homologs can be found and HMMs built for template proteins of known structure and used to score the target sequence. Combining both target-model and template-library results reduces the false positive rate. Some further improvements can be made by predicting local structural properties of the target sequence (such as secondary structure or solvent accessibility) and adding these predictions to the HMM used to score the template sequences.

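The two-way scoring described above (a target model scoring template sequences, and template models scoring the target sequence) can be illustrated with a toy sketch. This is not the method used in the work described here. It replaces the HMM with a gapless position-specific profile (match states only, no insert or delete states), assumes a uniform background distribution, and combines the two directions by simple averaging, which the abstract does not specify. All function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_homologs, pseudocount=1.0):
    """Build per-column log-odds scores from gapless, equal-length sequences.

    Simplification (assumption): a real profile HMM also has insert and
    delete states and uses better priors than a flat pseudocount.
    """
    length = len(aligned_homologs[0])
    background = 1.0 / len(AMINO_ACIDS)  # uniform background (assumption)
    total = len(aligned_homologs) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_homologs)
        profile.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score(profile, sequence):
    """Best log-odds score over all gapless windows of the sequence.

    Sequences must contain only the 20 standard amino-acid letters.
    """
    n = len(profile)
    if len(sequence) < n:
        return float("-inf")
    return max(
        sum(profile[i][sequence[start + i]] for i in range(n))
        for start in range(len(sequence) - n + 1)
    )


def rank_templates(target_seq, target_homologs, templates):
    """Rank templates by the average of the two scoring directions.

    templates maps a name to (template_sequence, template_homologs).
    Averaging is an illustrative choice, not the author's stated rule.
    """
    target_profile = build_profile(target_homologs)
    ranked = []
    for name, (template_seq, template_homologs) in templates.items():
        # Target model scores the template sequence.
        forward = score(target_profile, template_seq)
        # Template model scores the target sequence.
        reverse = score(build_profile(template_homologs), target_seq)
        ranked.append((name, (forward + reverse) / 2.0))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

Calling rank_templates with a target sequence, its homologs, and a small template library returns the template names ordered from best to worst combined score. A template has to score well in both directions to rank highly, which is the intuition behind using both results to reduce false positives.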
Fold-recognition techniques based on these HMMs have performed quite well in blind prediction experiments (CASP2, CASP3, and CASP4) and are doing better than threading techniques based on pairwise potentials. New techniques for predicting new folds---especially David Baker's program Rosetta---are beginning to mature and may replace (or be merged with) fold-recognition methods in the next few years.