Overview

Fold recognition was performed with the Target98 (SAM-T98) method [3],
run under SAM version 2.1.1 [1].  SAM-T98 is a refinement of the methods
developed by this group for CASP2 [2].  The method attempts to find and
multiply align a set of homologs to a given sequence, then builds a
hidden Markov model (HMM) from that multiple alignment.

The HMM is built and used in three steps.  First, a set of sequence
weights is determined from the alignment.  Next, modelfromalign builds
the model from the alignment and the sequence weights.  Finally, hmmscore
performs a local, all-paths scoring of the sequences, using
reversed-sequence normalization.
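
The idea behind reversed-sequence normalization can be sketched as
follows.  This is only an illustration, not the hmmscore implementation:
toy_raw_score is a hypothetical stand-in for an HMM score (higher is
better here, which is an assumption of the sketch), and the motif "ACDE"
is invented for the example.  The point is that a sequence is credited
only for what it scores beyond its own reversal, which has the same
length and composition.

```python
from typing import Callable


def toy_raw_score(seq: str, motif: str = "ACDE") -> int:
    """Hypothetical stand-in for a model score.

    Returns the best number of matching positions over all ungapped
    placements of `motif` on `seq`. A real HMM score would sum over
    all paths through the model instead.
    """
    best = 0
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        matches = sum(a == b for a, b in zip(window, motif))
        best = max(best, matches)
    return best


def reverse_normalized_score(seq: str, raw_score: Callable[[str], float]) -> float:
    """Score of the sequence minus the score of its reversal.

    The reversed sequence serves as a null sample with identical length
    and residue composition, so biases from those two properties cancel.
    """
    return raw_score(seq) - raw_score(seq[::-1])
```

A palindromic sequence always receives a normalized score of zero under
this scheme, since it is identical to its reversal.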

The weighting method, detailed in upcoming publications [3,4],
combines the Henikoffs' scheme [5], Dirichlet mixtures [6], and an
entropy method to set the final weights.
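
As an illustration of one ingredient only, the position-based weights of
the Henikoffs' scheme [5] can be sketched as below.  This is not the
SAM-T98 weighting, which also involves Dirichlet mixtures and an entropy
method.  The function name, the treatment of gap characters as ordinary
symbols, and the normalization of the weights to sum to one are
assumptions made for the sketch.

```python
from collections import Counter
from typing import List


def henikoff_weights(alignment: List[str]) -> List[float]:
    """Position-based sequence weights for equal-length aligned rows.

    In each column, a sequence receives 1 / (k * n), where k is the
    number of distinct symbols in the column and n is the number of
    sequences sharing this sequence's symbol. Contributions are summed
    over columns and then normalized to sum to 1.
    """
    if not alignment:
        return []
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    weights = [0.0] * len(alignment)
    for col in range(n_cols):
        column = [row[col] for row in alignment]
        counts = Counter(column)   # symbol -> number of rows with it
        k = len(counts)            # distinct symbols in this column
        for i, symbol in enumerate(column):
            weights[i] += 1.0 / (k * counts[symbol])

    total = sum(weights)
    return [w / total for w in weights]
```

For the alignment ["AC", "AC", "GT"], the two identical sequences share
half of the total weight and the divergent sequence receives the other
half, which is the intended effect of down-weighting redundant homologs.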

Alignment generation

The initial step uses BLASTP to search NRP twice: once to produce a set
of very close homologs, and once to produce a set of possible homologs.

The method then uses multiple iterations of a selection, training, and 
alignment procedure.  Each iteration involves an initial alignment, a set 
of search sequences, a threshold value, and a transition regularizer. 

The first iteration uses a single sequence (or seed alignment) as the
initial alignment and the close homologs found by BLASTP as the search
set.  The threshold is set very strictly, so that only good matches to
the sequence are considered.  This iteration uses a transition
regularizer designed to match the gap costs used by BLASTP.

On subsequent iterations, the input alignment is the output from the
previous iteration, the search set is the larger set of possible
homologs found by BLASTP, and the thresholds are gradually loosened.
The second through second-from-last iterations use a ``long-match''
transition regularizer, and the final iteration uses a transition
regularizer trained on FSSP alignments.
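
The iteration schedule described above can be summarized in a short
sketch.  Everything named here is hypothetical: the function
iteration_plan, the string labels, the number of iterations, and the
threshold values (arbitrary units, interpolated linearly) are
placeholders chosen for illustration, since the actual settings are not
given in this text.

```python
from typing import Dict, List, Union

Step = Dict[str, Union[int, float, str]]


def iteration_plan(n_iters: int,
                   strict_threshold: float,
                   loose_threshold: float) -> List[Step]:
    """Lay out inputs for each selection/training/alignment iteration.

    Iteration 1 starts from the seed and searches the close homologs;
    later iterations start from the previous output and search the
    larger set of possible homologs. Thresholds move from strict to
    loose (linear steps are an assumption of this sketch).
    """
    if n_iters < 2:
        raise ValueError("need at least a first and a final iteration")

    plan: List[Step] = []
    for i in range(n_iters):
        frac = i / (n_iters - 1)
        threshold = strict_threshold + frac * (loose_threshold - strict_threshold)
        if i == 0:
            initial = "seed sequence or seed alignment"
            search_set = "close_homologs"
            regularizer = "blastp_like_gap_costs"
        else:
            initial = f"output of iteration {i}"
            search_set = "possible_homologs"
            # Only the final iteration uses the FSSP-trained regularizer.
            regularizer = "fssp_trained" if i == n_iters - 1 else "long_match"
        plan.append({
            "iteration": i + 1,
            "initial_alignment": initial,
            "search_set": search_set,
            "threshold": threshold,
            "regularizer": regularizer,
        })
    return plan
```

Calling iteration_plan(4, 1.0, 4.0) yields one BLASTP-like step, two
long-match steps, and one FSSP-trained step, with the threshold moving
monotonically from the strict value to the loose one.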

References
[1] R. Hughey and A. Krogh, CABIOS 12(2): 95-107, 1996.
    http://www.cse.ucsc.edu/research/compbio/sam.html.  
[2] K. Karplus, K. Sjolander, C. Barrett, M. Cline, D. Haussler, R.
    Hughey, L. Holm, and C. Sander, Proteins: Structure, Function, and 
    Genetics, Suppl. 1, 134-9, 1997.
[3] K. Karplus, C. Barrett, and R. Hughey, Technical Report UCSC-CRL-98-06,
    Department of Computer Engineering, Univ. of California, Santa Cruz, 1998.
[4] J. Park, K. Karplus, C. Barrett, R. Hughey, D. Haussler, T. Hubbard,
    and C. Chothia, http://cyrah.med.harvard.edu/~jong/assess_final.html, 1998.
[5] S. Henikoff and J. C. Henikoff, JMB, vol 243, pp 574-578, Nov 1994.
[6] K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S.
    Mian, and D. Haussler, CABIOS 12(4):327-345, 1996.