Overview

Fold recognition was performed with the Target98 (SAM-T98) method [3], run under SAM version 2.1.1 [1]. SAM-T98 is a refinement of the methods developed by this group for CASP2 [2]. The method attempts to find and multiply align a set of homologs of a given sequence, and then to build an HMM from that multiple alignment. First, a set of sequence weights is determined from the alignment. Next, modelfromalign is used to build the model from the alignment and the sequence weights. Finally, hmmscore performs a local, all-paths scoring of the sequences, using a reversed-sequence normalization feature. The weighting method, detailed in upcoming publications [3,4], combines the Henikoffs' scheme [5], Dirichlet mixtures [6], and an entropy method to set the final weights.

Alignment generation

The initial step uses BLASTP to search NRP twice: once to produce a set of very close homologs, and once to produce a set of possible homologs. The method then runs multiple iterations of a selection, training, and alignment procedure. Each iteration takes an initial alignment, a set of search sequences, a threshold value, and a transition regularizer.

The first iteration uses a single sequence (or seed alignment) as the initial alignment and the close homologs found by BLASTP as the search set. The threshold is set very strictly, so that only good matches to the sequence are considered. This iteration uses a transition regularizer designed to match the gap costs used by BLASTP.

On subsequent iterations, the input alignment is the output of the previous iteration, the search set is the larger set of possible homologs found by BLASTP, and the thresholds are gradually loosened. The second through second-from-last iterations use a "long-match" transition regularizer, and the final iteration uses a transition regularizer trained on FSSP alignments.

References

[1] R. Hughey and A. Krogh, CABIOS 12(2):95-107, 1996. http://www.cse.ucsc.edu/research/compbio/sam.html
[2] K. Karplus, K. Sjolander, C. Barrett, M. Cline, D. Haussler, R. Hughey, L. Holm, and C. Sander, Proteins: Structure, Function, and Genetics, Suppl. 1:134-139, 1997.
[3] K. Karplus, C. Barrett, and R. Hughey, Technical Report UCSC-CRL-98-06, Department of Computer Engineering, Univ. of California, Santa Cruz, 1998.
[4] J. Park, K. Karplus, C. Barrett, R. Hughey, D. Haussler, T. Hubbard, and C. Chothia, http://cyrah.med.harvard.edu/~jong/assess_final.html, 1998.
[5] S. Henikoff and J. C. Henikoff, JMB 243:574-578, Nov 1994.
[6] K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S. Mian, and D. Haussler, CABIOS 12(4):327-345, 1996.
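
Illustrative sketch

The iteration rules given under Alignment generation can be written down as a small schedule builder. This is only a sketch of the bookkeeping, not SAM code: the names Iteration and build_schedule, the string labels for the search sets and regularizers, and any threshold values passed in are hypothetical placeholders, since the text does not specify the number of iterations or the threshold values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Iteration:
    index: int              # 1-based iteration number
    initial_alignment: str  # label for the alignment fed into this iteration
    search_set: str         # label for the BLASTP-derived sequence set searched
    threshold: float        # placeholder value; the text gives no numbers
    regularizer: str        # label for the transition regularizer


def build_schedule(thresholds: List[float]) -> List[Iteration]:
    """Return one Iteration per threshold, following the rules in the text.

    The caller supplies the thresholds (strictest first); this function only
    assigns the initial alignment, search set, and regularizer per iteration.
    """
    if len(thresholds) < 2:
        raise ValueError("need at least a first and a final iteration")
    last = len(thresholds)
    schedule = []
    for i, threshold in enumerate(thresholds, start=1):
        if i == 1:
            # First iteration: seed input, close homologs, BLASTP-like gap costs.
            alignment = "seed"
            search_set = "close_homologs"
            regularizer = "blastp_gap_costs"
        else:
            # Later iterations: previous output, larger set of possible homologs.
            alignment = f"output_of_iteration_{i - 1}"
            search_set = "possible_homologs"
            # The final iteration uses the FSSP-trained regularizer;
            # the ones in between use the long-match regularizer.
            regularizer = "fssp_trained" if i == last else "long_match"
        schedule.append(Iteration(i, alignment, search_set, threshold, regularizer))
    return schedule
```

For example, build_schedule([1e-5, 1e-3, 1e-2, 1e-1]) (made-up thresholds) yields four iterations whose regularizers are, in order, blastp_gap_costs, long_match, long_match, and fssp_trained.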