PFRMAT TS
TARGET T0052
AUTHOR 9070-5088-8627
REMARK 
REMARK Prediction date: Wednesday June 10, 1998
REMARK Group name: UCSC-compbio
REMARK Authors: Christian Barrett, Melissa Cline, Mark Diekhans, Kevin Karplus,
REMARK 	 David Haussler and Richard Hughey
REMARK University of California, Santa Cruz
REMARK 
METHOD 
METHOD UCSC Computational Biology
METHOD 
METHOD All experiments were performed with SAM version 2.1.1 [1], using a
METHOD refinement of the methods used by this group in CASP2 [2].
METHOD 
METHOD Overview of the method
METHOD 
METHOD Fold recognition was performed using the Target98 (SAM-T98) method
METHOD [3].  This method attempts to find and multiply align a set of
METHOD homologs to a given sequence, then create an HMM from that multiple
METHOD alignment.
METHOD 
METHOD First, a set of sequence weights is determined from the alignment.  Next,
METHOD modelfromalign builds the model from the alignment and the sequence
METHOD weights.  Finally, hmmscore performs a local, all-paths scoring of the
METHOD sequences, using reversed-sequence normalization.
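METHOD 
METHOD As a rough illustration of reversed-sequence normalization, the sketch
METHOD below subtracts the cost of the reversed sequence from the cost of the
METHOD sequence itself, so that more negative scores indicate better matches.
METHOD It is not hmmscore: the ungapped toy profile, the background value, and
METHOD the names profile_nll and reverse_normalized_score are assumptions made
METHOD for illustration, and the real scoring is local and all-paths over an HMM.

```python
import math

def profile_nll(profile, seq, background=0.05):
    """Toy cost: negative log-likelihood of seq under an ungapped profile.

    profile is a list of dicts mapping residue -> probability, one per
    position. Residues missing from a column get the (assumed) background
    probability. This stands in for a real HMM score.
    """
    return -sum(math.log(col.get(res, background))
                for col, res in zip(profile, seq))

def reverse_normalized_score(profile, seq):
    """Cost of seq minus cost of the reversed seq under the same model."""
    return profile_nll(profile, seq) - profile_nll(profile, seq[::-1])

# Hypothetical three-position profile strongly preferring "ACD".
toy_profile = [{"A": 0.9}, {"C": 0.9}, {"D": 0.9}]
matching_score = reverse_normalized_score(toy_profile, "ACD")
palindrome_score = reverse_normalized_score(toy_profile, "ACA")
```

METHOD A sequence that fits the model better than its own reversal gets a
METHOD negative score, while a palindromic sequence scores exactly zero.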
METHOD 
METHOD The weighting method, detailed in upcoming publications [3,4],
METHOD combines the Henikoffs' scheme [5], Dirichlet mixtures [6], and an
METHOD entropy method to set the final weights.
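METHOD 
METHOD The sketch below illustrates only the Henikoffs' position-based
METHOD component [5]; the Dirichlet mixture and entropy steps are not shown.
METHOD The function name henikoff_weights, the gap handling, and the final
METHOD normalization to sum to one are assumptions made for illustration and
METHOD are not taken from SAM.

```python
from collections import Counter

def henikoff_weights(alignment):
    """Position-based weights for equal-length aligned sequences.

    Each column gives a sequence 1 / (k * n), where k is the number of
    distinct residues in the column and n is the number of sequences that
    share this sequence's residue there. Assumption: a '-' in a column
    contributes nothing to that sequence's weight.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    weights = [0.0] * len(alignment)
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] != "-")
        k = len(counts)
        for i, row in enumerate(alignment):
            if row[col] != "-":
                weights[i] += 1.0 / (k * counts[row[col]])
    total = sum(weights)
    # Normalize so the weights sum to one (an illustrative choice).
    return [w / total for w in weights] if total else weights

# Small made-up alignment; the third sequence shares the most residues
# with the others and so receives the smallest weight.
toy_alignment = ["GYVGS", "GFDGF", "GYDGF", "GYQGG"]
toy_weights = henikoff_weights(toy_alignment)
```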
METHOD 
METHOD Alignment generation
METHOD 
METHOD The initial step uses WU-BLAST (BLASTP version 2.0aMP from Washington
METHOD University) to select potential homologs from the non-redundant protein
METHOD database (NRP).  NRP is searched twice to produce two sets of homologs:
METHOD one of very close homologs (E<0.00003) and one of possible homologs
METHOD (E<500).
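METHOD 
METHOD The two E-value cutoffs can be pictured with the sketch below.  It
METHOD filters one hypothetical list of (identifier, E-value) hits twice rather
METHOD than running two database searches, and the names split_hits, CLOSE_E,
METHOD and POSSIBLE_E as well as the example hits are assumptions made for
METHOD illustration.

```python
CLOSE_E = 0.00003    # cutoff for very close homologs (from the text)
POSSIBLE_E = 500.0   # cutoff for possible homologs (from the text)

def split_hits(hits):
    """Return (close, possible) identifier lists from (id, e_value) pairs.

    Every close homolog also appears in the possible set, since the
    looser cutoff includes the stricter one.
    """
    close = [seq_id for seq_id, e_value in hits if e_value < CLOSE_E]
    possible = [seq_id for seq_id, e_value in hits if e_value < POSSIBLE_E]
    return close, possible

# Made-up hits, not results from the actual search.
example_hits = [("seqA", 1e-30), ("seqB", 0.2), ("seqC", 750.0)]
close_set, possible_set = split_hits(example_hits)
```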
METHOD 
METHOD The Target98 method then uses multiple iterations of a selection,
METHOD training, and alignment procedure.  Each iteration needs an initial
METHOD alignment, a set of sequences to search, a threshold value, and a
METHOD transition regularizer.  Alignments in the library were built with 4
METHOD iterations, with thresholds -40, -30, -24, and -16, but the target
METHOD alignment was built with 6 iterations, with thresholds -50, -40, -30,
METHOD -22, -16, and -14.
METHOD 
METHOD On the first iteration the single sequence (or seed alignment) passed
METHOD to the method is used as the initial alignment and the close homologs
METHOD found by WU-BLAST are used as the search set.  The threshold is set
METHOD very strictly, so that only very strong matches to the sequence are
METHOD considered.  This iteration uses a transition regularizer that was set
METHOD up to approximate the gap costs used by WU-BLAST.
METHOD 
METHOD On subsequent iterations the input alignment is the output from the
METHOD previous iteration and the search set is the larger set of possible
METHOD homologs found by WU-BLAST.  The thresholds are gradually loosened.
METHOD For the second through second-from-last iteration, a ``long-match''
METHOD transition regularizer is used, and for the final iteration a
METHOD transition regularizer trained on FSSP structural alignments is used.
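METHOD 
METHOD The per-iteration settings described above can be summarized by the
METHOD sketch below, which only builds a table of settings and does not run
METHOD SAM.  The threshold lists come from the text; the function name
METHOD iteration_plan and the string labels for search sets and regularizers
METHOD are hypothetical.

```python
LIBRARY_THRESHOLDS = [-40, -30, -24, -16]
TARGET_THRESHOLDS = [-50, -40, -30, -22, -16, -14]

def iteration_plan(thresholds):
    """List the search set and regularizer used at each iteration."""
    plan = []
    last = len(thresholds) - 1
    for i, threshold in enumerate(thresholds):
        if i == 0:
            # First iteration: close homologs, regularizer set up to
            # match the WU-BLAST gap costs.
            search_set, regularizer = "close_homologs", "blast_gap_costs"
        elif i == last:
            # Final iteration: regularizer trained on FSSP alignments.
            search_set, regularizer = "possible_homologs", "fssp_trained"
        else:
            # Intermediate iterations: the "long-match" regularizer.
            search_set, regularizer = "possible_homologs", "long_match"
        plan.append({
            "iteration": i + 1,
            "threshold": threshold,
            "search_set": search_set,
            "regularizer": regularizer,
        })
    return plan

target_plan = iteration_plan(TARGET_THRESHOLDS)
library_plan = iteration_plan(LIBRARY_THRESHOLDS)
```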
METHOD 
METHOD References
METHOD [1] R. Hughey and A. Krogh, CABIOS 12(2): 95-107, 1996.
METHOD     http://www.cse.ucsc.edu/research/compbio/sam.html.  
METHOD [2] K. Karplus, K. Sjolander, C. Barrett, M. Cline, D. Haussler, R.
METHOD     Hughey, L. Holm, and C. Sander, Proteins: Structure, Function, and 
METHOD     Genetics, Suppl. 1, 134-139, 1997.
METHOD [3] K. Karplus, C. Barrett, and R. Hughey, Technical Report UCSC-CRL-98-06,
METHOD     Department of Computer Science, University of California, Santa Cruz, 1998.
METHOD [4] J. Park, K. Karplus, C. Barrett, R. Hughey, D. Haussler, T. Hubbard,
METHOD     and C. Chothia, http://cyrah.med.harvard.edu/~jong/assess_final.html, 1998.
METHOD [5] S. Henikoff and J. C. Henikoff, JMB, vol 243, pp 574-578, Nov 1994.
METHOD [6] K. Sjolander, K. Karplus, M. P. Brown, R. Hughey, A. Krogh, I. S.
METHOD     Mian, and D. Haussler, CABIOS, vol 12, pp 327-345, Aug 1996.
METHOD 
METHOD Results
METHOD 
METHOD The Target98 method found no homologs in NRP for T52 other than
METHOD itself, so the model built from the Target98 alignment is unlikely
METHOD to be effective at finding remote homologs.
METHOD 
METHOD The top scoring possible homologs in PDB were as follows:
METHOD   chain   score   found by model
METHOD   1pmd    -6.21   t52
METHOD   1hsq    -4.51   1hsq library model
METHOD   1pdgA   -3.25   t52
METHOD   1broA   -2.83   t52
METHOD 
METHOD 1pmd received a fairly good score, but its structural homologs 1btl,
METHOD 2bltA, and 3pte did not, and -6.2 is in the range where the probability
METHOD of a match being a false positive is about 70%.  The alignment of T52
METHOD to 1pmd was only moderately compact and included an unsupported helix
METHOD at one end.  In addition, the two known cystine bridges did not map to
METHOD close positions in 1pmd, so we decided that this match was unlikely to
METHOD be correct.
METHOD 
METHOD The alignment of T52 to 1hsq seemed to match only a tiny fragment:
METHOD 	WQPSNFIE
METHOD 	WFPSNYVE
METHOD with a few other very short matches scattered along the chain.
METHOD This is a motif with a strand and tight turn or short helix.  While it
METHOD is an interesting bit of secondary structure, it is too small to be
METHOD suitable for fold recognition.  There is no similar motif in 1pmd.
METHOD 
MODEL  1
PARENT NONE
TER
END