The SAM-T08 hand predictions use methods similar to SAM-T06 in CASP7. We start with a fully automated method (implemented as the SAM-T08-server):

Use the SAM-T2K, SAM-T04, and SAM-T06 methods for finding homologs of the target and aligning them.

Make local structure predictions using neural nets and the multiple alignments. These neural nets have been newly trained for CASP8 with an improved training protocol. The neural nets for the 3 different multiple sequence alignments are independently trained, so combining them should offer improved performance.

We currently use 15 local-structure alphabets:

  STR2            an extended version of DSSP that splits the beta strands into
                  multiple classes (parallel/antiparallel/mixed, edge/center)
  STR4            an attempt at an alphabet like STR2, but not requiring DSSP.
                  This alphabet may be trying to make some irrelevant
                  distinctions as well.
  ALPHA           a discretization of the alpha torsion angle:
                  CA(i-1), CA(i), CA(i+1), CA(i+2)
  BYS             a discretization of Ramachandran plots, due to Bystroff
  PB              de Brevern's protein blocks
  N_NOTOR, N_NOTOR2, O_NOTOR, O_NOTOR2
                  alphabets based on the torsion angle of backbone hydrogen
                  bonds
  N_SEP, O_SEP    alphabets based on the separation of donor and acceptor for
                  backbone hydrogen bonds
  CB_burial_14_7  a 7-state discretization of the number of C_beta atoms in a
                  14 Angstrom radius sphere around the C_beta
  near-backbone-11
                  an 11-state discretization of the number of residues
                  (represented by near-backbone points) in a 9.65 Angstrom
                  radius sphere around the sidechain proxy spot for the residue
  DSSP_EHL2       CASP's collapse of the DSSP alphabet

DSSP_EHL2 is not predicted directly by a neural net, but is computed as a weighted average of the other backbone alphabet predictions.

We make 2-track HMMs with each alphabet, with the amino-acid track having a weight of 1 and the local-structure track having a weight of 0.1 (for backbone alphabets) or 0.3 (for burial alphabets).
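The ALPHA alphabet above is a discretization of a virtual torsion angle over four consecutive CA atoms. A minimal sketch of that computation follows; the dihedral formula is the standard atan2 form, but the equal-width bins and letter labels are illustrative, not the actual ALPHA alphabet boundaries:

```python
import math

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four 3-D points
    (for ALPHA: the CA atoms of residues i-1, i, i+1, i+2)."""
    def sub(a, b):   return [a[k] - b[k] for k in range(3)]
    def dot(a, b):   return sum(x * y for x, y in zip(a, b))
    def cross(a, b): return [a[1]*b[2] - a[2]*b[1],
                             a[2]*b[0] - a[0]*b[2],
                             a[0]*b[1] - a[1]*b[0]]
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)      # normals of the two planes
    b1_hat = [x / math.sqrt(dot(b1, b1)) for x in b1]
    m1 = cross(n1, b1_hat)
    return math.degrees(math.atan2(dot(m1, n2), dot(n1, n2)))

def alpha_letter(angle_deg, n_bins=12):
    """Map a torsion angle to one of n_bins letters.
    Equal-width bins are a placeholder, not the real ALPHA boundaries."""
    idx = int((angle_deg + 180.0) % 360.0 / (360.0 / n_bins))
    return chr(ord('A') + idx)
```

Running the alphabet over a whole chain just slides this four-CA window along the backbone, emitting one letter per interior residue.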
We use these HMMs to score a template library of about 14000 (t06), 16000 (t04), or 18000 (t2k) templates. The template libraries are expanded weekly, but old template HMMs are not rebuilt. The target HMMs are used to score consensus sequences for the templates, to get a cheap approximation of profile-profile scoring, which does not yet work in the SAM package. We also used single-track HMMs to score not just the template library, but a non-redundant copy of the entire PDB. This scoring is done with real sequences, not consensus sequences.

All the target HMMs use a new calibration method that provides more accurate E-values than before, and that can be used even with local-structure alphabets that used to give us trouble (such as protein blocks). One-track HMMs built from the template-library multiple alignments were used to score the target sequence. Later this summer, we hope to be able to use multi-track template HMMs, but we have not had time to calibrate such models while keeping the code compatible with the old libraries, so the template libraries currently use old calibrations, with somewhat optimistic E-values.

The logs of all the E-values were combined in a weighted average (with rather arbitrary weights, since we still have not taken the time to optimize them), and the best templates were ranked. Alignments of the target to the top templates were made using several different alignment settings in the SAM alignment software. Fragments (short 9-residue alignments for each position) were generated using SAM's "fragfinder" program and the 3-track HMM that tested best for alignment.

Residue-residue contact predictions are made using mutual information, pairwise contact potentials, joint entropy, and other signals combined by a neural net. Two different neural-net methods were used, and the results were submitted separately. CB-CB constraints were extracted from the alignments, and a combinatorial optimization was done to choose a most-believable subset.
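The template ranking step, combining log E-values from the different scoring runs in a weighted average, can be sketched as below. The method names and weights are placeholders standing in for the authors' (admittedly arbitrary and unoptimized) weights:

```python
import math

# Hypothetical per-method weights; the real SAM-T08 weights are described
# in the text as "rather arbitrary" and not yet optimized.
WEIGHTS = {"t2k": 1.0, "t04": 1.0, "t06": 1.0}

def combined_score(evalues):
    """Weighted average of log E-values over the methods that scored a
    template; more negative means a more significant combined hit."""
    num = sum(WEIGHTS[m] * math.log(e) for m, e in evalues.items())
    den = sum(WEIGHTS[m] for m in evalues)
    return num / den

def rank_templates(scores):
    """scores: {template_id: {method: E-value}} -> template ids, best first."""
    return sorted(scores, key=lambda t: combined_score(scores[t]))
```

Averaging in log space rather than averaging raw E-values keeps one very good hit from a single method from completely dominating the consensus.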
The "undertaker" program (named because it originally optimized burial) is then used to try to combine the alignments and the fragments into a consistent 3D model. No single alignment or parent template was used as a frozen core, though in many cases one had much more influence than the others. The alignment scores were not used by undertaker; they were used only to pick the set of alignments and fragments that undertaker would see.

The cost functions used by undertaker rely heavily on the alignment constraints, on helix and strand constraints generated from the secondary-structure predictions, and on the neural-net predictions of local properties that undertaker can measure. The residue-residue contact predictions are also given to undertaker, but with less weight. A number of built-in cost functions (breaks, clashes, burial, ...) are also included.

The automatic script runs the undertaker-optimized model through gromacs (to fix small clashes and breaks) and repacks the sidechains using Rosetta, but these post-undertaker optimizations are not included in the server predictions. They can be used in subsequent re-optimization.

After the automatic prediction is done, we examine it by hand and try to fix any flaws that we see. This generally involves rerunning undertaker with new cost functions, increasing the weights for features we want to see and decreasing the weights where we think the optimization has gone overboard. Sometimes we add new templates or remove ones that we think are misleading the optimization process. We often do "polishing" runs, in which all the current models are read in and optimization with undertaker's genetic algorithm is done with high crossover.
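The weighted-sum structure of such a combined cost function can be illustrated with a toy sketch. The flat-bottomed distance penalty and the component weights here are our own illustration, not undertaker's actual cost terms:

```python
def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def constraint_cost(coords, constraints):
    """coords: {residue: (x, y, z)}; constraints: (i, j, dmax, weight).
    Flat-bottomed penalty: zero while residues i and j are within dmax
    Angstroms, growing linearly with the violation beyond that."""
    return sum(w * max(0.0, distance(coords[i], coords[j]) - dmax)
               for i, j, dmax, w in constraints)

def total_cost(model, components):
    """components: list of (weight, cost_fn) pairs, e.g. alignment
    constraints with high weight, contact predictions with low weight,
    plus built-in terms (breaks, clashes, burial, ...)."""
    return sum(w * fn(model) for w, fn in components)
```

Hand intervention in this framework is just editing the weight attached to each component and re-running the optimizer, which matches the "increase weights for features we want, decrease where optimization went overboard" workflow described above.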
Some improvements in undertaker include better communication with SCWRL for initial model building from alignments (now using the standard protocol that identical residues have fixed rotamers, rather than being reoptimized by SCWRL), more cost functions based on the neural-net predictions, multiple constraint sets (for easier weighting of the importance of different constraints), and some new conformation-change operators (Backrub and BigBackrub).

We also created model-quality-assessment methods for CASP8, which we are applying to the server predictions. We do two optimizations from the top 10 models with two of the MQA methods, and consider these models as possible alternatives to our natively generated models.

For this REFINEMENT model, we did a standard prediction for the residues included in the model, then tried optimizing both from the predicted model and from the provided starting point. Since we were directed to focus on Y85-G92, I also tried cutting and pasting that region from other models (particularly server models). This provided slightly different starting points and a bit more flexibility in rebuilding that loop. My tools are not optimized for fine-grain placement of atoms, so I'm not sure that I can make any improvement over a 1.34 Angstrom CA_RMSD model.

All the models submitted have a disulfide constraint: C79-C101. The CYS residues seemed too close not to be interacting, though with this being part of a cytosolic human protein a disulfide somehow seems wrong. Since the cys are on the edge of the model, I wonder whether a metal-binding site in the protein has somehow been broken by taking out just this fragment. A disulfide here could correct for the missing metal.
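The "too close not to be interacting" judgment behind the C79-C101 constraint amounts to a distance check. A toy version follows; the ~5 Angstrom CB-CB cutoff is a rough rule of thumb we are assuming for illustration, not a threshold taken from SAM-T08:

```python
def distance(a, b):
    """Euclidean distance between two 3-D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def plausible_disulfide(cb_i, cb_j, max_cb_cb=5.0):
    """Crude screen: CB-CB distances under about 5 Angstroms are
    geometrically compatible with a disulfide bridge (the S-S bond
    itself is ~2.05 Angstroms). Cutoff is an assumption, not from
    the SAM-T08 pipeline."""
    return distance(cb_i, cb_j) <= max_cb_cb
```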
Model
1  TR432.try7-opt3.pdb  # < chimera-try6-try5
   # best undertaker score (with try3,try4,try7.costfcn)
   # chimera-try6-try5: mostly from TR432.try6-opt3.gromacs0.repack-nonPC.pdb
   #   I111-R130 is from try5-opt3
   # chimera-try6-try5 was an attempt to use the tighter packing of the
   # C-terminal helix in try5 with the more standard loop in try6.
2  TR432.try7-opt3.gromacs0.repack-nonPC.pdb  # < chimera-try6-try5
   # best rosetta energy
   # Rosetta and undertaker disagree somewhat on clashes, so running an
   # undertaker-produced model through gromacs energy minimization to relieve
   # tiny clashes, then repacking the sidechains (except PRO and CYS) with
   # Rosetta produces the best Rosetta energy scores, even though few atoms
   # move much.
3  TR432.try6-opt3.pdb  # < chimera-try4-MULTICOM
   # chimera-try4-MULTICOM: mostly TR432.try4-opt3.repack-nonPC.pdb
   #   A82-L95 from MULTICOM-REFINE_TS2
   # This chimera was an attempt to get a slightly different loop as a
   # starting point for further optimization. Sometimes this sort of
   # cut-and-paste introduces breaks into the backbone and slightly different
   # shapes for the fragments, kicking the optimization out of a local
   # optimum and allowing it to find a different one.
4  TR432.try5-opt3.pdb  # < try3-opt3 < try2-opt3.gromacs0.repack-nonPC < chimera-init-try1
   # chimera-init-try1: mostly TR432.pdb
   #   Y85-R97 from try1-opt3
   # try1-opt3 was the automatically generated prediction for TR432. It had a
   # rather different idea of what the interesting loop should be. Initially
   # I favored this model, but it looks like the loop now blocks access to
   # the putative active site (Y43, N81, Y85, N86), so I downgraded the
   # model. The try5 run had some extra constraints to try to pull the
   # C-terminal helix in tighter, which were fairly successful, so I copied
   # that portion of the model into chimera-try6-try5 to make the final
   # model.
5  TR432.try4-opt3.repack-nonPC.pdb  # < TR432
   # optimized directly from provided model, using just undertaker.
   # helix constraints taken from the initial model and from try1-opt3
   # disulfide C79-C101
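The cut-and-paste chimera construction used above (e.g. chimera-try6-try5 taking I111-R130 from try5-opt3) can be sketched as a residue-range splice. The dict-of-residues representation here is a deliberate simplification of real PDB handling:

```python
def make_chimera(base, donor, start, end):
    """base, donor: {residue_number: residue_record}. Return a copy of
    base with residues start..end (inclusive) replaced by the donor's,
    where the donor has them. Splices like this often leave chain breaks
    at the junctions, which the subsequent optimization must repair."""
    out = dict(base)
    for r in range(start, end + 1):
        if r in donor:
            out[r] = donor[r]
    return out
```

The text notes that these splices are useful precisely because the grafted fragment is slightly incompatible with its new surroundings: the breaks and altered fragment shapes kick the optimizer out of its current local optimum.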