The SAM-T06 hand predictions use methods similar to SAM_T04 in CASP6 and the SAM-T02 method in CASP5.

We start with a fully automated method (implemented as the SAM_T06 server):

Use the SAM-T2K and SAM-T04 methods for finding homologs of the target and aligning them. The hand method also uses the experimental new SAM-T06 alignment method, which we hope is both more sensitive and less prone to contamination by unrelated sequences.

Make local structure predictions using neural nets and the multiple alignments. We currently use 8 local-structure alphabets:

  DSSP
  STRIDE
  STR2              an extended version of DSSP that splits the beta strands into multiple classes (parallel/antiparallel/mixed, edge/center)
  ALPHA             a discretization of the alpha torsion angle: CA(i-1), CA(i), CA(i+1), CA(i+2)
  BYS               a discretization of Ramachandran plots, due to Bystroff
  CB_burial_14_7    a 7-state discretization of the number of C_beta atoms in a 14 Angstrom radius sphere around the C_beta
  near-backbone-11  an 11-state discretization of the number of residues (represented by near-backbone points) in a 9.65 Angstrom radius sphere around the sidechain proxy spot for the residue
  DSSP_EHL2         CASP's collapse of the DSSP alphabet

DSSP_EHL2 is not predicted directly by a neural net, but is computed as a weighted average of the other backbone alphabet predictions. We hope to add more networks for other alphabets over the summer.

We make 2-track HMMs with each alphabet (1.0 amino acid + 0.3 local structure) and use them to score a template library of about 8000 (t06), 10000 (t04), or 15000 (t2k) templates. The template libraries are expanded weekly, but old template HMMs are not rebuilt. We also used a single-track HMM to score not just the template library, but a non-redundant copy of the entire PDB. One-track HMMs built from the template library multiple alignments were used to score the target sequence.
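As an illustration of the 2-track weighting mentioned above, here is a minimal sketch that assumes "1.0 amino acid + 0.3 local structure" means the per-track log-probabilities are summed with those weights. That reading, the function names, and the toy probability tables are assumptions made for illustration. The sketch covers match-state emissions only and ignores transitions, insert and delete states, and any null model, so it is not the SAM implementation.

```python
import math

# Track weights taken from the text: 1.0 for amino acids, 0.3 for local structure.
AA_WEIGHT = 1.0
SS_WEIGHT = 0.3

def two_track_log_emission(aa_prob, ss_prob,
                           aa_weight=AA_WEIGHT, ss_weight=SS_WEIGHT):
    """Weighted log-probability of emitting one (residue, structure letter) pair.

    Assumption: the tracks are combined as a weighted sum of log-probabilities.
    """
    return aa_weight * math.log(aa_prob) + ss_weight * math.log(ss_prob)

def score_path(states, residues, ss_letters):
    """Sum the weighted emission scores along an ungapped run of match states.

    `states` is a list of (aa_distribution, ss_distribution) pairs, where each
    distribution is a dict mapping a letter to its probability (hypothetical
    layout chosen for this sketch).
    """
    total = 0.0
    for (aa_dist, ss_dist), aa, ss in zip(states, residues, ss_letters):
        total += two_track_log_emission(aa_dist[aa], ss_dist[ss])
    return total

# Toy example with made-up probabilities (not real model parameters).
toy_states = [
    ({"A": 0.5, "G": 0.5}, {"H": 0.8, "E": 0.2}),
    ({"A": 0.1, "G": 0.9}, {"H": 0.3, "E": 0.7}),
]
toy_score = score_path(toy_states, "AG", "HE")
```

With a weight of 0.3, a disagreement in the local-structure track costs less than an equally improbable amino-acid mismatch, so the sequence signal dominates while the predicted local structure still contributes.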
All the logs of e-values were combined in a weighted average (with rather arbitrary weights, since we still have not taken the time to optimize them), and the best templates were ranked. Alignments of the target to the top templates were made using several different alignment methods (mainly using the SAM hmmscore program, but a few alignments were made with Bob Edgar's MUSCLE profile-profile aligner).

Generate fragments (short 9-residue alignments for each position) using SAM's "fragfinder" program and the 3-track HMM that tested best for alignment.

Residue-residue contact predictions are made using mutual information, pairwise contact potentials, joint entropy, and other signals combined by a neural net. The contact prediction method is expected to evolve over the summer, as new features are selected and new networks trained.

Then the "undertaker" program (so named because it optimizes burial) is used to try to combine the alignments and the fragments into a consistent 3D model. No single alignment or parent template was used as a frozen core, though in many cases one had much more influence than the others. The alignment scores were not passed to undertaker, but were used only to pick the set of alignments and fragments that undertaker would see. Helix and strand constraints generated from the secondary-structure predictions are passed to undertaker to use in the cost function, as are the residue-residue contact predictions. One important change in this server over previous methods is that sheet constraints are extracted from the top few alignments and passed to undertaker.

After the automatic prediction is done, we examine it by hand and try to fix any flaws that we see. This generally involves rerunning undertaker with new cost functions, increasing the weights for features we want to see and decreasing the weights where we think the optimization has gone overboard.
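The effect of adjusting weights by hand can be illustrated with a small sketch. It assumes the cost function is a weighted sum of component terms, which the text implies but does not state. The component names, values, and helper functions are hypothetical and are not taken from undertaker.

```python
def total_cost(components, weights):
    """Weighted sum of cost components; lower is better (assumed form)."""
    return sum(weights[name] * value for name, value in components.items())

def reweight(weights, factors):
    """Return a copy of `weights` with the named terms multiplied by a factor."""
    adjusted = dict(weights)
    for name, factor in factors.items():
        adjusted[name] = adjusted[name] * factor
    return adjusted

# Two hypothetical candidate models with made-up component costs.
model_a = {"burial": 2.0, "sheet_constraints": 5.0}  # well buried, sheets violated
model_b = {"burial": 4.0, "sheet_constraints": 1.0}  # less buried, sheets satisfied

weights = {"burial": 2.0, "sheet_constraints": 0.5}
cost_a = total_cost(model_a, weights)   # 2.0*2.0 + 0.5*5.0 = 6.5
cost_b = total_cost(model_b, weights)   # 2.0*4.0 + 0.5*1.0 = 8.5

# Raising the weight on sheet constraints changes which model is preferred.
new_weights = reweight(weights, {"sheet_constraints": 4.0})
new_cost_a = total_cost(model_a, new_weights)   # 2.0*2.0 + 2.0*5.0 = 14.0
new_cost_b = total_cost(model_b, new_weights)   # 2.0*4.0 + 2.0*1.0 = 10.0
```

In this toy case the burial term dominates under the original weights and favors model_a. After the sheet-constraint weight is raised fourfold, model_b has the lower cost, which is the kind of shift a hand adjustment is meant to produce.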
Sometimes we will add new templates or remove ones that we think are misleading the optimization process.

Target T0303 was a fairly straightforward comparative modeling target, with even BLAST able to find 2ah5A as a good template. Despite this, our automatic prediction picked a different template for the inserted domain N17-R95. It also rather mangled the N17-R95 domain in attempting to close gaps.

We noticed that other servers followed 2ah5A very closely, even though the alignment was not as good in this domain as in the outer domain. We optimized the models obtained from the servers, to see whether they were doing better at finding a good alignment to a template. The final optimization try3-opt2 was derived primarily from Pmodeller6_TS1, though ROBETTA_TS3 and ROBETTA_TS4 also did well in the early stages of the optimization.

We also reoptimized from the alignments, with some extra constraints (taken from the alignment to 2ah5A) to keep the helices of the N17-R95 domain packed. (This was try4-opt2.)

We decided to optimize N17-R95 separately, then paste it back into the whole-chain models (both from the servers and from our alignments).

Note: G197-P209 appears to be incorrectly placed in try4-opt2 and in try10-opt2, which was derived from it. The models from the servers seem to have done a better job here.

To keep track of the history:

  server models => try3 => chimera1 => try5,try7 => try9
  alignments => try4 => chimera2 => try6,try8 => try10

Model 1 is try9-opt2, which is an optimization of the outer domain from the servers with the N17-R95 subdomain pasted in.
Model 2 is try10-opt2, which is based entirely on models we generated, with the N17-R95 subdomain pasted into try4-opt2 and reoptimized.
Model 3 is try3-opt2, reoptimized from the server models.
Model 4 is try4-opt2, created from the alignments with some extra constraints to keep the N17-R95 domain from unfolding.
Model 5 is just SCWRL-based sidechain replacement on an alignment to 2ah5A.