Wed Jun  7 09:28:24 PDT 2006
T0321
Make started Wed Jun  7 09:29:48 PDT 2006
Running on orcas.cse.ucsc.edu

Wed Jun  7 16:47:27 PDT 2006 Kevin Karplus

No good hits---ab initio or difficult fold recognition.

Conservation signals are strong in all three alignments, but not quite identical.
T06 and t04 look the same in sequence profile, but t2k has a few
different conserved residues.

Top alignments are similar, but all a bit fragmentary.  This is
probably an alpha/beta fold with the alpha helices on both sides of a
beta sheet.

I'm hoping that the sheet constraints from the alignments will help in
assembling the fragmentary alignments.


Wed Jun  7 21:05:28 PDT 2006 Kevin Karplus

We have a pretty good model after about V105 (or maybe K123).
Everything before V105 is pretty much junk, though.

Also, the two C-terminal strands probably belong with the rest of the sheet.

We should do subdomains.

I've started M1-K123 and V105-K251 on the farm cluster.

The model that we have seems to come mainly from 1vpdA, the
top-scoring model with the HMMs (though with E-value 0.05).


Mon Jun 25 15:29 PDT 2006 Zack Sanborn

I'm taking over this protein.  A soft deadline is this Wednesday.

Kevin's first thought was to break this protein up into subdomains around
the following amino acids:  Lys103 Gly122.  And, it looks like he already
did this with the subdomains: domain 1 = M1-K123 and domain 2 = V105-K251.
So, that's pretty cool.

I'll take a look at these subdomains and see if we can piece these together
somehow.


Mon Jun 25 17:00 PDT 2006 Zack Sanborn

I've started two unconstrained optimization runs on the two subdomains (both
called try2).  I've also started an optimization run on a chimera of the
two subdomains, that I made by superimposing the two subdomain PDBs on the
try1-opt2 model and then copying-and-pasting the two subdomains together.
(not sure if this would still be called a "chimera" in this case).

Started try2 for M1-K123 on orcas, and try2 for V105-K251 on camano.
 
Unfortunately, the superposition didn't line the two chains up very
well (there is a sizable gap between them).  I started another optimization
run to hopefully get these domains to stick together.  Not sure if this is
the right way of doing this, but I didn't see any other way.

Started try2 for the chimera on camano.

Mon Jun 25 18:04 PDT 2006 Zack Sanborn

Whoops, I'm using the sheet and helix constraints from the alignment, not
from the models.  I would stop the job (try2 at /T0321), but I've started
two jobs on camano, and both appear identical (because both are try2 jobs, one
on a subdomain and one on the full chimera), so I can't tell which one to stop.

So, I've made a try3 that is what try2 should have been and am running it on
lopez.  I've added the constraints from both subdomain models (both helix
and sheet constraints).  So, if try2 totally screws up, hopefully this will
produce a better model.


Mon Jun 25 21:53 PDT 2006 Zack Sanborn

The try2's for the subdomains are done and have improved the structure.  Also,
the chimera I made was optimized twice: once using the original cost function
(try2) and once with the helix and sheet constraints from the respective
subdomain models (try3).  try2-opt2 is the highest scoring model thus far.

I've started two more runs.  try4 starts with a new chimera, made the same way
as described above for try2, but using the newly optimized subdomain models.
try5 takes the helix and sheet constraints from these models and essentially
starts over to see if it comes up with a different or better structure. 

After these try's finish, I'd like to do a polishing run starting from all 
models.

Tue Jun 27 07:59:58 PDT 2006 Kevin Karplus

try4 is the best-scoring with the try4=try5 and unconstrained
costfcns, and is just behind try2 on the try1 costfcn.

I made a preliminary submission:

    Model 1 is try4-opt2, optimized from a chimera of try2 runs on the two
	    subdomains (not sure where the crossover point was taken).

    Model 2 is try2-opt2, an optimization from a chimera of try1 runs on
	    the two subdomains.

    Model 3 is try1-opt2, the fully automatic run.

    Model 4 is just sidechain replacement by SCWRL on an alignment to 1vpdA.

    Model 5 is just sidechain replacement by SCWRL on an alignment to 1vmeA.

The README file really needs more notes on chimeras---exactly where
was the crossover point?  (At least this README told me which models
were used for making the chimeras.)

Tue Jun 27 08:19:30 PDT 2006 Kevin Karplus

I think that I like the hairpin at the end of the model, but it is
getting lost in the most recent runs.  Perhaps we could restore it?


Tue Jun 27 11:20 PDT 2006 Zack Sanborn

Sorry Kevin, the crossover point was at Lys123 for both chimeras.  Next time, 
I'll be more explicit. 

I'll take a look at trying to get the hairpin you want back in the models 
tomorrow. 


Wed Jun 28 12:36:53 PDT 2006 Zack Sanborn

I'm attempting to get that hairpin back.  It was in the try2-opt2 model, but is in
neither the later chimeras nor the subdomain structures.

First, I've taken all of the helix and sheet constraints from try2-opt2 and have
restarted a structure from the alignments.  Since try2-opt2 was started from the chimera,
I'm hoping that restarting from the alignments but using the "good" constraints will
come up with a better structure.  This run is called try6 and was started on vashon.

Since try2-opt2 is also one of the best scoring models, I'm doing a polishing run
that starts from all models.  However, we want to keep the structures that have the
hairpin, so the cost function has the sheet constraint for the hairpin, but aside from
that is unconstrained.  I'm hoping this will get the best features of all the models while
keeping the hairpin.  This run is called try7 and was started on camano.


Mon Jul  3 13:56:13 PDT 2006 Zack Sanborn

try7 was successful and became the highest scoring model and kept the
beta hairpin we wanted.  A polishing run, try8, using an unconstrained
cost function was started.  This improved the score of the model a
bit.

Right now, I'm not sure what is left to do on this structure.  After
getting an email from Kevin about doing polishing runs from GROMACS
optimized structures (which are screwed up enough to possibly get
models out of a local minimum), I made a try9 that will do just this.
However, I'm not sure if it'll help or just completely screw up the
structure.  We'll see, I guess.

try9 was started (using an unconstrained costfcn) on orcas.  It is starting
from try7-opt2 and try8-opt2 GROMACS optimized structures only.  


Mon Jul  3 14:39:50 PDT 2006 Zack Sanborn

I was looking at how we got to try8-opt2 and found (through the try*.log's):

	try8-opt2 > try7-opt2 > try2-opt2 > try2-chimera
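
A small sketch of keeping this kind of ancestry in a form that can be
queried later.  The format of the try*.log files is not shown in these
notes, so the parent table below is typed in by hand from the chain
above; the names `parent` and `lineage` are made up for illustration.

```python
# Parent of each model, copied by hand from the chain in the notes.
# (Hypothetical table; nothing here reads the real try*.log files.)
parent = {
    "try8-opt2": "try7-opt2",
    "try7-opt2": "try2-opt2",
    "try2-opt2": "try2-chimera",
}

def lineage(model, parent):
    """Follow parent links from `model` back to its earliest ancestor."""
    chain = [model]
    seen = {model}
    while chain[-1] in parent:
        nxt = parent[chain[-1]]
        if nxt in seen:  # guard against an accidental cycle in the table
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain

print(" > ".join(lineage("try8-opt2", parent)))
```

Adding the crossover residue to each chimera entry in such a table would
also answer the "where was the crossover point?" question directly.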


Mon Jul  3 14:59:57 PDT 2006 Zack Sanborn

Apparently, doing the optimization run starting from GROMACS optimized 
structures was the right thing to do... however, I needed to increase
the weights for soft_clashes (from 20 to 50) and breaks (from 50 to 200).
So, I've started a try10 that increases these costs, but starts from
the try7-opt2 and try8-opt2 GROMACS structures like try9.
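
A rough sketch of why the weight change can alter which structure wins,
assuming the cost is a weighted sum of per-term costs and that lower is
better.  Only the two weights (20 to 50, 50 to 200) come from the notes;
the raw term values and model names are invented, and this is not
undertaker's actual costfcn machinery.

```python
# Hypothetical illustration: a cost function as a weighted sum of terms.
# Weights are from the notes; raw term values below are invented.

def total_cost(terms, weights):
    """Weighted sum over the terms named in `weights` (lower is better)."""
    return sum(weights[name] * terms[name] for name in weights)

try9_weights = {"soft_clashes": 20, "breaks": 50}
try10_weights = {"soft_clashes": 50, "breaks": 200}

# Two imaginary models: one with few clashes but big breaks, one the reverse.
models = {
    "compact_but_broken": {"soft_clashes": 1.0, "breaks": 2.0},
    "clean_backbone": {"soft_clashes": 5.5, "breaks": 0.5},
}

def best_model(weights):
    """Name of the model with the lowest weighted cost."""
    return min(models, key=lambda m: total_cost(models[m], weights))

print(best_model(try9_weights))   # old weights
print(best_model(try10_weights))  # increased clash and break weights
```

With these made-up numbers the old weights tolerate the broken backbone,
while the heavier break penalty flips the preference to the model with
the cleaner backbone.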

I decided to keep try9 running because it may be interesting to see
the effect that the different costs have on the structure.


Wed Jul  5 13:58:06 PDT 2006 Zack Sanborn

The new optimization runs (try9 and try10) have finished.  They are the
top scoring models, mostly because their breaks have been
significantly reduced.  The structures look good but are a little
"foamy".  So, I'm starting a new run (try11), starting from all structures
(though it will likely pick try9-opt2 or try10-opt2), that increases the
weights for phobic_fit and dry5.  Hopefully this will help pack the
protein a little better.

But, starting from the GROMACS structures when you have a lot of breaks
in the Undertaker structures does work in producing a better structure.


Wed Jul  5 18:06:05 PDT 2006 Zack Sanborn

It does appear that this run did help with the structure's packing.  
Currently, using the try11 costfcn, try11-opt2 is the best scoring 
model.  try11 is based off of try9-opt2, which is no surprise considering
try9-opt2 was the best scoring model up to that point.

I just started a new optimization run that is unconstrained.  This is a 
polishing run that will also allow the structure to expand if the 
penalties I put on it for try11 were too constrictive.  We'll see. 

I started try12 on orcas.


Thu Jul  6 16:17:42 PDT 2006 Zack Sanborn

Well, try12 chose try8-opt2 as the model to optimize, not try11.  This 
means that Undertaker prefers the try8 model as opposed to the try11
model that was packed a little better.

Looking at score-all.try12.pretty, we see that try11-opt2 is the fourth
best scoring model behind try12-opt2, try8-opt2, and try12-opt1.  They 
differ by little in overall score.  The big differences are in phobic_fit
(which try11-opt2 does better in) and in side_chain (which try12 and try8
do significantly better in).

Not terribly sure what to do next, but I'm going to update the best-models.pdb
file to see how similar the top scoring models are.

I put the following models in superimpose-best.under

ReadConformPDB T0321.try12-opt2.pdb  (model 1)
ReadConformPDB T0321.try8-opt2.pdb   (  "   2)
ReadConformPDB T0321.try11-opt2.pdb  (  "   3)
ReadConformPDB T0321.try9-opt2.pdb   (  "   4)
ReadConformPDB T0321.try4-opt2.pdb   (  "   5)

As expected, all models strongly agree on the first domain of the protein.  For
the other domain, models 1, 2, and 4 were very similar to one another.  This makes 
sense since try12 and try9 were both based on try8.  The try11 structure is 
significantly different in the second domain, which is a good thing since we 
don't want to submit the same structure 5 times.  The try4 model is also different
from the other models but shares many of the characteristics of the try12, try9, and
try8 models.  However, it has a long helix where there exists a nice beta-hairpin
in the other models.  The try4 model is the lowest scoring model of the group, but
not by much. 

I think I will try a polishing run from only try11-opt2 to see if that structure
can get any better.

try13, a polishing run for try11-opt2, was started on orcas.


Fri Jul  7 15:30:07 PDT 2006 Zack Sanborn

Well, try13 is currently the best scoring model, with an unconstrained costfcn.
The structure has an overall bend to it, inherited from the try11-opt2 model.  I didn't
notice it earlier, but I believe this bend was caused by some strong weights
put on try11 in phobic_fit and dry5 to try to pack the structure better.  But,
I'm glad to see that, with an unconstrained costfcn, try13-opt2 is the best 
scoring model.  

Now, I'm really not sure what to do.  I'll update the best-models.pdb for Kevin. 
He might have an idea what to do, if anything, before we submit tomorrow 
(deadline is Sunday, July 9th).

Actually, I checked the all.breaks.gz file and there are three sizeable breaks:

        T0321.try13-opt2.pdb.gz breaks before (T0321)P207 with cost 2.71589
        T0321.try13-opt2.pdb.gz breaks before (T0321)I141 with cost 1.32544
        T0321.try13-opt2.pdb.gz breaks before (T0321)E167 with cost 1.27245
 
 
that we could try getting rid of.  All other breaks are pretty minimal (i.e. 
less than 1).
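
A minimal sketch of pulling the large breaks out of a listing like the
one above.  It assumes the lines look exactly like the three quoted
here; how all.breaks.gz is laid out beyond that is not shown in these
notes, and the function names are made up.

```python
import re

# Matches lines such as:
#   T0321.try13-opt2.pdb.gz breaks before (T0321)P207 with cost 2.71589
BREAK_RE = re.compile(
    r"^\s*(?P<model>\S+) breaks before \((?P<target>\w+)\)"
    r"(?P<res>[A-Z])(?P<pos>\d+) with cost (?P<cost>[0-9.]+)\s*$"
)

def parse_breaks(text):
    """Return a list of (model, residue, position, cost) tuples."""
    found = []
    for line in text.splitlines():
        m = BREAK_RE.match(line)
        if m:
            found.append((m["model"], m["res"], int(m["pos"]), float(m["cost"])))
    return found

def large_breaks(breaks, threshold=1.0):
    """Keep breaks whose cost exceeds `threshold`, worst first."""
    return sorted((b for b in breaks if b[3] > threshold), key=lambda b: -b[3])

sample = """\
        T0321.try13-opt2.pdb.gz breaks before (T0321)P207 with cost 2.71589
        T0321.try13-opt2.pdb.gz breaks before (T0321)I141 with cost 1.32544
        T0321.try13-opt2.pdb.gz breaks before (T0321)E167 with cost 1.27245
"""

for model, res, pos, cost in large_breaks(parse_breaks(sample)):
    print(f"{res}{pos}\t{cost:.3f}")
```

The threshold of 1 matches the "less than 1" cutoff used above; on a
real file the text would first be read with gzip.open(path, "rt").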

I started a new run, try14, which will try to minimize these breaks.  I've upped
the penalty for breaks, gaps, etc.

Sat Jul  8 10:13:17 PDT 2006 Kevin Karplus

try14-opt2 still has some bad breaks:
Conformation[31] T0321.try14-opt2.pdb.gz has 43 breaks
	T0321.try14-opt2.pdb.gz breaks before (T0321)E167 with cost 1.20615
	T0321.try14-opt2.pdb.gz breaks before (T0321)I141 with cost 0.914793
	T0321.try14-opt2.pdb.gz breaks before (T0321)C175 with cost 0.720551

gromacs closed some of the littler gaps, but opened up new bigger ones
to avoid clashes:
Conformation[30] T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz has 19 breaks
	T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)C142 with cost 1.8626
	T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)A176 with cost 1.24646
	T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)E167 with cost 1.18682
	T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)P207 with cost 1.06357
	T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)I141 with cost 1.04853
	T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)F220 with cost 1.03101
	T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)Q90 with cost 0.712483
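
A sketch of lining the two listings up residue by residue, to see which
breaks grew and which shrank.  It assumes the line format quoted above.
Since only the worst breaks of each conformation are listed (out of 43
and 19), a residue missing from one list may just be below the cutoff
there, not closed.  The function names are made up.

```python
import re

BREAK_RE = re.compile(r"breaks before \(\w+\)([A-Z]\d+) with cost ([0-9.]+)")

def break_costs(lines):
    """Map a residue label such as 'E167' to its break cost."""
    costs = {}
    for line in lines:
        m = BREAK_RE.search(line)
        if m:
            costs[m.group(1)] = float(m.group(2))
    return costs

def compare(before, after):
    """Sort residues into worse / better / listed on one side only."""
    def by_position(label):
        return int(label[1:])
    shared = set(before) & set(after)
    return {
        "worse": sorted((r for r in shared if after[r] > before[r]), key=by_position),
        "better": sorted((r for r in shared if after[r] < before[r]), key=by_position),
        "only_before": sorted(set(before) - shared, key=by_position),
        "only_after": sorted(set(after) - shared, key=by_position),
    }

undertaker = break_costs([
    "T0321.try14-opt2.pdb.gz breaks before (T0321)E167 with cost 1.20615",
    "T0321.try14-opt2.pdb.gz breaks before (T0321)I141 with cost 0.914793",
    "T0321.try14-opt2.pdb.gz breaks before (T0321)C175 with cost 0.720551",
])
prefix = "T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz breaks before (T0321)"
gromacs = break_costs([
    prefix + "C142 with cost 1.8626",
    prefix + "A176 with cost 1.24646",
    prefix + "E167 with cost 1.18682",
    prefix + "P207 with cost 1.06357",
    prefix + "I141 with cost 1.04853",
    prefix + "F220 with cost 1.03101",
    prefix + "Q90 with cost 0.712483",
])

for kind, residues in compare(undertaker, gromacs).items():
    print(kind, residues)
```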

I'm still not real pleased with domain 1---we might have done better to
generate more models for that domain.  Our structure doesn't agree
with secondary structure prediction (using either the whole-chain
alignments or the domain 1 alignments).  The rr predictions for domain
1 are adequately matched, but not great.  The conserved residues from
M1-K123/ are not clustered. 

try14-opt2 does look like the best we've got, but I think that
try11-opt2 may be too close to be an alternative model.  Similarly
try12 and try9 may be too close to each other. 

(Rosetta likes best decoys/T0321.try14-opt2.gromacs0.repack-nonPC.pdb.gz )

I'll do one more polishing run, starting from the gromacs0.repack... models.
I'll turn up the packing terms to try to make things a little tighter.
I'll also up clashes and breaks, but remove deep_knot (which is too
slow for use in the optimization).

Try14-opt2, try12-opt2, and try4-opt2 look like good choices to submit.

I wonder what other 2 models I should submit?  Polished single-domains?
Polished server models? (The Robetta models score best.)

Sat Jul  8 10:37:48 PDT 2006 Kevin Karplus

try15-opt2 started on cheep, trying to polish from the
gromacs0.repack-nonPC models.


Sat Jul  8 10:55:09 PDT 2006 Kevin Karplus

V105-K251/ try3 started on lopez to polish single-domain model.

Sat Jul  8 11:03:39 PDT 2006 Kevin Karplus

M1-K123/ try3 started on lopez to polish single-domain model.

Sat Jul  8 11:16:24 PDT 2006 Kevin Karplus

The ROBETTA_TS5 model matches secondary structure well for 1-103, but
is a bit foamy and unconvincing.  Still, we might do well to make a
chimera of it and one of our own models---perhaps optimizing it just
as a single domain to save some time.

Sat Jul  8 11:34:29 PDT 2006 Kevin Karplus

I made M1-K123/decoys/chimera-robetta5-try14.pdb.gz from
  M1-N110 from decoys/servers/ROBETTA_TS5.pdb
  D111-K123 from decoys/T0321.try14-opt2.pdb

I'll try optimizing it with the M1-K123 try1 costfcn (it matches the
constraints better than anything else we have).
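
A minimal sketch of the splice itself, for the record: take residues up
to the crossover from one model and the rest from the other.  It
assumes both models are already superimposed and share residue
numbering, and it reads the residue number from the standard PDB
fixed columns 23-26.  The notes don't say which tool made the real
chimera; the toy models, coordinates, and function names below are
invented.

```python
def atom_line(serial, name, res_name, res_seq, x, y, z, chain="A"):
    """Format a minimal fixed-column PDB ATOM record (no occupancy/B-factor)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

def residue_number(line):
    """Residue sequence number from PDB columns 23-26."""
    return int(line[22:26])

def splice(first_model, second_model, last_from_first):
    """Residues <= last_from_first from the first model, the rest from the second.

    Atom serial numbers are not renumbered, and nothing is done about the
    breaks or clashes at the junction; that is left to later optimization.
    """
    def atoms(lines):
        return [ln for ln in lines if ln.startswith("ATOM")]
    head = [ln for ln in atoms(first_model) if residue_number(ln) <= last_from_first]
    tail = [ln for ln in atoms(second_model) if residue_number(ln) > last_from_first]
    return head + tail

# Toy stand-ins for the two source models: CA atoms only, invented
# coordinates and residue names, residues 108-113.
server_model = [atom_line(i, " CA", "GLY", r, 1.0, 0.0, float(r))
                for i, r in enumerate(range(108, 114), 1)]
our_model = [atom_line(i, " CA", "GLY", r, 2.0, 0.0, float(r))
             for i, r in enumerate(range(108, 114), 1)]

# Crossover after residue 110, as in M1-N110 / D111-K123.
chimera = splice(server_model, our_model, last_from_first=110)
for ln in chimera:
    print(ln)
```

Writing the crossover residue into the output filename or the README at
the same time would avoid having to reconstruct it later.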

Sat Jul  8 12:26:15 PDT 2006 Kevin Karplus

try15-opt2 has smaller clashes and breaks than try14-opt2, but
undertaker doesn't like it quite as well with the try15
costfcn---packing terms and hbonds are worse.

Rosetta likes best T0321.try15-opt2.gromacs0.repack-nonPC.pdb.gz

Sat Jul  8 12:36:26 PDT 2006 Kevin Karplus

M1-K123/ try3 and try4 have finished.
V105-K251/try3 has also finished.

I'll put them all in superimpose-best.under, look at them and decide
which (if any) to submit.

For M1-K123, try1, try2, and try3 are quite similar.  I'd favor the
smaller breaks and clashes of try3-opt2.  try4-opt2 scores very well
with M1-K123/try3.costfcn, despite not having been optimized for it.

I'll do a polishing run on M1-K123/try4 (starting from the gromacs
models), using the try3 costfcn.

Sat Jul  8 12:49:33 PDT 2006 Kevin Karplus

M1-K123/ try5 started on cheep.

Sat Jul  8 12:51:25 PDT 2006 Kevin Karplus

The V105-K251/ tries 1-3 are all more or less the same.  They all have
the C-terminal helix rather than beta hairpin, so are in the style of
try4-opt2, not the more recent runs.

I don't see much point to submitting the V105-K251 models separately.

Sat Jul  8 13:03:29 PDT 2006 Kevin Karplus

try14/try15 and try12 differ mainly in the placement of the first domain.

try12 and try4 differ mainly in the C-terminal hairpin/helix.

Sat Jul  8 13:08:23 PDT 2006 Kevin Karplus

I'll have a hard time making a chimera of try15-opt2 and M1-K123/try4
(or try5), because there isn't much in common between them, so getting
the domains oriented reasonably will be tough.

Sat Jul  8 13:20:36 PDT 2006 Kevin Karplus

I take that back---we can superimpose fairly well on V91-V97 and
crossover between P89 and Q90.

Sat Jul  8 13:24:23 PDT 2006 Kevin Karplus

For M1-K123 undertaker now likes best try5-opt2, but rosetta still prefers
decoys/T0321.try3-opt2.gromacs0.repack-nonPC.pdb.gz 

Sat Jul  8 13:29:02 PDT 2006 Kevin Karplus

I made a chimera decoys/chimera-try15-domain1-try5.models.pdb.gz
from M1-P89  of M1-K123/try5-opt2 (optimized from ROBETTA_TS5).
It has some bad breaks and clashes (as would be expected from a
chimera), so I'll try optimizing it.

Sat Jul  8 13:33:14 PDT 2006 Kevin Karplus

try16 started on cheep, to optimize the try15/robetta5 chimera.


Sat Jul  8 15:15:29 PDT 2006 Zack Sanborn

I've looked at the following server models to try to find other 
possibilities for the first domain:

	ROBETTA_TS5.pdb (best scoring, Kevin used it <see above>)
	ROBETTA_TS4.pdb
	ROBETTA_TS3.pdb
	ROBETTA_TS2.pdb
	FUGMOD_TS5.pdb
	PROTINFO_TS5.pdb

Only the ROBETTA models appear to have two domains in their models.
It would be impossible to make any chimeras from the unidomain models
FUGMOD_TS5 and PROTINFO_TS5.  The other ROBETTA models (TS2 -- TS4)
have two domains, but I see nothing "better" about the first domains
in any of these three models compared to the best scoring server model
ROBETTA_TS5.  Actually, most of the first domains in these models appear
loosely packed and disordered, especially with model ROBETTA_TS3. 
So, I think Kevin got it right the first time by choosing ROBETTA_TS5.

As I see it, the following models will be good ones to submit:

	try16-opt2 (depending on how it turns out)
	try15-opt2
	try12-opt2
	try4-opt2
	try11-opt2 (iffy, similar to try14/try15)

	or something based off an alignment?
 

Sat Jul  8 15:34:29 PDT 2006 Zack Sanborn

try16 has completed and try16-opt2 is the second best scoring model
using the try16.costfcn.  It is beaten by try14-opt2.  try16-opt2 appears
to do better with soft_clashes and breaks than try14-opt2, but doesn't
do as well with dry5 and dry6.5 (packing) and hbond_geom_beta* terms.

try15-opt2 scores below the try14 models.  So, maybe we should submit
try14-opt2 and try15-opt2?

Sat Jul  8 15:44:50 PDT 2006 Kevin Karplus

try14 and try15 are very similar to each other, so there is no need to
submit both.  Choosing between them depends on which parts of the cost
function you want to weight highest.

I liked try15-opt2 a bit better, though I'm not sure I can articulate why.
(Rosetta also liked it better, because of decreased clashes.)

try16-opt2 scores almost as well as try14-opt2, despite less polishing.

I don't like the disulfide bridge in try16-opt2---it seems unlikely
given the unpaired cysteines in the rest of the protein.  The cost
function should include maybe_metal, but not maybe_ssbond.

Sat Jul  8 16:16:50 PDT 2006 Kevin Karplus

I'll try optimizing try16 with maybe_metal turned on, but not maybe_ssbond.
I also added in some constraints (from try16-opt2.sheets and
try16-opt2.helices) to keep the character the same, but to encourage
improving the sheets and helices.

try17 started on cheep.

Sat Jul  8 16:56:44 PDT 2006 Kevin Karplus

try17 is making minor improvements, but is not getting rid of the disulfide.

Sat Jul  8 17:27:34 PDT 2006 Kevin Karplus

rosetta doesn't even think that try17 improved over try16, as clashes
went up a little.

Sat Jul  8 17:35:29 PDT 2006 Kevin Karplus

I'll submit
	try14-opt2
	try15-opt2
	try17-opt2
	try12-opt2
	try4-opt2

unless someone suggests a replacement for try15, which is too similar
to try14.


Sat Jul  8 17:47:34 PDT 2006 Kevin Karplus

So submitted.