Tue May 16 08:56:43 PDT 2006
T0289
Make started Tue May 16 08:57:48 PDT 2006
Running on shaw
Make started Tue May 16 09:08:24 PDT 2006
Running on shaw

Tue May 16 09:28:59 PDT 2006 Kevin Karplus

Had to restart, because the initial sequence downloaded from the web
site had a formatting error (an extra space after the '>')
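
A minimal sketch of how this kind of header error could be repaired before
starting make.  It is not part of the actual pipeline; the function names and
the example record are made up for illustration.

```python
def normalize_fasta_header(line):
    """Remove whitespace between '>' and the ID on a FASTA header line."""
    if line.startswith(">"):
        return ">" + line[1:].lstrip()
    return line


def normalize_fasta(text):
    """Apply the header fix to every line of a FASTA file's text."""
    return "\n".join(normalize_fasta_header(l) for l in text.split("\n"))


# Hypothetical record: the header text and sequence are invented.
broken = "> T0289 example header\nMSEQ\n"
fixed = normalize_fasta(broken)
print(fixed)
```

Sequence lines are passed through untouched, so the fix is safe to run on a
file that is already well formed.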

This seems to be a comparative model, as the t06 alignment has 12 pdb
sequences in it.

Tue May 16 10:22:40 PDT 2006 Kevin Karplus

BLAST at NCBI (http://www.ncbi.nlm.nih.gov/BLAST) reports this as
being in pfam04952 (AstE_AspA):

    AstE_AspA
    
    Succinylglutamate desuccinylase / Aspartoacylase family. This
    family includes Succinylglutamate desuccinylase EC:3.1.-.- that
    catalyses the fifth and last step in arginine catabolism by the
    arginine succinyltransferase pathway. The family also include
    aspartoacylase EC:3.5.1.15 which cleaves acylaspartate into a
    fatty acid and aspartate. Mutations in human Aspartoacylase lead
    to Canavan disease. This family is probably structurally related
    to pfam00246 (Bateman A pers. obs.).

but provides no strong matches (best E-value 0.86)

The t04 multiple alignment has 9 pdb sequences, and the t2k one has none,
so this may qualify as fold-recognition, rather than comparative
modeling---the homologs are a bit distant, so alignment will matter a lot.

Top hits with the t06 and t04 w0.5 target models include
	1h8lA
	1uwyA
	1yw6A
	2g9dA
	1yw4A
	2bcoA
	1cpb
	...
The t2k alignment, having no pdb sequences in the multiple alignment,
finds fewer, with top hits
	1yw6A	not in template lib
	2g9dA	not in template lib
	2bcoA
	1yw4A
and a huge increase in e-value before
	1h8lA
	1uwyA

Unfortunately 2bcoA and 1yw4A are not in SCOP, so I can't verify that
they are the same SCOP classification as the others, but VAST has the
most similar structures to 2bcoA being 2bcoB, 1yw6A, 1yw4A, 1jqgA,
1zg7A, 1dtdA, ...  Both 1h8lA and 1uwyA have very similar structures
(p-values less than 1e-8).

The 2-track HMMs from t2k favor 2bcoA and 1yw4A highly, with a large
increase in Evalue before 1h8lA, 1qmuA, 1jqgA, 2bo9A, ...

Make started Thu May 18 15:14:25 PDT 2006
Running on lopez.cse.ucsc.edu

The initial run died in the power failure, so I cleaned up the junk
and restarted make.

Thu May 18 15:19:13 PDT 2006 Kevin Karplus

The top hits seem to be 1dtdA and 1m41A, which were not the top hits
with the w0.5 models.  It might be worth doing a BLAST search of PDB
and recording the top few hits.

Thu May 18 18:10:42 PDT 2006 Kevin Karplus

This is the one target for which the blastall crashes.  We need to
install a more recent version of NCBI-blast.

Make started Thu May 18 23:56:52 PDT 2006
Running on cheep.cse.ucsc.edu

The remake that I started on lopez died, because there was a gzipped
empty Template.atoms file.  I'll try running the make again.

Fri May 19 07:57:05 PDT 2006 Kevin Karplus

The try1-opt2 model looks pretty good.
There is an exposed hydrophobic patch that looks like a dimerization
interface at L45-T52.  We may want to pick up the full biological unit
from the homologs and superimpose on that to optimize as a dimer.

There is some confusion about the secondary structure from E284 on
(str2 disagrees with other predictors).  It might be worthwhile to do
a subdomain for P215 to the end, to see if we can pick up any more
signal without the first part.

Fri May 19 11:55:04 PDT 2006 Grant Thiltgen

I checked the SCOP domain and fold for the two top hits for fold
recognition (c.56.5.1), and it has a core of mixed sheets, which the
try1-opt2 model appears to match well.  

I looked at some of the matching PDB files for fold recognition and
the top two matches for the blast hits.  The best score for the fold
recognition (1dtdA) appears to match extremely well.  The second hit 
(1m41A) appears to be a barrel.  The best score for the BLAST search 
(2g9dA) actually appears to match structure with the try1-opt2 and with
the top hit for fold recognition.  I'm a bit concerned with the sheets
trying to fold into a barrel at the end, and one or the other may be 
a construct of the two different fold recognition results.

I also checked the PDB file for the best fold recognition hit, and I 
can't really tell, but I think it might be a multimer of some kind, but 
I need to check further into it.

Fri May 19 14:35:09 PDT 2006 Grant Thiltgen

Oops!  I need to tell the difference between a "1" and an "l".  So the
folds of the two fold recognition hits are the same, but I still don't 
really like the end, which is probably a separate domain.  

Sat May 20 08:27:11 PDT 2006 Kevin Karplus

It looks like Grant started doing subdomain predictions yesterday
afternoon (16:06) for S1-F214 and P215-H312, but the P215-H312 one
seems to have crashed before writing out the models, and Grant did not
make the directory writable, so no one else can fix it.  

He also did not direct the output of the make to a log file, so we
can't see, for example, why the rr files are missing.
Oops---yes we can.  It looks like the subdomain make is creating
Makefiles that point to the pce/starter-directory instead of the
casp7/starter-directory, so are getting the slightly obsolete Make.main.

I'll have to fix that.

Sat May 20 08:45:42 PDT 2006 Kevin Karplus

OK, I fixed the Makefile and the casp7/scripts/split-into-domains to
use the new Make.main for subdomains.  We have to be careful, as some
of the old Makefiles still point to the old split-into-domains script.

I moved Grant's efforts to Grant-S1-F214 and Grant-P215-H312 and
started new S1-F214 and P215-H312.  (Grant, you can delete your
directories if you want--no one else can as long as they are not
group-writable.) 

Sat May 20 12:38:33 PDT 2006 Kevin Karplus

For P215-H312, 2bcoA is coming up as the top hit, but with an E-value
of 7.5. None of the hits are labeled as containing c.56.5.1 (the
N-terminal domain).

For S1-F214, 1dtdA and 1m4lA are the top two hits, followed by
2bcoA, 1yw4A, 2b09A, 1h8lA, ... .  E-values start around 5.7e-18

The only intersection between the best hits list for the two domains
is 2bcoA, which scores well with both.
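
A small sketch of the comparison described here, assuming the hit lists are
available as plain lists of chain IDs.  The S1-F214 list is the one quoted
above; only 2bcoA is named for P215-H312, so "xxxxA" is a placeholder for the
unlisted remainder, not a real hit.

```python
def shared_hits(first, second):
    """Return IDs present in both hit lists, in the order they appear in `first`."""
    second_set = set(second)
    return [hit for hit in first if hit in second_set]


s1_f214_hits = ["1dtdA", "1m4lA", "2bcoA", "1yw4A", "2b09A", "1h8lA"]
# Only 2bcoA is named in the notes; "xxxxA" is a placeholder, not a real hit.
p215_h312_hits = ["2bcoA", "xxxxA"]

common = shared_hits(s1_f214_hits, p215_h312_hits)
print(common)
```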

It looks like 2bcoA is the right template for the whole protein, but
1dtdA and 1m4lA are better for the N-terminal domain.

Sat May 20 15:32:56 PDT 2006 Kevin Karplus

The P215-H312 try1 run did *not* select the sandwich from 2bcoA.
Instead it seems to have picked a barrel from 1c9oA.

It may be worthwhile to make a chimera with this C-terminal domain and
a good N-terminal model, though I suspect that the 2bcoA template is a
better bet.

Sat May 20 19:01:40 PDT 2006 Kevin Karplus

Looking at the superposition of the best alignments with try1-opt2, it
looks like 2bcoA and 1yw4A have the same C-terminal domain.  The
alignments look better than try1-opt2 in this region.  We may be
getting some messing up from 1h8lA, which has a somewhat different
C-terminal domain in a different place.

Mon May 22 14:51:15 PDT 2006 Grant Thiltgen

I started a run of the whole protein excluding the all-align.a2m
file and using the alignments from only 2bcoA to see if we could
force the protein to follow that template.  

[This must be try2---KJK] 

I have it automatically set up to make things that I create group 
readable, but not writeable.  Is it best to run fixmode when I'm 
finished with a run, or is the makefile supposed to do that when 
it's finished?  Since my files didn't finish running, did that part
of the makefile not run?

I also started a run for P215-H312 using the alignment from only
2bcoA.  I thought I should run it to check if a chimera would be 
a better try than just the whole protein.

[This must be P215-H312/try2---KJK]

Mon May 22 21:47:22 PDT 2006 Kevin Karplus

I picked up all the server results for this target and am scoring the
server predictions with the try1 costfcn.  (I first modified the try1
costfcn to have missing_atoms 1 as a component of the cost.)

There is a fixmode run at the end of the default make, but if a job
terminates early it might not get run, so a manual "fixmode ." is
useful in such cases.  There is probably not a fixmode for specific
makes of non-default targets.
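
The contents of the local fixmode script are not shown in these notes.  The
sketch below only illustrates the effect assumed here (making everything under
a directory group-readable and group-writable) and is not a replacement for
the real script; the function name is made up.

```python
import os
import stat


def make_group_writable(root):
    """Add group read/write (plus group execute on directories) below `root`.

    Returns the number of paths whose mode was changed.
    """
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, n) for n in dirnames + filenames)

    changed = 0
    for path in paths:
        if os.path.islink(path):
            continue  # leave symbolic links alone
        mode = stat.S_IMODE(os.stat(path).st_mode)
        extra = stat.S_IRGRP | stat.S_IWGRP
        if os.path.isdir(path):
            extra |= stat.S_IXGRP  # directories must be searchable
        if mode | extra != mode:
            os.chmod(path, mode | extra)
            changed += 1
    return changed
```

Calling make_group_writable(".") in a run directory would play the role of
the manual "fixmode ." after a job that terminated early.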

Tue May 23 14:53:52 PDT 2006 Grant Thiltgen

try2-opt2 is slightly better than try2-opt1.  Although for the whole
protein, the logfile wasn't gzipped and there were only the try2-opt1 and
try2-opt2 pdb files, and they also weren't gzipped, so I think the
process crashed somewhere along the way, and I don't know if it will 
rerun from where I left off.  

Also, I found the server predictions, but there doesn't seem to be any 
scoring of them, so I don't know if that crashed somewhere as well.

try2-opt2 is looking at using only alignments from 2bcoA.  

Only looking at P215-H312, try1-opt2 scores better than try2-opt2 for 
this region of the protein.  I'm not sure if that means a chimera might
be better.  I suppose I should try running the first part of the protein
again with just the template from 2bcoA to make sure that combining the 
two might not be better.

I'm going to try another run including 2bcoA and 1yw4A to see if 
maybe it can clean it up a bit.
[This must be try3---KJK]

I also googled Succinylglutamate desuccinylase and it does appear to 
be a dimer, which may mean we need to run it as a dimer next, just to
make sure we have things working right.

Tue May 23 16:42:37 PDT 2006 Grant Thiltgen

GAH!  I'm not sure why, but both the full protein and the first segment
are having difficulties using the alignments from 2bcoA and both 2bcoA
and 1yw4A.  There's an error in the makefile and all the output explodes
to the terminal.

Tue May 23 21:29:05 PDT 2006 Kevin Karplus

I'm remaking decoys/score-all+servers.try1.pretty
I believe it did crash when I tried running it before, but I've made
some fixes to undertaker, so it might be worth trying again.

I'm not sure what Grant means by "the full protein ... are having
difficulties using the alignments ...".  Is there an error message in
the log file, or is it just that undertaker prefers some other alignment?

The correct way to run an optimization run (in either this directory
or a subdomain) is

	(make -k T0289.do2 >& do2.log; gzip -9f do2.log) &

(where the 2 is replaced by the number of the try*.costfcn).

Tue May 23 22:03:31 PDT 2006 Kevin Karplus

No, undertaker is still crashing when trying to read in the SCWRLed
results from UNI-EID_expm_TS1  with messages:
# Trying to read SCWRLed conformation from /var/tmp/from_scwrl_543885382.pdb
undertaker: Segment.cc:95: int Segment::OK() const: Assertion `C_atoms[1] != C_atoms[2]' failed.

I'll have to try debugging that later.  Tonight I have to pack for my
trip to LA.

Wed May 24 14:10:43 PDT 2006 Grant Thiltgen

Okay.  I started to re-run some of the undertaker runs that crashed.  
I think they might be working okay now, but only time will tell!

Thu May 25 13:50:24 PDT 2006 Grant Thiltgen

Everything looks like it ran okay.

For the full protein try2-opt2 still does better than try3-opt2.  
try1-opt2 still does best for P215-H312.  

I'm gonna mess around with dimerizing try2-opt2 to see what happens.

Thu May 25 14:56:47 PDT 2006 Kevin Karplus


I need to fix the undertaker crash reading in SCWRLed UNI-EID_expm_TS1.

We may need to increase break weights and do a polishing run from
multiple existing models.

Thu May 25 15:00:36 PDT 2006 Kevin Karplus

I changed superimpose-best.under to include the single-domain
predictions in the superposition, to see if we want to create chimeras.

Thu May 25 16:02:33 PDT 2006 Grant Thiltgen

So I was messing around with creating a dimer, and unfortunately the 
templates used to make the protein don't create dimers in the same 
location as the dimer should be for this protein, so I'm not sure if
creating a dimer is going to work well for it.

Hmm.  try2-opt2 scores better for S1-F214, but it puts the end that
I would need to create the chimera at the wrong end of the protein,
so I'm not sure how well that is actually doing.  try1-opt2 does it too.
I may be just sticking with the entire protein.

Fri May 26 15:22:27 PDT 2006 Grant Thiltgen

So far try2-opt2 for the whole protein works best, so I'm going to try
an optimization run to get rid of some of the breaks.  

Fri May 26 18:47:40 PDT 2006 Kevin Karplus

The S1-F214 try1-opt2 matches the whole-chain try2-opt2 very well out to
about F204.

The P215-H312 try1-opt2 does not match at all to the whole-chain
try2-opt2.  I wonder if it is more or less sensible than the
whole-chain prediction.

Mon May 29 13:18:54 PDT 2006 Kevin Karplus

The problem that is crashing undertaker is with conformations that
have two atoms in exactly the same place (an O and an N in the next
residue at the same spot in the current crash).  I think that the
problem is in the input files, but I thought that I detected that
problem and marked the offending atoms.  I *was* only looking for
identical adjacent atoms, though, and O and N aren't adjacent, so if
the intervening C is different, ... . Another possibility is that some
transformation caused the missing-atom bit to be lost.
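
A sketch of a check that compares every atom with every other one rather than
only with its neighbors, assuming standard fixed-column PDB ATOM records.
This is not undertaker's code; the function names and the example coordinates
are made up.

```python
from collections import defaultdict


def find_coincident_atoms(pdb_text):
    """Group atoms that sit at exactly the same coordinates.

    Atoms are bucketed by coordinate, so every atom is effectively compared
    with all others, not just with its neighbors along the backbone.
    """
    atoms_at = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 54:
            # Fixed PDB columns (1-based): x 31-38, y 39-46, z 47-54.
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            label = (f"{line[12:16].strip()} {line[17:20].strip()} "
                     f"{line[22:26].strip()}")
            atoms_at[xyz].append(label)
    return [labels for labels in atoms_at.values() if len(labels) > 1]


def atom_line(serial, name, res, seq, x, y, z):
    """Format one ATOM record (helper for the made-up example below)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} A{seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up coordinates: O of residue 10 and N of residue 11 coincide,
# while the C of residue 10 is somewhere else.
example = "\n".join([
    atom_line(1, "C", "GLY", 10, 1.0, 2.0, 3.0),
    atom_line(2, "O", "GLY", 10, 4.0, 5.0, 6.0),
    atom_line(3, "N", "ALA", 11, 4.0, 5.0, 6.0),
])
clashes = find_coincident_atoms(example)
print(clashes)
```

Running something like this over the server models before scoring would flag
the offending files without relying on the adjacent-atom test.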

Tue May 30 12:57:53 PDT 2006 Grant Thiltgen

I tried upping some of the burial restraints in the first line to
make the protein less foamy.  It also looks like the new model works
well, but there are still some chain breaks.  Is that going to be okay 
for submission, or should I work on getting rid of them?

The main problem I have with the P215-H312 try1-opt2 is that P215 is
slightly inaccessible to link up to the other half.  Also the end of 
the protein that needs to be linked up is going off on the wrong
direction.  It might be a bit late, but would it be beneficial to try 
a run from residues 1-200, then 211 on?

Tue May 30 14:33:39 PDT 2006 Grant Thiltgen

I tried running do5, but the current makefile is set up to run the
buggy version of undertaker.  Can the current makefile be temporarily
changed in order to run the non-buggy version of undertaker?

Wed May 31 14:46:21 PDT 2006 Grant Thiltgen

I ran try5 with increased weights for wet6.5, dry5, dry6.5, and dry8.
The scores slowly seem to get better.  The second half of the molecule
still seems really foamy, so I pulled the conserved residues for just 
that part of the molecule with the P215-H312 output.  There wasn't any
in the hole in that part, but there were some within the groove so it 
leads me to believe that the results are mostly good.  There are some
conserved residues sticking out in to the solvent area, which may or 
may not need to be moved.

I also started runs for new domains for the first 200 residues and the
rest of the protein, just to check to see if that makes a better chimera
than trying to mix the other split we made.

I might try another run of the whole protein with the sidechain results
increased. [That must be try6---KJK]

Thu Jun  1 07:37:56 PDT 2006 Kevin Karplus

It is past time to clean up the constraints in the costfcn---Grant is
still running with the automatically generated constraints.

Also, we need to add missing_atoms to the costfcn for scoring the
server models.

try6-opt2 looks good out to about I219.
I'm still a bit dubious about the C-terminal domain.

Thu Jun  1 08:58:11 PDT 2006 Kevin Karplus

Fixing undertaker to handle such messed up inputs as servers/UNI-EID_expm_TS1
looks difficult, so I have tried commenting it out of the
read-pdb+servers.under file to try to get scoring for the other files.

Thu Jun  1 09:16:44 PDT 2006 Kevin Karplus

Foo! now undertaker is dying trying to score
	karypis.srv.4_TS1-scwrl

Is SCWRL returning duplicate points???
Or is undertaker messing up on reading back SCWRL results?

OK, the karypis.srv.4_TS1 backbone is ugly, but it shouldn't be
killing undertaker.

Thu Jun  1 09:36:55 PDT 2006 Grant Thiltgen

I guess I'm not really sure which constraints to add or take out of 
the costfcn to change and improve the model.  

I agree with the C-terminal domain problem.  Unfortunately, the groove
where the active site is probably located overlaps with the good and 
bad parts of the protein.

Thu Jun  1 14:23:20 PDT 2006 Grant Thiltgen

I talked with George about some of the constraints.  He gave me some 
good suggestions on how to look for which constraints to add for sheets.
I'm running try7 without the sheet constraints or the rr constraints to
see what may happen with the C-terminal region.  George seems to think
that the first 200 residues are matching up well enough that removing
the constraints should be okay.  I'm also going to set up try8 with
the original sheet constraints and some new constraints for some sheets
that look like they might be in the C-terminal region based on the ehl2 
composite information.

Fri Jun  2 10:49:00 PDT 2006 Grant Thiltgen

try7 seems a bit worthless, and try8 didn't turn out all that great, but
I think I can tweak some of the constraints from try8 and work on the 
C-terminal region of the protein.  I started try9 with some new strand 
and sheet constraints.
Make started Fri Jun  2 15:55:03 PDT 2006
Running on vashon.cse.ucsc.edu

Sat Jun  3 17:05:54 PDT 2006 Kevin Karplus

I made decoys/score-all+servers.unconstrained.pretty
The top scorer is try6-opt2.
Other than SAM_T06_server_TS1, the next highest scorers are
ROBETTA_TS[25341], then RAPTORESS_TS1-scwrl.
The score change is pretty large down to raptoress.

Sun Jun  4 17:22:01 PDT 2006 Grant Thiltgen

I ran try4 for P215-H312, and I'm not too sure it's all that great, but
I can try working with it for a bit.  I realized that I'm still running
with just the two top models, so I am going back to getting more
fragments to model the end with the modified costfcn.

I also tried to make some new constraints for try11 with the n_notor and
o_notor results.  

Mon Jun  5 11:21:08 PDT 2006 Grant Thiltgen

That C-terminal domain is gonna drive me nuts.  Undertaker really 
likes to do the same thing with the end even when I remove the 
constraints and add new ones in.  Blah.  try12 will attempt to manually 
define all the strands and helixes and see how undertaker wants to pack
it.

Mon Jun  5 15:52:21 PDT 2006 Grant Thiltgen

try12 isn't really improving much.  The last few runs I ran all kind
of do the same thing and score similarly.  I can't seem to get the 
residues to make a helix when I want them to.  Blah.  I'm going to try try6
for just the C-terminal with the same constraints for the helices and see
if that helps.
 
Tue Jun  6 09:40:31 PDT 2006 Kevin Karplus

I'll try making decoys/score-all+servers.try12.pretty to see if any of
the servers are doing what Grant wants the protein to do.

The problem may be that he has put a large weight on the sheet
constraints and a tiny weight on all other constraints, so that the
other constraints are almost irrelevant. He also had typos in some of
the added constraints (reported as errors when the constraints were
read, but Grant must not have checked for error messages).  Making
decoys/score-all.try12.pretty with the output to an emacs buffer or a
file is one fairly quick way to see the error messages for the try12
costfcn.

Looking at try12-opt2, the helix L85-F99 is upside down, with the
buried face exposed.  The conserved E88 should probably be in the
active site with the other conserved charges, perhaps with E88.OE1
near N53.ND2 (though probably not close enough to hbond).
The CB atoms of A92 and I95 should probably be near (say <7 Ang) the
CB atoms of F50 and F16.

For try13, I'll try adding constraints to orient this helix properly
and adjust the sheet constraints to be less powerful.

Tue Jun  6 10:10:01 PDT 2006 Kevin Karplus

I threw out  decoys/score-all+servers.try12.pretty, because
try12.costfcn did not include missing_atoms, so would be highly
misleading about servers that gave incomplete results.  (Despite that,
none of the servers except robetta scored well.)

I am making a decoys/score-all+servers.try13.pretty, which will
probably favor try11-opt2 a lot, since the sheet constraints were
taken from there.

Tue Jun  6 10:28:45 PDT 2006 Kevin Karplus

The best-scoring model with try13 is indeed try11-opt2, and
SAM_T06_server_TS1 is the best-scoring server model, ROBETTA_TS5 next
(way down the list) and RAPTORESS_TS2-scwrl after that.

Oops, I have to remake decoys/score-all+servers.try13.pretty, since I
did not notice Grant's typos in the constraints he added.  I've fixed
them in try13, and will rescore the server models.

Looking at the superposition in best-models.pdb.gz, based on
    ReadConformPDB T0289.try11-opt2.pdb
    ReadConformPDB T0289.try7-opt2.repack-nonPC.pdb

    InfilePrefix P215-H312/decoys/
    ReadConformPDB T0289.try1-opt2.pdb
    InfilePrefix A201-H312/decoys/
    ReadConformPDB T0289.try1-opt2.pdb
    InfilePrefix S1-F214/decoys/
    ReadConformPDB T0289.try2-opt2.pdb

    InfilePrefix decoys/servers/
    ReadConformPDB SAM_T06_server_TS1.pdb
    ReadConformPDB ROBETTA_TS5.pdb
    ReadConformPDB RAPTORESS_TS2.pdb

    InFilePrefix 
    ReadConformPDB T0289.undertaker-align.pdb model 1
    ReadConformPDB T0289.undertaker-align.pdb model 2
    ReadConformPDB T0289.undertaker-align.pdb model 3
    ReadConformPDB T0289.undertaker-align.pdb model 4
    ReadConformPDB T0289.undertaker-align.pdb model 5

I see that there is general agreement about the sheets *except* for Y287-T297.
Perhaps I should remove the sheet constraints for that region, and
just use Strand constraints, to avoid biasing the model selection too much.

Tue Jun  6 11:13:43 PDT 2006 Kevin Karplus

OK, I modified try13.costfcn, rescored everything with it, and am now
running try13 (from the original alignments, like try1) on cheep.

Tue Jun  6 12:35:59 PDT 2006 Grant Thiltgen

I ran try6 and try7 for just the C-terminal region.  I think try7 might 
be something I might be able to work with a chimera.  We'll see.  Also, 
I ran try2 with the A201-H312 region to see if a little bit of overlap
might help, and it appears that I might be able to try to make another 
chimera from that.  

Tue Jun  6 14:31:43 PDT 2006 Kevin Karplus

try13-opt2 scores almost as well as try11-opt2.  The constraints are
not quite so well met (not surprising, since the constraints were
taken from try11-opt2), but the other scores are better.  The difference
in total cost is less than the differences in constraints.

Rosetta still doesn't like repacking this model.  It prefers
try7-opt2, which I think looks terrible.  Somewhat surprisingly,
try13-opt2.repack-nonPC scores better than try13-opt2.

Grant does not seem to have put his chimeras into the decoys directory
(where they belong), so they are not getting scored in the score-all
scripts. 

Tue Jun  6 14:41:10 PDT 2006 Grant Thiltgen

Gah!  I can't seem to get the two models I want to superimpose to 
superimpose at residue 215 in order to create a model that works well.
I'd like to be able to get them to superimpose there, and maybe use 
protein shop to remove some clashes before polishing it up with undertaker,
but when I use the command
	PrintAllConformPDB make-chimera.pdb atom P215 superpose
in the undertaker script that I used "make-chimera.under", the residues
are still far apart.  I also can't move them closer together in 
ProteinShop (I tried).

Tue Jun  6 15:07:16 PDT 2006 Grant Thiltgen

Well, I got it to overlap at around residue 218 and 219 instead of 216, 
so I'll try to chop the protein there and see what I can do with it.  

Tue Jun  6 15:38:01 PDT 2006 Grant Thiltgen

I'm going to run try14, which is attempting to work with the first chimera
I made.  I'm hoping to go through an optimization run to see how well 
undertaker can work with the chimera I made:
	T0289.try11-opt2-chimera-try7-opt2-C-terminal.pdb


Tue Jun  6 15:55:22 PDT 2006 Kevin Karplus

The reason that using undertaker to create the chimera was not working
well is two-fold:

1) the atoms specified just give an initial superposition of the two conformations,
   which are then reoptimized to overlap as well as possible.  So the
   overlapping regions of the two predictions can override the initial superposition.
   
   This can be reduced by truncating the predictions so that only the
   residues used for aligning the domains are left.  (I did half this
   job in try11-opt2-chopped-to-C217.pdb.)
   
2) the connection that Grant was specifying (around P216) would result
   in the two domains colliding badly.  So even if the splicing was
   done right, the resulting chimera would be a mess.

What happens if we try superposing D220, V221, Y222, K223?

Tue Jun  6 16:16:12 PDT 2006 Kevin Karplus

D220, V221, Y222, K223 makes a pretty crummy crossover, but Q247,
D248, Q249 looks like it might work.

Tue Jun  6 16:21:53 PDT 2006 Kevin Karplus

Nope, bad idea.

Probably the best thing to do at this point is for Grant to work with
Firas on using ProteinShop to place the second domain relative to the
first where he wants it---the structures are too different to make an
easy crossover just by lining up a fragment.

At this point, probably the most valuable thing to do is to polish up
the try11/try13 line to close gaps and remove clashes.

Tue Jun  6 16:30:36 PDT 2006 Kevin Karplus

Grant's T0289.try11-opt2-chimera-try7-opt2-C-terminal.pdb    
is intriguing, and worth pursuing, but I don't think that try14 will
clean it up.  I think that some sheet constraints are needed to hold
the two domains together as undertaker tries to close the gaps.

Still, we can judge that when try14 is half-finished and has produced
a try14-opt1 model.  Nope, Grant read in ALL the models, not just his
chimera, so he'll end up polishing something else--probably try6.

Tue Jun  6 16:38:05 PDT 2006 Grant Thiltgen

Whoops!  Sorry about that!  I still don't realize what needs to be 
commented out sometimes.  I can start try15 with some sheet constraints
to hold in the gaps, and make it so it only works with the one
pdb file.


Tue Jun  6 17:52:21 PDT 2006 Kevin Karplus

try15.costfcn doesn't actually have any constraints:

# SheetConstraint Error: residue specified as P151 doesn't match (T0289)C151
# Error: can't parse residue name in position0

I have found it useful to do a 
	make decoys/score-all.try15.pretty
after creating a new costfcn, to make sure I don't have any typos like
this in the constraints.
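
A sketch of the kind of sanity check that catches these errors before a run,
assuming residues are written as a one-letter code followed by a number (as in
the error message above).  It is not undertaker's parser, and the toy sequence
is not the T0289 sequence.

```python
import re


def check_residue_labels(labels, sequence, first_residue=1):
    """Return error strings for labels like 'P151' that disagree with `sequence`."""
    errors = []
    for label in labels:
        match = re.fullmatch(r"([A-Z])(\d+)", label)
        if match is None:
            errors.append(f"can't parse residue label {label!r}")
            continue
        aa, number = match.group(1), int(match.group(2))
        index = number - first_residue
        if not 0 <= index < len(sequence):
            errors.append(f"{label}: residue number outside sequence")
        elif sequence[index] != aa:
            errors.append(f"{label}: sequence has {sequence[index]}{number}")
    return errors


# Toy four-residue sequence, made up for the example.
toy_sequence = "SPCA"
problems = check_residue_labels(["S1", "C2", "X9", "p3"], toy_sequence)
for problem in problems:
    print(problem)
```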

Tue Jun  6 21:39:03 PDT 2006 Grant Thiltgen

Ah!  I see what happened.  When I saved the PDB file from protein shop
it renumbered them starting with 2.  All the numbers are off by one.  
I went ahead and fixed that and I'm running a new run try16 with the 
constraint to hold the two strands together.  I'm also including some 
of the other old constraints to keep it from getting too crazy as well.
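
A sketch of undoing an off-by-one renumbering in a saved PDB file, assuming
standard fixed-column ATOM/HETATM records with the residue number in columns
23-26.  The function names and the example record are made up; TER and other
record types are left untouched.

```python
def shift_residue_numbers(pdb_text, offset):
    """Add `offset` to the residue number (columns 23-26) of each atom record."""
    shifted = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and line[22:26].strip():
            new_number = int(line[22:26]) + offset
            line = f"{line[:22]}{new_number:4d}{line[26:]}"
        shifted.append(line)
    return "\n".join(shifted)


def atom_line(serial, name, res, seq, x, y, z):
    """Format one ATOM record (helper for the made-up example below)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} A{seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Made-up record whose residue number is one too high (2 instead of 1).
record = atom_line(1, "N", "SER", 2, 11.0, 12.0, 13.0)
fixed = shift_residue_numbers(record, -1)
print(fixed)
```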

Wed Jun  7 11:47:38 PDT 2006 Grant Thiltgen

try16 didn't seem a whole lot better than try15.  I'm starting try17 to 
see if I can get that sheet between the two domains to form.

Wed Jun  7 17:18:32 PDT 2006 Grant Thiltgen

try17 is finished.  It really doesn't seem much different than try16.  
I'm not sure that the chimera is the way to go.  It may be better to try
to use some of the better models from fold-recognition (try11, try13, try6) 
than pursue this.  It seems even foamier than before, and I'm not really 
sure how to get that sheet to line up better.  I'll give it another go 
this afternoon though before I conclude it's not going to work.  I'm 
also going to try to pull in the helices that seem to stick out.

I'd also like to clean up some of the breaks in the chain in some of the 
models, but I'm not sure how to get undertaker to do that.  I've tried
increasing the weight of breaks in the costfcn, but I'm not sure if that
actually works.

Wed Jun  7 23:54:54 PDT 2006 Grant Thiltgen

try18 finished, and the sheet still isn't right.  I don't know what else
I can do to fix that.  The runs based on the chimera (try18, try17, try16, 
and try15) seem to score better using the unconstrained cost function, but
they don't seem to be that much improved over try6 (which is a refined model
of the early stuff direct from undertaker.)  Maybe the chimera is slightly
better, but I'm not incredibly sure if it's better or how to improve it 
with the sheets and the dangling helices at the end.  

I'm also running try19 to try and refine try11 and try20 to refine try13.

Thu Jun  8 09:35:04 PDT 2006 Grant Thiltgen

I'm starting one more run on the chimera to try to get that sheet where
it should be:  try21.  I'm taking out all constraints except the one for
the sheet, which I'm gonna turn up the weight on a bit.  After polishing
the other two models with try19 and try20, the unconstrained costfcn still
has try18 at the top.

Thu Jun  8 14:02:45 PDT 2006 Grant Thiltgen

try21 appears to get the sheet made, but it tore apart part of the 
original sheet.  I started try22 which used the template from try18 with 
the original sheet constraints.  I'm also planning on running try23 with
the new template when try21 finishes to try to repair the sheet.  

The chimeras are still scoring well with the unconstrained costfcn, but
I'm still not completely sure that it is better than the targets before
the chimera.


Thu Jun  8 16:52:21 PDT 2006 Kevin Karplus

try21-opt2 looks a tiny bit better than try18-opt2, but actually
scores worse with unconstrained.costfcn.

We'll submit
ReadConformPDB T0289.try6-opt2.pdb
ReadConformPDB ROBETTA_TS5.pdb
ReadConformPDB T0289.try21-opt2.pdb
ReadConformPDB T0289.undertaker-align.pdb model 1
ReadConformPDB T0289.undertaker-align.pdb model 4

I'm running a polishing run (try23 on cheep) to fix up the ROBETTA models.
Grant will do a polishing run on try6-opt2 (try24).
The polished versions will replace the others when done.

Fri Jun  9 07:40:21 PDT 2006 Kevin Karplus

I replaced the ROBETTA model last night and will replace try6 with
try24 this morning, resulting in 

     ReadConformPDB T0289.try24-opt2.pdb
     ReadConformPDB T0289.try23-opt2.pdb
     ReadConformPDB T0289.try21-opt2.pdb
     ReadConformPDB T0289.undertaker-align.pdb model 1
     ReadConformPDB T0289.undertaker-align.pdb model 4

Fri Jun  9 07:45:43 PDT 2006 Kevin Karplus

email submission done.


Date: Thu, 8 Jun 2006 20:52:25 -0700 (PDT)
From: Grant Thiltgen
To: Kevin Karplus
Subject: undertaker runs finished

try22 (which is another version of try18) and try24 (the refinement of 
try6) are finished.  I remade score-all.unconstrained.pretty to check 
out the scores.  It appears that try22 is slightly better than try18, 
but not much, so try18 is probably okay to go with.  Also, try24 is 
slightly better than try6.  I'm not sure it's much of an improvement, but 
we can submit that one instead.

G.

------------------------------------------------------------

Fri Jun  9 10:06:49 PDT 2006 Kevin Karplus

OK, try21-opt2 will be replaced by try22-opt2, since both undertaker
and rosetta like it better.

DONE.  Submissions are now:

     ReadConformPDB T0289.try24-opt2.pdb
     ReadConformPDB T0289.try23-opt2.pdb
     ReadConformPDB T0289.try22-opt2.pdb
     ReadConformPDB T0289.undertaker-align.pdb model 1
     ReadConformPDB T0289.undertaker-align.pdb model 4

Tue Jul 11 11:42:44 PDT 2006 Kevin Karplus

The REAL_PDB file is 2gu2A.

I'm running an evaluation of the servers and our models.
It looks like our server beat our manual predictions, and that
try23-opt2 made ROBETTA_TS5 worse, not better.  try22-opt2 is slightly
better than try21-opt2, but try22-opt2.gromacs0 is better still.
try24-opt2.gromacs0 would have improved on try24-opt2, but still not
gotten it to the level of SAM_T06_server.  The best server model
appears to be RAPTORESS_TS1, with our server model about 7th among the
TS1 models:
	RAPTORESS_TS1
	Zhang-Server_TS1
	FAMSD_TS1
	SP4_TS1
	SPARKS2_TS1
	ROBETTA_TS1
	SAM_T06_server_TS1

None of our hand tries were as good as our server model.  Our best was
probably try8-opt2.gromacs0, with an all-atom RMSD of 8.2 and GDT of
37%.  The best we submitted was model 2 (all-atom RMSD of 8.8 and GDT of
38.1%), which was based on ROBETTA_TS5 (which did better).  RAPTORESS_TS1 had
an RMSD of 7.0 and GDT of 42.8%.

I think I need to put some weight on GDT and smooth_GDT and reduce the
weight on missing_atoms.  Looking just at GDT, the best model is
RAPTOR_TS2 (44.4%) and SAM_T06_server is the 27th TS1 server.
This may be a more valid indication of how it is doing on the modeling
than the RMSD-heavy evaluation I've been using.
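
A sketch of why a GDT-style score and RMSD can rank models differently,
assuming the usual GDT_TS convention (average fraction of CA atoms within 1,
2, 4, and 8 Angstroms) and a single fixed superposition.  Real GDT searches
over many superpositions, and the evaluation used here may differ in detail;
the distances below are made up.

```python
import math


def gdt_ts_fixed(ca_distances, n_target=None, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT_TS-like percentage for one fixed superposition.

    `n_target` is the number of residues in the target; predicted residues
    that are missing simply never count as being within a cutoff.
    """
    n = len(ca_distances) if n_target is None else n_target
    fractions = [sum(d <= c for d in ca_distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def rmsd(ca_distances):
    """Root-mean-square of the per-residue CA distances."""
    return math.sqrt(sum(d * d for d in ca_distances) / len(ca_distances))


# Made-up per-residue CA distances (Angstroms) with one badly placed residue.
distances = [0.5, 1.5, 3.0, 6.0, 12.0]
score = gdt_ts_fixed(distances)
penalized = gdt_ts_fixed(distances, n_target=10)  # half the target missing
spread = rmsd(distances)
print(score, penalized, spread)
```

The single 12 Angstrom residue dominates the RMSD, while the GDT-style score
only records that it misses every cutoff; passing n_target also penalizes
incomplete models without a separate missing_atoms term.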

Fri Jul 14 11:00:48 PDT 2006 Kevin Karplus

The decoys/evaluate.unconstrained.pretty file shows both the
undertaker costs (unconstrained) and real costs, which combine clens,
log_rmsd, log_rmsd_ca, GDT, smooth_GDT, and missing atoms.  The
missing_atoms weight in the real cost is misreported in the header,
since it appears in both cost functions with different weights.

RAPTOR_TS2 does the best, but of the TS1 server models, RAPTORESS_TS1
comes out on top.  SAM_T06 is 18th of the servers for TS1
models---adequate, but not great.  GDT values are only around 40%,
with RAPTOR_TS2 getting 44%, so no one nailed this one.
 
Our best model is try23-opt2.gromacs0, which we did not submit, but
our model2 (which was try23-opt2) is almost as good.