Mon Jul  7 09:25:00 PDT 2008
T0487
Make started Mon Jul  7 09:25:26 PDT 2008
Running on cheep.cse.ucsc.edu

Mon Jul  7 11:27:36 PDT 2008 Kevin Karplus

T0487 is a long protein (685 residues) and so will almost certainly
have to be broken into domains.  We are getting good hits to both
b.34.14.1 and c.55.3.10 domains, and the two top hits (1yvuA and 1u04A)
have both of those domains, so maybe this is a full-length homology
model.

Since 1u04A is a full-length Argonaute from Pyrococcus furiosus and
the target is the Thermus thermophilus Argonaute, it seems likely that
this is a complete match.  The domains are a PAZ domain and a PIWI
domain. 


Tue Jul  8 02:34:47 PDT 2008 Kevin Karplus

try1-opt3 has some rather bad breaks.  One group is in the area
H621-E629, and probably require peeling the last strand off the
sheet. 

There are numerous other bad breaks, so I'll reoptimize with breaks
(and soft_clashes) turned up by a factor of 4.  The try1 optimization
took 9 hours, so I'll reduce the number of iterations for try2, and
try to focus the operations more on break reduction.  I'll also add
the superfamily alignment reading and remove the all-align alignment.
There will probably be too much time spent running scwrl, so I might
have to turn to the noscwrl scripts for future runs.  (I really need
to implement saving and restoring alignment and fragment libraries!)

Tue Jul  8 10:17:03 PDT 2008 Kevin Karplus

try2 didn't run, because I had a typo  (1uo4A for 1u04A), so my
getting up in the middle of the night to start the run was wasted.
I'll start it again on peep.

Tue Jul  8 12:13:56 PDT 2008 Kevin Karplus

try2 failed with an assertion failure!
undertaker: ../ultimate/src/Transform/Transform.h:83: bool Transform::OK() const: Assertion `1.-ERR_LIMIT < rot_mag2 && rot_mag2 < 1.+ERR_LIMIT' failed.

I've never seen that assertion failure before---I wonder what
triggered it.  I'll probably have to do some debugging, but it ran for
about an hour before crashing, so debugging will be difficult.

I think I'll first try just running again and hoping for a different
seed to avoid the bug.  I'll save the log of the failed run, so I can
restore the seed 1216450465 and (with luck) replicate the error.

Tue Jul  8 13:07:25 PDT 2008 Kevin Karplus

The SAM-T02-server and SAM-T06-server runs failed because of the
length limitations.

I restarted the SAM-T02-server run in
	/projects/compbio/tmp/target02-query/target02-query-1215447404-4626
but I'll probably have to submit the MODEL[1-5].al files manually,
since the mail in that old server is not in the Makefile.

Tue Jul  8 18:44:27 PDT 2008 Kevin Karplus

The SAM-T02-server models were submitted and accepted.
The SAM-T06-server run is still going forward---it may take until the deadline!

The try2 run closed gaps and removed clashes somewhat, but still more
is needed (the try2-opt3.gromacs model scores better with the try2 costfcn).

Tue Jul  8 23:37:54 PDT 2008 Kevin Karplus

There seem to be substantial phase differences between try1+2+3 and
the alignment to 1u04A.  Perhaps I should break this into domains,
work on each domain separately, then put them back together by
superimposing on the whole model.

I might be able to pull out R25-L98, I173-P263, G323-V463, G459-V685.

 
Fri Jul 11 17:20:41 PDT 2008 Kevin Karplus

MQAC likes
    Zhang-Server_TS3 0.502
    Zhang-Server_TS4 0.497
    pro-sp3-TASSER_TS3 0.494
    Zhang-Server_TS2 0.482
    GS-KudlatyPred_TS1 0.477
    MUSTER_TS1 0.477
    GS-KudlatyPred_TS2 0.475
    GS-KudlatyPred_TS3 0.474
    circle_TS2 0.470
    circle_TS3 0.470

MQAU likes
    SAM-T08-server_TS1 0.580
    SAM-T08-server_TS2 0.579
    RAPTOR_TS1 0.577
    GS-KudlatyPred_TS1 0.569
    GS-KudlatyPred_TS3 0.568
    GS-KudlatyPred_TS2 0.567
    MUSTER_TS1 0.567
    BAKER-ROBETTA_TS3 0.565
    pro-sp3-TASSER_TS3 0.565
    BAKER-ROBETTA_TS4 0.565

Mon Jul 14 10:50:44 PDT 2008 Kevin Karplus

The best scoring models with try1.costfcn are MQAU1, MQAC1, try3-opt3.repack-nonPC.
The MQAU1 and the MQAC1 models come from GS-KudlatyPred_TS2.
They also score well with try3.costfcn and rosetta (at least the
gromacs0.repack-nonPC versions do).


Mon Jul 14 11:15:59 PDT 2008 Kevin Karplus

In a number of ways I like try3 better than MQAU1.
Perhaps I should do some patching of its bad points, though, to fix
things up.  For example, the bad breaks at 
	P44, L45, L46 could be patched with D34-Q49,
	A623,A632 could be patched with H621-A632,
	G482, R482, E483, S484 with G480-G489,
	E76, G77, T78, L90, Y91, P100, K101 with W67-P103
	F338?
	H382?
	H445, R446, W447?
	G499 and H500?
	G126, V127, W128?
	P26,W27?
	C175, E176?
	K329?
	R199,R200?
	A278?
	H256, L260, L261, V262?
	L148, G149?
	A589?
	W437?
	V666?

This will take forever to optimize, unless I break it into domains.

Tue Jul 15 16:13:41 PDT 2008 Kevin Karplus

I started subdomain predictions for R25-L98, I173-P263, G323-V463, G459-V685.
When they are done, I'll chop out the corresponding parts of try3 and
MQAU1 and try optimizing each domain separately with high crossover.
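
A minimal sketch, in Python (not part of the SAM/undertaker pipeline), of how
the proposed subdomain ranges could be sanity-checked for length and overlap
before starting runs.  The helper names (parse_range, summarize) are
hypothetical; only the four range strings come from the notes.

```python
import re

# Ranges are written as <one-letter code><number>-<one-letter code><number>.
RANGE_RE = re.compile(r"^([A-Z])(\d+)-([A-Z])(\d+)$")


def parse_range(text):
    """Return (start, end) residue numbers for a range such as 'R25-L98'."""
    match = RANGE_RE.match(text)
    if match is None:
        raise ValueError(f"not a residue range: {text!r}")
    start, end = int(match.group(2)), int(match.group(4))
    if start > end:
        raise ValueError(f"range runs backwards: {text!r}")
    return start, end


def summarize(ranges):
    """Return ({range: length}, [(range_a, range_b, first, last) overlaps])."""
    parsed = sorted(parse_range(r) + (r,) for r in ranges)
    lengths = {name: end - start + 1 for start, end, name in parsed}
    overlaps = []
    # After sorting by start, only neighbouring ranges need to be compared here.
    for (s1, e1, n1), (s2, e2, n2) in zip(parsed, parsed[1:]):
        if s2 <= e1:
            overlaps.append((n1, n2, s2, min(e1, e2)))
    return lengths, overlaps


# The four subdomains named in the notes.
SUBDOMAINS = ["R25-L98", "I173-P263", "G323-V463", "G459-V685"]
lengths, overlaps = summarize(SUBDOMAINS)

for name in SUBDOMAINS:
    print(name, lengths[name], "residues")
for a, b, first, last in overlaps:
    print(f"{a} and {b} share residues {first}-{last}")
```

The only overlap is the short stretch shared by G323-V463 and G459-V685,
which leaves a few residues for superimposing the pieces later.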

Tue Jul 15 16:44:56 PDT 2008 Kevin Karplus

These subdomains seem to be too small to pick up the signals that lead
to the recognition of 1u04A, 1w9hA, 1si2A, and 1r4kA (c.55.3.10 and
b.34.14.1 families).  The G459-V685 does get good hits on the c.55 fold:
1yvuA, 1u04A, 1w9hA, and (weakly) 2fsjA.

I'm sending the whole try3-opt3 model to VAST, to pick out where the
structure comes from. 
Your VAST Search job was submitted at 07/15/2008
19:45:09(EDT). Request ID: 360713589810141423

Tue Jul 15 17:20:59 PDT 2008 Kevin Karplus

The best whole-length VAST hits are to 1yvuA and 1z26A, covering
residues E8-F683.  

1ytuB covers K320-G670
1z26A covers S327-E483

For subdomains identified by VAST: 
	 1-12, 303-331, 571-626		1z26A, 1yvuA
	 13-21, 125-174, 273-302	2qt7A, 2cxcA, 2v7bA
	 175-199, 257-272	all too short
	 P26-R120	1z26A, 3cjtA, 3cjsA
	 E203-H256	2ekmA, 1rlhA, 1tr8A
	S328-E438	1z26A, 1ytuB, 2p9hA, 1zrhA
	R446-586,630-F683	1z26A, 1yvuA, 1asuA, 1ekeB, 1ilyA, 1wn1A, 2qh9A

Tue Jul 15 17:39:52 PDT 2008 Kevin Karplus

I started a subdomain prediction for G302-V685, which should be a
whole fold except for one strand that comes from M1-N12.

Tue Jul 15 17:43:56 PDT 2008 Kevin Karplus

I also started N12-P306 for the first half of the protein. 

Tue Jul 15 23:29:16 PDT 2008 Kevin Karplus

The 4 shorter subdomain runs are done.

R25-L98 didn't find any strong hits (1o59A at E-value 65)
	The R25-L98/try1-opt3 model looks plausible, but no better
	than the regions cut out of try3-opt3 and MQAU1-opt3

I173-P263 also has no strong hits (1rmwA at E-value 27.8)
	The I173-P263/try1-opt3 model is not even as reasonable as the
	regions cut from try3-opt3 and MQAU1-opt3

G323-V463 has a weak hit to c.55.3.10 domains (1w9hA E-value=1.29,
	1u04A E-value=1.426). The try1-opt3 model agrees fairly well
	with the regions cut from try3-opt3 and MQAU1-opt3.
	It may be worthwhile to try doing an optimization in this
	subdomain, to get the loops better solved.
	
	Tue Jul 15 23:43:23 PDT 2008 Kevin Karplus
	G323-V463/try2 started on the moai cluster,
	with sheet and helix constraints from G323-V463/try1
	
G459-V685 has strong hits for c.55.3.10 domains (1u04A and 1w9hA < 1e-25),
	and weak hits for other c.55.* folds.
	The different predictions are in good agreement.  In fact, the
	from-MQAU1 model scores better than the G459-V685/try1-opt3
	model with the try1 costfcn.
	
	Tue Jul 15 23:52:04 PDT 2008 Kevin Karplus
	G459-V685/try2 started on the moai cluster.

N12-P306 has moderate hits to 1u04A (2.4e-04), but has not finished
	building its try1 model yet.
	
G302-V685 has strong hits to c.55.3.10 domains (1u04A 1.8e-40, 1w9hA 3.6e-36)
	but hasn't finished building its try1 model yet. 
	I expect this model to be close to the ones for the
	whole-length protein, with perhaps slightly better loops.

Thu Jul 17 12:03:52 PDT 2008 Kevin Karplus

The N12-P306 and G302-V685 models do look pretty good.
I should do a little clash and break removal in each, and then make a
chimera, superimposing them on either MQAU1 or try3.

First I should re-extract sheets from them, as I've done some bug
fixes to the algorithm to make it extract sheets that are a little
more messed up.

Thu Jul 17 12:15:37 PDT 2008 Kevin Karplus

try2 runs started in both N12-P306 and G302-V685.

Thu Jul 17 16:25:01 PDT 2008   SAM-T08-MQAO hand QA T0487 Submitted
Thu Jul 17 16:25:01 PDT 2008   SAM-T08-MQAU hand QA T0487 Submitted
Thu Jul 17 16:25:01 PDT 2008   SAM-T08-MQAC hand QA T0487 Submitted

Thu Jul 17 16:35:58 PDT 2008 Kevin Karplus

The N12-P306/try2 and G302-V685/try2 runs both reduce breaks and
clashes, but the gromacs0 optimization improves them further, so
clearly more clash reduction is needed.  The sheet scores are not
great for either one, though better than the try1 runs.

I'll try combining them to make an N2-C2 chimera, but which model
should I use as a template to align the domains?  Perhaps whichever
scores best with the sheets from N2 and C2?

Thu Jul 17 16:48:15 PDT 2008 Kevin Karplus

I made a try4 costfcn with N2 and C2 sheets and helices, and
MQAU1-opt3.gromacs0.repack-nonPC scores best, so I'll use that as the
template. 

It looks like I should take 
	M1-L121	MQAU1-opt3.gromacs0.repack-nonPC 
	R122-E176	N12-P306/try2-opt3.gromacs0
	
	
There is a bad break after Y135, and I may want to add some sheet
constraints for tucking L124-Y135 into place.

	I don't like *either* model for the next region.
	I could try
	A170-T201	N2
	W202-S280	MQAU1

I have to continue with MQAU1 to get a connection between the domains,
and MQAU1 looks better for a while
	-L465		MQAU1
	S466-R574	C2
	K575-P583	MQAU1
	V584-D660	C2
	R661-V685	MQAU1
	
Putting this together gives me 	
	M1-L121		MQAU1-opt3.gromacs0.repack-nonPC
	R122-T201	N12-P306/try2-opt3.gromacs0
	W202-L465	MQAU1
	S466-R574	G302-V685/try2-opt3.gromacs0
	K575-P583	MQAU1
	V584-D660	G302-V685/try2-opt3.gromacs0
	R661-V685	MQAU1
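
A minimal sketch, in Python (not the tool actually used to build the
chimera), that checks whether the segment table above tiles the full
685-residue chain with no gaps or overlaps, and counts how many residues
each source contributes.  The function names and the shortened source
labels are hypothetical.

```python
SEQ_LEN = 685  # length of T0487, from the notes

# (first residue, last residue, source), copied from the table above.
SEGMENTS = [
    (1, 121, "MQAU1"),
    (122, 201, "N12-P306/try2"),
    (202, 465, "MQAU1"),
    (466, 574, "G302-V685/try2"),
    (575, 583, "MQAU1"),
    (584, 660, "G302-V685/try2"),
    (661, 685, "MQAU1"),
]


def check_tiling(segments, seq_len):
    """Return a list of problems; an empty list means an exact tiling of 1..seq_len."""
    problems = []
    expected = 1
    for first, last, source in segments:
        if last < first:
            problems.append(f"segment {first}-{last} ({source}) runs backwards")
        if first > expected:
            problems.append(f"gap: residues {expected}-{first - 1} are uncovered")
        elif first < expected:
            problems.append(f"overlap: residues {first}-{expected - 1} covered twice")
        expected = last + 1
    if expected != seq_len + 1:
        problems.append(f"segments end at {expected - 1}, expected {seq_len}")
    return problems


def residues_per_source(segments):
    """Count how many residues each source model contributes."""
    counts = {}
    for first, last, source in segments:
        counts[source] = counts.get(source, 0) + (last - first + 1)
    return counts


print(check_tiling(SEGMENTS, SEQ_LEN) or "segments tile the chain exactly")
print(residues_per_source(SEGMENTS))
```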

I'll modify try4.costfcn to include MQAU1 sheets and helices as well.

Thu Jul 17 18:05:18 PDT 2008 Kevin Karplus

I started try4 to attempt to close gaps in chimera-N2-MQAU1-C2.

Thu Jul 17 21:24:42 PDT 2008 Kevin Karplus

try4-opt3 does not score quite as well as MQAU1 and MQAC1, at least
not in the gromacs optimized versions of each, mainly because it still
has some bad breaks.  I'll do a polishing run on try4 to see if I can
best the MQAU1 and MQAC1 runs.  It will start from all SAM+undertaker
models, but not the MQA models.  

Fri Jul 18 08:03:28 PDT 2008 Kevin Karplus

try5-opt3 scores better but still has some bad breaks.
I still don't like any of my models for I173-P263.
Perhaps I should optimize that domain separately, adding constraints

I173.CA	V262.CA		7.44
L174.CA	V262.CA		6.64
C175.CA	V262.CA		7.09

I173.CA	P263.CA		5.41
L174.CA	P263.CA		6.74
C175.CA	P263.CA		8.55

so that it can be put back in.
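
A minimal sketch, in Python (not part of undertaker), of how a candidate
subdomain model could be checked against the CA-CA distances listed above.
It assumes standard fixed-column PDB ATOM records; the function names are
hypothetical, and only the six distances come from the notes.

```python
import math

# Desired CA-CA distances in Angstroms, copied from the notes above.
CONSTRAINTS = [
    (173, 262, 7.44), (174, 262, 6.64), (175, 262, 7.09),
    (173, 263, 5.41), (174, 263, 6.74), (175, 263, 8.55),
]


def read_ca_coords(pdb_lines):
    """Map residue number -> (x, y, z) for CA atoms in PDB ATOM records."""
    coords = {}
    for line in pdb_lines:
        # Columns 13-16 hold the atom name, 23-26 the residue number,
        # and 31-54 the x, y, z coordinates.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            resnum = int(line[22:26])
            coords[resnum] = (
                float(line[30:38]),
                float(line[38:46]),
                float(line[46:54]),
            )
    return coords


def constraint_errors(coords, constraints):
    """Return (res_i, res_j, observed - desired) for each distance constraint."""
    errors = []
    for i, j, desired in constraints:
        observed = math.dist(coords[i], coords[j])
        errors.append((i, j, observed - desired))
    return errors
```

Usage would be along the lines of
constraint_errors(read_ca_coords(open(path)), CONSTRAINTS), where path is
whichever subdomain model is being considered for reinsertion.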

Fri Jul 18 08:27:45 PDT 2008 Kevin Karplus

I started I173-P263/try2 with an inconsistent set of constraints that
favors the model from-try5 a little.
I also started I173-P263/try3 with the same costfcn, but starting from
alignments. 

Fri Jul 18 08:43:41 PDT 2008 Kevin Karplus

Perhaps more to the point, I started I173-P263/try4 with constraints
taken from the alignment of that region to 1u04A, starting from
alignments. 

Fri Jul 18 08:46:56 PDT 2008 Kevin Karplus

Looking at other parts of the try5 model:  I wonder if I am missing a
meander for K320-R335.  There seems to be one almost forming.
1u04A DOES have an extra strand there, with i,i+8 hbonds.
I should add a sheet constraint, but I'm a little confused about which one:

SheetConstraint	P319	M322	K329	A326	Hbond	V327 ?

The i,i+8 pairing would have L321 with K329, which would require some
remodeling of the try5 model, but this looks a bit more like 1u04A:

SheetConstraint K320	G323	P330	V327	Hbond	L321
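
A minimal sketch, in Python, comparing the residue pairings implied by the
two candidate constraints.  It ASSUMES the four residue arguments are the
start and end of the first strand followed by the start and end of the
second strand, with the two start residues paired and the strands running
antiparallel; that reading is an assumption about the SheetConstraint
syntax, and the function name is hypothetical.

```python
def antiparallel_pairs(start1, end1, start2, end2):
    """Pair residues of two antiparallel strands.

    Strand 1 runs start1..end1 (ascending); strand 2 runs start2..end2
    (descending), so start1 pairs with start2 and end1 with end2.
    """
    if end1 < start1 or start2 < end2:
        raise ValueError("expected an ascending strand, then a descending one")
    if end1 - start1 != start2 - end2:
        raise ValueError("strands must have equal length")
    return [(start1 + k, start2 - k) for k in range(end1 - start1 + 1)]


# P319 M322 K329 A326 (the first candidate)
first = antiparallel_pairs(319, 322, 329, 326)
# K320 G323 P330 V327 (the second candidate)
second = antiparallel_pairs(320, 323, 330, 327)

for label, pairs in (("first", first), ("second", second)):
    print(label, [(i, j, j - i) for i, j in pairs])
```

Under that assumption, only the second constraint pairs L321 with K329
(offset 8); the first pairs L321 with V327 (offset 6).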


L123-R136 is still messed up, with a bad break before E138.
I need to look at what happens in that region in 1u04A.
Ah---it forms a helix for roughly E125-A133.  I should add that as a constraint!

HelixConstraint E125	A133

Fri Jul 18 10:31:02 PDT 2008 Kevin Karplus

Oops---the do2 and do4 runs were started in the wrong directory
(I173-P263/decoys/ instead of I173-P263/), so I've restarted them in
the correct directory.

Fri Jul 18 11:42:18 PDT 2008 Kevin Karplus

The I173-P263/try4 run does not get the sheets I expected.
I'll try again as try5, but turn the sheet constraints up and the
helix constraints down.

Fri Jul 18 13:16:41 PDT 2008 Kevin Karplus

I173-P263/try5 also did not pick up the 1u04A alignment.

Fri Jul 18 13:26:13 PDT 2008 Kevin Karplus

No wonder!  I had asked for the sheets from model 8 of 
I173-P263/T0487.undertaker-align.pdb, but that's 1jiwI.
I wanted model 5 (1u04A).  Let me try again with the RIGHT constraints.

Fri Jul 18 14:42:26 PDT 2008 Kevin Karplus

oops, forgot to create I173-P263/try6.under, so it didn't run.
Starting it again.

Fri Jul 18 19:36:39 PDT 2008 Kevin Karplus

I173-P263/try6 is based on 2dtrA+1xd5A+2uubT+1fjgT
and is still not really getting the sheets I expected.
Ah---model5 is from 1u04A, but is too short an alignment to contain
the desired sheets!

Fri Jul 18 20:01:22 PDT 2008 Kevin Karplus

For I173-P263/try7, I'll get constraints from the whole try3 model,
the whole alignment to 1u04A and the N12-P306 alignment to 1u04A, with
the most weight on the N12-P306 alignment.

Fri Jul 18 21:18:01 PDT 2008 Kevin Karplus

I173-P263/try7 did not produce a clean model, so I'll do try8, try9,
and try10, each with only one of the three sets of constraints, to
avoid conflicting constraints.

I173-P263/try8	align1 from whole chain
I173-P263/try9	align1 from N12-P306/
I173-P263/try10	 try3 from whole chain

Sat Jul 19 13:56:50 PDT 2008 Kevin Karplus

None of I173-P263/try8, I173-P263/try9, I173-P263/try10 did anything
very useful. 

I'll just try polishing from-try3 and stick it back into try5.

Tue Jul 22 16:07:29 PDT 2008 Kevin Karplus

I173-P263/try11-opt3 came out pretty good, but the gromacs-optimized
version scored better, so I'm doing a polishing run to try to close
gaps and remove clashes.  Then I'll take a I173-P263/try12 model,
stick it back into try5, and polish the resulting complete model.


Fri Jul 25 04:29:52 PDT 2008 Kevin Karplus

I finally did it---I had try12 with "try2" instead of "try12"
everywhere and overwrote the existing try2 files in I173-P263/decoys.

I'll redo try12 as try13, and try to get it right this time.

Fri Jul 25 05:38:26 PDT 2008 Kevin Karplus

I made a chimera of try5-opt3.gromacs0 and I173-P263/try13-opt3, using
L174-L261 from the subdomain model.

For try6, I will polish this chimera.

Fri Jul 25 11:33:37 PDT 2008 Kevin Karplus

try6 is the best model so far.  It seems that polishing the subdomain
with constraints that would make it easy to reinsert was effective in
producing a usable chimera.

I'll now do a polishing run starting from just the gromacs-optimized
models to try to cut down clashes and breaks.

I should then look to see if we have lost any sheets from the
templates, and see if fixup is needed.

Sat Jul 26 03:45:54 PDT 2008 Kevin Karplus
 
try7-opt3 polishes try6-opt3.gromacs and manages to reduce breaks
somewhat without losing anything significant on other cost functions,
though I'm a bit worried about the n_ca_c bond angles getting a little
too high a cost.

rosetta now likes try7-opt3.gromacs0.repack-nonPC best.

Some of the top models are a little off on the first strand, which is
present in try3
SheetConstraint (T0487)G5 (T0487)L11	(T0487)R315 (T0487)V309	hbond (T0487)T7	1

Perhaps I should copy M1-N12 and P306-I318 from try3-opt3.gromacs0

and G131-G149 from MQAU1-opt3.gromacs0.repack-nonPC

Sat Jul 26 13:07:20 PDT 2008 Kevin Karplus

try8-opt3, optimized from chimera-try7-try3-MQAU1, now scores the best
with the try8 costfcn, and try8-opt3.gromacs0.repack-nonPC has the
lowest Rosetta energy.

I like the fixes that were made in try8, and I'd like to add another
strand that has just barely been lost:
SheetConstraint	V9	N12	P583	R580	hbond	R580

I'll do a try9 optimization with this constraint added, starting from
all gromacs-optimized models (like try7)

Sat Jul 26 16:19:32 PDT 2008 Kevin Karplus

try9 increased breaks in order to get the extra sheet constraints.
What it actually did was to optimize try7-opt3 instead of try8-opt3.

I'll try again in try10, starting only from the try8 models and not
the try7 ones, and increasing the break cost a bit more.

Sat Jul 26 19:07:01 PDT 2008 Kevin Karplus

try10 does improve on try8-opt3.gromacs0, but try10-opt3.repack-nonPC
scores best with the try10 costfcn.  Rosetta likes best
try10-opt3.gromacs0.repack-nonPC. 

I'm curious where the I173-P263 domain comes from, so I'll submit it
to VAST.

Your VAST Search job was submitted at 07/26/2008
22:25:23(EDT). Request ID: 703700384036415691

Two long hits:
PDB  C D  Ali. Len.  SCORE  P-VAL    RMSD  %Id   Description
1R4K A    81         9.9    10e-4.9  1.7   8.6   Solution Structure Of The Drosophila Argonaute 1 Paz Domain
1YVU A 4  53         7.9    0.0025   1.9   11.3  Crystal Structure Of A. Aeolicus Argonaute

Sat Jul 26 19:33:15 PDT 2008 Kevin Karplus

I think I've reached the point where further optimization is not going
to improve things.    I'll submit

ReadConformPDB T0487.try10-opt3.gromacs0.repack-nonPC.pdb	# < try8-opt3.gromacs0 < chimera-try7-try3-MQAU1
ReadConformPDB T0487.try9-opt3.pdb	# < try7-opt3.gromacs0 < try6-opt3.gromacs0 < chimera-try5-try13
ReadConformPDB T0487.try5-opt3.gromacs0.pdb	# < try4-opt3.gromacs0.repack-nonPC <  chimera-N2-MQAU1-C2
ReadConformPDB T0487.MQAU1-opt3.gromacs0.repack-nonPC.pdb	# < GS-KudlatyPred_TS2 
ReadConformPDB T0487.try3-opt3.gromacs0.pdb # < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)

Sat Jul 26 19:46:21 PDT 2008 Kevin Karplus

Submitting with comment

    T0487 had obvious homology to 1yvuA, 1u04A, and for the separate
    domains to 1w9hA, 1r4kA, 1si2A, 1vynA, 1t2rA, and 1r6zA.

    I had the most trouble with the subdomain I173-P263, which did not
    appear to have homologs.   The model I ended up with appears to come
    primarily from 1r4kA, though it is also similar to 1yvuA.

    With such a large model, the normal-length optimization runs did not
    get as much gap closure and clash removal as they would have on
    shorter proteins.  I tried doing separate domains a little bit
    (most successfully with N12-P306 and G302-V685), and pasting the pieces
    back together.  Some additional cut-and-paste was needed to get the
    initial strand properly in place.

    Although my MQA runs like Zhang-Server_TS3 best, my initial
    meta-server run ended up picking GS-KudlatyPred_TS2 as its primary
    template.  I submit this meta-server run as model 4, and included bits
    and pieces of it when gluing together some of my subdomain predictions.

    The final model is not highly polished, but further optimization is
    not likely to make huge improvements, and I'm pretty burned out by the
    end of CASP season.

    Model
    1    	T0487.try10-opt3.gromacs0.repack-nonPC.pdb	# < try8-opt3.gromacs0 < chimera-try7-try3-MQAU1
	    chimera-try7-try3-MQAU1:
		    mostly T0487.try7-opt3.pdb
		    M1-N12 and P306-I318 from try3-opt3.gromacs0
		    G131-G149 from MQAU1-opt3.gromacs0.repack-nonPC
	    try7-opt3 < try6-opt3.gromacs0 < chimera-try5-try13
	    try3-opt3 < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)

    2    	T0487.try9-opt3.pdb	# < try7-opt3.gromacs0 < try6-opt3.gromacs0 < chimera-try5-try13
	    chimera-try5-try13:
		    mostly T0487.try5-opt3.gromacs0.pdb
		    L174-L261 from I173-P263/try13-opt3 < I173-P263/try11-opt3.gromacs0 < try3-opt3
	    try5-opt3 < try4-opt3.gromacs0.repack-nonPC <  chimera-N2-MQAU1-C2
	    try3-opt3 < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)

    3    	T0487.try5-opt3.gromacs0.pdb	# < try4-opt3.gromacs0.repack-nonPC <  chimera-N2-MQAU1-C2
	    chimera-N2-MQAU1-C2:
		    M1-L121         MQAU1-opt3.gromacs0.repack-nonPC	< GS-KudlatyPred_TS2 
		    R122-T201       N12-P306/try2-opt3.gromacs0     < N12-P306/try1-opt3.repack-nonPC < align(1u04A?)
		    W202-L465       MQAU1-opt3.gromacs0.repack-nonPC	< GS-KudlatyPred_TS2 
		    S466-R574       G302-V685/try2-opt3.gromacs0    < G302-V685/try1-opt3 < align(1w9hA)
		    K575-P583       MQAU1-opt3.gromacs0.repack-nonPC	< GS-KudlatyPred_TS2 
		    V584-D660       G302-V685/try2-opt3.gromacs0    < G302-V685/try1-opt3 < align(1w9hA)
		    R661-V685       MQAU1-opt3.gromacs0.repack-nonPC	< GS-KudlatyPred_TS2 

    4    	T0487.MQAU1-opt3.gromacs0.repack-nonPC.pdb	# < GS-KudlatyPred_TS2 

    5    	T0487.try3-opt3.gromacs0.pdb # < try2-opt3.gromacs0 < try1-opt3.gromacs0.repack-nonPC < align(1u04A+1si2A+1zxxA)