Mon May  5 11:07:00 PDT 2008
T0387
Make started Mon May  5 11:07:33 PDT 2008
Running on cheep.cse.ucsc.edu
Make started Mon May  5 11:14:15 PDT 2008
Running on cheep.cse.ucsc.edu
Make started Mon May  5 11:20:07 PDT 2008
Running on cheep.cse.ucsc.edu

Mon May  5 11:21:26 PDT 2008 Kevin Karplus

The first two runs failed, because the CASP organizers hid the
sequences behind a database interface, and I had to manually cut and
paste the sequences to get them into the a2m file.

I sent them a query about their having set up a harder-to-use
interface than at CASP7, but I may be forced to do manual cut and
paste all summer, since they don't even seem to have a way to retrieve
sequences given the target number!

Despite their claims that human predictions would primarily be for
low-homology targets, this first target has an almost perfect match in
the database (2eejA has 84 identical residues and the target has only
91).  This is clearly a "how well can you polish this" problem.
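
As a quick check on how close the template is, the numbers above (84
identical residues out of a 91-residue target) can be turned into a
percent identity.  This is only a sketch; the function name is made up
for illustration.

```python
def percent_identity(identical, target_length):
    """Percent of target residues that are identical in the template."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return 100.0 * identical / target_length

# Numbers taken from the note above: 84 identical residues, 91 in the target.
print("%.1f%% identity" % percent_identity(84, 91))
```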

The target is a tetramer at pH=6.2 (which may affect His protonation).

Ah, 2eejA is an NMR structure, so this is an NMR-to-crystal conversion problem.
There are floppy ends on the NMR structure of 2eejA (before P8 and after C87).
The two CYS in 2eejA do not form a disulfide, nor do they cluster with
the sole HIS, so there seems to be no reason to use maybe_metal or
maybe_ssbond with this target.

We may want to use as templates other models (not just the first one in
the 2eejA file), to sample the floppy ends better.

Looking up the top hits in the T0387.pdb_blast.txt file at
http://www.ebi.ac.uk/msd-srv/pqs/ I am seeing mainly monomeric
proteins (with a few "no hits", which could mean NMR models).
So the tricky thing may be to figure out the tetramer.  Is it a pair
of dimers? a ring?

Mon May  5 11:44:43 PDT 2008 Kevin Karplus

The t06 alignment found over 9000 similar sequences in nr.  This
appears to be a PDZ domain from PDZ-domain-containing protein 1, as is
2eejA.  The highly conserved residues of the domain are 
	G17		G
	LIVF20		L
	VILA31		I
	LIVM45		L
	DES49		D
	ILV51		I
	LIMVF71		I
	LIMFV80		L
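
A small sanity check on the list above can be sketched in Python.  It
assumes (my reading of the columns, not something stated in the
output) that the first column is the set of residues seen at a
conserved position followed by the position number, and the second
column is the residue the target has there.  The helper name is
hypothetical.

```python
import re

# The conserved-position list from the t06 alignment, copied from above.
CONSERVED = """
G17 G
LIVF20 L
VILA31 I
LIVM45 L
DES49 D
ILV51 I
LIMVF71 I
LIMFV80 L
"""

def parse_conserved(text):
    """Yield (position, allowed residue set, target residue) per line."""
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        match = re.fullmatch(r"([A-Z]+)(\d+)", fields[0])
        if match is None:
            continue
        yield int(match.group(2)), set(match.group(1)), fields[1]

rows = list(parse_conserved(CONSERVED))
# Positions where the target residue is outside the conserved set.
mismatches = [(pos, res) for pos, allowed, res in rows if res not in allowed]
print(len(rows), "positions,", len(mismatches), "mismatches")
```

Under that reading, the target carries an allowed residue at every one
of the listed positions.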


The closest hit in SCOP seems to be 1i92A, the first PDZ domain of
Na+/H+ exchanger regulatory factor, NHERF.

Pfam has this to say about PDZ domains:

    Literature references

       1. Doyle DA, Lee A, Lewis J, Kim E, Sheng M, MacKinnon R; ,
          Cell. 1996;85:1067-1076.: Crystal structures of a complexed
          and peptide-free membrane protein-binding domain: molecular
          basis of peptide recognition by PDZ. PUBMED:8674113
    
       2. Ponting CP, Phillips C, Davies KE, Blake DJ , Bioessays
          1997;19:469-479.: PDZ domains: targeting signalling
          molecules to sub-membranous sites. PUBMED:9204764
    
       3. Ponting CP; , Protein Sci 1997;6:464-468.: Evidence for PDZ
          domains in bacteria, yeast, and plants. PUBMED:9041651

    Interpro entry IPR001478

    PDZ domains are found in diverse signaling proteins in bacteria,
    yeasts, plants, insects and vertebrates PUBMED:9041651,
    PUBMED:9204764. PDZ domains can occur in one or multiple copies
    and are nearly always found in cytoplasmic proteins. They bind
    either the carboxyl-terminal sequences of proteins or internal
    peptide sequences PUBMED:9204764. In most cases, interaction
    between a PDZ domain and its target is constitutive, with a
    binding affinity of 1 to 10 µM. However, agonist-dependent
    activation of cell surface receptors is sometimes required to
    promote interaction with a PDZ protein. PDZ domain proteins are
    frequently associated with the plasma membrane, a compartment
    where high concentrations of phosphatidylinositol 4,5-bisphosphate
    (PIP2) are found. Direct interaction between PIP2 and a subset of
    class II PDZ domains (syntenin, CASK, Tiam-1) has been
    demonstrated.

    PDZ domains consist of 80 to 90 amino acids comprising six
    beta-strands (betaA to betaF) and two alpha-helices, A and B,
    compactly arranged in a globular structure. Peptide binding of the
    ligand takes place in an elongated surface groove as an
    antiparallel beta-strand interacts with the betaB strand and the B
    helix. The structure of PDZ domains allows binding to a free
    carboxylate group at the end of a peptide through a
    carboxylate-binding loop between the betaA and betaB strands.

So one question that immediately springs to mind is whether the PDZ
domains are forming the tetramer by binding the C-terminal ends of the
other monomers.

PQS does have some dimeric and tetrameric proteins that come up when
searching for PDZ:
1obx	and 1oby	tetrameric
	(actually, they are dimeric, with separate peptides bound)


pdb	num	SpGrp	delta	num	num	num	num	num	num	percent	delta	type
id	biol	name	ASA	S-S 	SaltB	buried	chain	resid	hetatm	ASA	sole	 
2g2l_1	2 	P	2988.5	0	0	0	2	94	0	50.1	78.7	DIMERIC
1g9o_0	1 	P3221	1205.4	0	0	0	2	182	0	19.6	-0.4	DIMERIC
2i04_2	2 	P	646.4	0	0	0	2	92	5	17.9	-4.0	DIMERIC
2i04_1	2 	P	612.4	0	1	0	2	90	5	17.3	-4.2	DIMERIC
1q3p_2	2 	P41212	524.8	0	2	0	2	109	0	13.9	0.6	DIMERIC
1q3o_0	1 	P1211	510.6	0	0	0	2	208	5	8.0	-8.0	DIMERIC
2i0i_3	3 	C	507.9	0	0	0	2	87	0	14.5	-3.7	DIMERIC
1q3p_1	2 	P41212	504.8	0	2	0	2	104	0	14.3	-0.5	DIMERIC
2i0i_2	3 	C	487.0	0	0	0	2	87	0	14.2	-3.9	DIMERIC
2i0i_1	3 	C	424.2	0	0	0	2	87	0	12.7	-3.2	DIMERIC
1ihj_2	2 	P1	332.6	0	0	0	2	99	0	10.3	-4.9	DIMERIC
2i0l_1	2 	C	313.2	0	0	0	2	166	0	5.1	1.2	DIMERIC
1ihj_1	2 	P1	308.2	0	0	0	2	100	0	9.5	-5.8	DIMERIC
2i0l_2	2 	C	15.5	0	0	0	2	11	0	1.3	-1.1	DIMERIC
2g2l_2	2 	P	0.0	0	0	0	2	0	0	0.0	0.0	DIMERIC


The 1g9o dimer does have the C-termini bound by the other domain, not
making the full strand of the sheet that PDZ-bound peptides
usually form, though the carboxyl terminus does form H-bonds to
2 backbone N atoms in the binding pocket.
2i0i has 3 monomers, reported as dimers by PQS because of the bound peptides.

Mon May  5 12:41:02 PDT 2008 Kevin Karplus

The t2k iterated search finds a quite different signal than the t04
and t06 searches, though it is based on just over 6000 sequences:

	VIL31		I
	VILPA34		V
	AGS40		A
	LIVM45		L
	DE49		D
	ILV51		I
	VILA54		V
	NDG55		N

Even this model has drifted a bit from the target, with 1g9oA scoring
best, and 2eejA in 13th place.

All the top hits are, of course, the PDZ domain family, so are
structurally quite close.

Since 2eejA is not in the template library, we may need to add it
manually to the set of alignments to consider.

Mon May  5 14:01:14 PDT 2008 Kevin Karplus

try1-opt3 folds the C-terminal tail back into the protein, which is
most likely wrong---we probably have the crystal formed by a
domain-swapped dimer.

I'm adding a "MANUAL_TOP_HITS" definition to the Makefile, so that I
can make alignments to all the top hits:
MANUAL_TOP_HITS:= 2eejA 2ocsA 1i92A 1gq4A 1g9oA 1gq5A

For try2, I'll do another monomer optimization, but after that I'll
have to figure out how to do dimer optimization.


Mon May  5 18:56:21 PDT 2008 Kevin Karplus

I set up a bunch of stuff in the Make.main file for creating dimers a
little more easily than in casp7, and have started a dimer run in dimer/
I added 
    ConstraintSet dimer_pair
    Hbond V91.O 	G108.N
    Hbond V182.O	G17.N
to the costfcn to try to get the C-termini into the proper binding
pocket in the other monomer.  I'm pretty sure that G17 has one of the
two N atoms that Hbond to the carboxyl terminus.
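
The residue numbers in these constraints appear to follow a
concatenated numbering for the dimer, with the second copy continuing
after the 91 residues of the first (so G108 would be G17 of the second
copy, and V182 would be V91 of the second copy).  A minimal Python
sketch of that mapping, assuming consecutive numbering of the copies;
the function names are made up for illustration and are not part of
undertaker.

```python
MONOMER_LENGTH = 91  # target length, from the notes above

def multimer_index(chain, residue, length=MONOMER_LENGTH):
    """Map (0-based chain, 1-based monomer residue) to concatenated numbering."""
    if not 1 <= residue <= length:
        raise ValueError("residue number outside the monomer")
    return chain * length + residue

def monomer_index(number, length=MONOMER_LENGTH):
    """Inverse mapping: concatenated number -> (0-based chain, monomer residue)."""
    chain, offset = divmod(number - 1, length)
    return chain, offset + 1

# The two constraints pair the last residue of one copy with G17 of the other.
print(multimer_index(0, 91), multimer_index(1, 17))   # V91 of copy 0, G17 of copy 1
print(multimer_index(1, 91), multimer_index(0, 17))   # V91 of copy 1, G17 of copy 0
```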

Mon May  5 20:50:38 PDT 2008 Kevin Karplus

I was just checking what the servers did:
	SAM_T02: alignments to 1g9oA, 2ozfA, 2he4A, 2ocsA, 1tp5A
	SAM_T06: undertaker crashed, so just models from alignments to
		1g9oA, 2ozfA, 2he4A, 2he2A
	SAM_T08: submitted try1-opt3, try1-init, and models from
		alignments to 1g9oA, 2ozfA, 1tp5A

Tue May  6 10:17:02 PDT 2008 Kevin Karplus

I think that the try2 run did not include the 2eejA alignments, so I'm
trying again with try3.  I also added two constraints on the final
residue, to make the tail stick out more or less where the tail does
on 1g9oA, so that dimerization might work better.  I'll use the try3
costfcn for the initial selection (with clashes scaled down), instead
of MQA_init.costfcn.

I might want to see if I can add an OXT atom to the end of the chain,
so that I can have both Hbonds in the dimer, but that won't be in try3.

Tue May  6 10:27:04 PDT 2008 Kevin Karplus

The dimer in dimer/decoys/T0387.try2-opt3.pdb.gz has some bad breaks,
but the second copy of the dimer looks pretty good.  I might want to
take those two monomers and put them into a polishing run for the
monomers, before building another dimer.

Tue May  6 12:59:13 PDT 2008 Kevin Karplus

The try3-opt3 monomer scores a bit better than the monomers from the dimer:
decoys/dimer-try2-opt2-A.pdb.gz and decoys/dimer-try2-opt2-B.pdb.gz

For try4, I'll run the same script as try3, but I'll try starting from
a blank pdb file that has an OXT atom on the end, to see if that will
give me monomers with OXT.  If it does, I'll make the dimer from there.

Tue May  6 13:14:42 PDT 2008 Kevin Karplus

That did not work---I got an assertion failure trying to read in the file:

# ReadTargetPDB reading from PDB file T0387.plusOXT.pdb looking for model 1
undertaker: Conformation.cc:399: virtual void Conformation::append_fragment(int, const ChainsResiduesAndAtoms*, int, int): Assertion `Master->atom(splice2_N_atom-1).no_wc_match( PDBAtomAlphabet->to_base("C"))' failed.
Warning: all-zero PDB file read in ReadTargetPDB, so making up random conformation

Maybe I should just make a dimer of the try3-opt3 monomer and not
worry about OXT for now.

Tue May  6 15:42:21 PDT 2008 Kevin Karplus

After a little less than an hour, the dimer/try3 run failed with an
assertion failure.  I think I've fixed the bug (trying to close the
KnownBreak), and am trying again.

The dimer/try3-opt2 file (now
dimer/decoys/T0387.try3-opt2-run1.pdb.gz) looks fairly good---the
C-terminus does neatly fit into the binding pocket of the other
monomer.

Tue May  6 17:58:55 PDT 2008 Kevin Karplus

The dimer/try3-opt3 file looks pretty good.  Perhaps I should do a
polishing run to try to pack things a little tighter, then split up
the best dimers into monomers.

Tue May  6 19:47:11 PDT 2008 Kevin Karplus
The dimer/try4-opt3 run looks pretty good, though the clashes are
higher than I would like.

For try4, I'll correctly run the OptConform with "multimer 2" (which I
had forgotten about) and make T0387.mult4 instead of T0387.try4, so
that gromacs will be run correctly on the unpacked dimers.

Tue May  6 19:52:11 PDT 2008 Kevin Karplus

Oops, by starting T0387.mult1 in dimer, I accidentally stepped on
dimer/try1.costfcn.
I killed the run before any further harm was done, and mult2, mult3,
and mult4 ran without trouble.

Tue May  6 20:14:37 PDT 2008 Kevin Karplus

The mult5 run failed with an assertion failure in undertaker after
quite a while.  I'll have to set the seed and rerun it under the debugger.

Tue May  6 20:59:36 PDT 2008 Kevin Karplus

Even with the seed set it didn't crash under the debugger.
I hate intermittent faults!!

I think I'll just make some minor mods to the dimer/try5.under file
and rerun, hoping not to crash.

Tue May  6 21:26:14 PDT 2008 Kevin Karplus

Without the debugger, it crashed somewhat later at a different assertion.
This is getting irritating!

I don't think that it was even doing much to improve the dimer.

Wed May  7 08:43:25 PDT 2008 Kevin Karplus

I wonder if I should try an additional model: a cyclic tetramer.  It
seems less likely than a dimer, but the packing of the dimer with a
lot of negative charges clustered in the dimer interface seems
unlikely. The clashes between the D47 residues are particularly bad.

Opening it up to a cyclic tetramer might relieve the clashes.   I
don't have a tetramer to work from, and I wonder if I can create one
with just undertaker.  If not, I'll have to try using Proteinshop.

Wed May  7 09:19:47 PDT 2008 Kevin Karplus

I tried making a stupid tetramer (putting two copies of the dimer in
exactly the same place), to see if undertaker could turn it into a
cyclic tetramer, using TweakMultimer.  (Maybe OptSubtree would be better?)

Wed May  7 09:32:17 PDT 2008 Kevin Karplus

That didn't work: the duplicated atoms were marked as missing, and
undertaker can't optimize an incomplete conformation, so it crashed
for having no conformations!

Maybe I should use InsertAlignment and start from a random conformation.

Wed May  7 12:00:55 PDT 2008 Kevin Karplus

The tetramer/try1 run does seem to form a tetramer that has the right
conformation for the core and satisfies the constraints I gave it for
the C-terminal docking, but I might want to add more constraints, to
try to get the C-terminus to approach the normal PDZ binding as an
extra strand of the sheet.  There does not seem to be the
buried-charge problem of the dimer.

Wed May  7 12:29:59 PDT 2008 Kevin Karplus

tetramer/try2 will attempt to form a tetramer with more normal binding
of the C-terminus into the binding pocket.


Wed May  7 13:55:34 PDT 2008 Kevin Karplus

tetramer/try2 again does a decent job of getting the constraints I specified,
but not quite with a believable tetrameric structure.
Maybe I should increase the strand constraints for a full 6 C-terminal residues.

Having more conformations in the initial multimerization pass might
also help.

Wed May  7 18:39:57 PDT 2008 Kevin Karplus

tetramer/try3 does a little better, but the multimers are too spread out and
there is a bend around Q88.  If we could make the tail straighter, we
could probably get a tighter packing.   Maybe I should add some
constraints between R115 and E46, D47, or E48, to try to pull things
together. 

Thu May  8 08:59:08 PDT 2008 Kevin Karplus

tetramer/try4 is a complete mess, with bad conflicts between the monomers.
The E46-E48 contacts with R115 are made, but the originally desired
ones between V91 and F109 are not.

This is where I'd really like a manipulable model, so that I could put
the monomers roughly where I think they ought to go, and tweak the
C-terminal strand to fit.

Sat May 10 08:20:29 PDT 2008 Kevin Karplus

I picked up the server tarball, and scored everything with
the MQA_init costfcn.  The dimer/try5-opt1 model scores best, followed
by try1-opt3.
The best-scoring external one is HHpred5_TS1 (not surprising for a
close homology model).  I should try building dimers and tetramers off
of that model.


Mon May 12 10:25:20 PDT 2008 Kevin Karplus

I made some changes to the Make.main file, so that we can make
undertaker scripts to read the top 10 models from the MQA
evaluations.  I've started a run from the top 10 MQAC models, using
the try3 costfcn.

Mon May 12 11:59:50 PDT 2008 Kevin Karplus

The MQAC-try1 run seems to have favored Pcons_multi_TS3, which
actually comes out quite close to try1-opt3 and to Zhang-Server_T3
and our first alignment (to 1g9oA).

All the core residues are essentially in the same places in all these
models---even the sidechains superimpose very well.

Mon May 12 12:04:49 PDT 2008 Kevin Karplus

I'll do another run from the top 10 MQAU-chosen models, again with
try3 as the costfcn.  I've modified the metaserve-MQAx.under scripts
to skip the TryAllAlign stuff at the beginning---these are highly
polished models already, and adding alignments is not likely to help.

After that, I should try polishing from the full set and making a
dimer again.  Making the right tetramer is probably more important,
and more difficult.


Mon May 12 12:30:40 PDT 2008 Kevin Karplus

The MQAU run seems to favor 3D-JIGSAW_AEP_TS1


Mon May 12 12:46:48 PDT 2008  John Archie

The QA files have been submitted.

Mon May 12 12:50:50 PDT 2008 Kevin Karplus

The core of all our predictions (and a lot of the server predictions)
is the same, superimposed to half an Angstrom or less.
The C-terminal tail is what varies the most, and that is almost
certainly determined by the multimerization.

The tail for MQAU1 is different from other models, even different from
the 3D-JIGSAW_AEP_TS1 that the run favored at the beginning.  It makes
some Hbonds at the end, and looks like a reasonable monomeric
solution, but I don't think it will multimerize well.

I have to figure out how to get a good tetramer still.


Fri May 23 14:42:44 PDT 2008 Kevin Karplus

We don't seem to have a working version of ProteinShop, and Baker has
not released a version of FoldIt that we can use, so I'll have to get
the effect I want with undertaker, which may be difficult.

I could try breaking off the C-terminal peptide and docking it with
undertaker, but how do I then convert that into the tetramer?

Fri May 23 15:00:04 PDT 2008 Kevin Karplus

For try5, I broke off the C-terminal end, and am trying to dock it
into the normal PDZ binding site.  If that works, I'll try to find a
way to make a tetramer out of the two pieces, perhaps by superimposing
one contiguous model on the first 83 residues and another model on the
remaining residues, then symmetrizing.

Fri May 23 15:53:41 PDT 2008 Kevin Karplus

RATS! try5 did not put the C-terminal peptide where I wanted.  It
seems that it moved the gap away from before G84 by inserting
fragments, so that OptSegment never got a chance to fix it.

I'll try again with ONLY the segment operations for the
opt1 part (try6).

Fri May 23 17:19:24 PDT 2008 Kevin Karplus

Nope---that doesn't do it.  It looks like the Opt operations put it
back together---they must not check KnownBreak.

Fri May 23 18:16:11 PDT 2008 Kevin Karplus

I tried fixing just the OptSubtree and OptSegment operators, and it
didn't help either.

So I made sure that only constraints from the costfcn were
included in all the operations that chose constraints, and removed all
constraints except the final_tail constraints from the initial run for try8.


Fri May 23 21:01:43 PDT 2008 Kevin Karplus

try8 managed to make one of the Hbonds, but never placed the strand
correctly.  I don't know if this is a bug in OptSegment, or just a bad
costfcn (perhaps with too high a clash penalty).  For try9, I'm trying
again with lower clash penalty and with "ReportCost try9.rdb" so that
I can see what costs are being generated.

Fri May 23 21:21:22 PDT 2008 Kevin Karplus

Oops---ReportCost only affects the CostConform commands, not the
OptConform commands.  I need to add report_all_costs to the OptConform
command as well!

Sat May 24 05:38:53 PDT 2008 Kevin Karplus

On try9, looking at how the final_tail cost compares to other costs
(in gnuplot, using commands like 

	plot '< smooth-rdb -name1 final_tail -name2 soft_clashes < try9.rdb' with lines 
)

I can see that there were a few models built in which the
final_tail values got down to low costs, but that they had bad clashes
and bad dry5, dry6.5, and dry8 values (some other costs also got bad).
The combined effect was that an improvement of 0.5 in final_tail
incurred a total cost increase of about 100, so we'd need to add about
200 to the weight of final_tail to save these models. 

Let me try that for try10.
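
The weight estimate above is a break-even calculation: the added
weight times the improvement in final_tail has to cover the increase
in the rest of the cost.  A sketch in Python, using the rough numbers
read off the plot (the function name is only for illustration):

```python
def extra_weight_needed(cost_increase, constraint_improvement):
    """Smallest added weight w with w * improvement >= cost increase."""
    if constraint_improvement <= 0:
        raise ValueError("improvement must be positive")
    return cost_increase / constraint_improvement

# About 100 units of extra total cost for a 0.5 improvement in final_tail.
print(extra_weight_needed(100.0, 0.5))  # 200.0
```

This is only the break-even point; anything above it makes the models
with good final_tail values win, at the price of the clashes noted
above.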

Sat May 24 06:21:51 PDT 2008 Kevin Karplus

It looks like (with sufficient weight) we can force the final_tail
constraints, but with terrible clashes and loss of H-bonds.
So the problem is not an algorithmic one, but just that getting a good
fit is difficult.

Sat May 24 10:15:28 PDT 2008 Kevin Karplus

The bent C-terminal tail in try10 does *not* form the desired sheet,
though it does make the V91.O Hbonds.  Instead it sticks into the main
domain on the wrong side of the strand and disrupts everything.
Perhaps what I need to do is to chop the tail up more so that it can
be reassembled more readily, and allow fragment insertions earlier.

I could probably also reduce the weight of final_tail a bit, so that
it is not quite so insistent on it at the expense of everything else,
and reduce the V91.O hbonds relative to the strand.  Adding a
constraint that V91-K86 is around 15.9 Angstroms CA-CA might help also.
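
To see how far a given model is from that CA-CA distance, the distance
can be measured directly from the PDB file.  This is a minimal sketch
outside undertaker, assuming standard fixed-column ATOM records and a
single chain; the function names are hypothetical and the demo
coordinates are synthetic, not taken from any T0387 model.

```python
import math

def ca_coords(pdb_text):
    """Return {residue number: (x, y, z)} for CA atoms of a single chain."""
    coords = {}
    for line in pdb_text.splitlines():
        # PDB ATOM records: atom name in columns 13-16, residue number in
        # columns 23-26, and x, y, z in columns 31-54 (8 characters each).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            resnum = int(line[22:26])
            coords[resnum] = (float(line[30:38]),
                              float(line[38:46]),
                              float(line[46:54]))
    return coords

def ca_distance(coords, i, j):
    """Distance in Angstroms between the CA atoms of residues i and j."""
    return math.dist(coords[i], coords[j])

def fake_ca_line(resnum, x, y, z):
    """Build a synthetic CA record (the residue name is a placeholder)."""
    return "ATOM  %5d  CA  ALA A%4d    %8.3f%8.3f%8.3f" % (resnum, resnum, x, y, z)

# Synthetic two-atom example placed exactly at the desired separation.
demo = "\n".join([fake_ca_line(86, 0.0, 0.0, 0.0),
                  fake_ca_line(91, 15.9, 0.0, 0.0)])
coords = ca_coords(demo)
TARGET = 15.9  # desired K86-V91 CA-CA distance from the note above
print("K86-V91 CA-CA: %.1f (target %.1f)" % (ca_distance(coords, 86, 91), TARGET))
```

For a real decoy, the gzipped file would need to be uncompressed and
read into a string first, and a multimer would need the chains kept
separate, since this sketch keys only on the residue number.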

Sat May 24 12:26:38 PDT 2008 Kevin Karplus

try11-opt3 is a little closer to what I want, but still pretty messed up.
undertaker does not seem to be a good tool for this sort of docking!
Part of the problem THIS time is that I allowed fragment and alignment
insertion early, so that the known break at G84 went away, and
undertaker was then trying to keep the break before A87 small.

Let me try once more, with the fragment and alignment insertion
operators initially turned off.

Sat May 24 17:33:09 PDT 2008 Kevin Karplus

Nope, try12 has the same trouble.  I think I'll probably give up on
this approach.  The question now is whether I can get a tetramer, or
if I should just report the best monomers using the try4 costfcn,
which I'll recompute after changing the costfcn-init.under script to
use stricter clash definitions.


 T0387.MQAU1-opt3.pdb        -30.0 -14.0 -1.9 -1.5 -37.4 -15.0 -2.4 -3.5 -4.7  8.4 48.2 46.9 132.8 141.2 -2.3 -55.3  2.3 0.9   7.3  0.1    0.0 -5.2 -3.8 -4.3 -5.7  201.03
 T0387.try3-opt3.pdb.gz      -30.0 -14.8 -2.0 -1.7 -37.5 -14.0 -2.5 -3.4 -4.5  8.0 48.7 49.0 138.0 145.8 -2.2 -55.5  3.0 0.6   6.2  0.3    1.0 -4.9 -3.9 -4.5 -5.9  213.39
 T0387.MQAC1-opt3.pdb        -29.9 -15.0 -1.8 -1.5 -38.1 -14.8 -2.6 -3.4 -4.5  8.2 49.6 50.0 139.0 147.2 -2.1 -55.3  2.7 1.1   4.4  0.3    0.7 -5.0 -3.7 -4.5 -5.4  215.43
 T0387.try8-opt1.pdb.gz      -19.0 -14.0 -1.8 -1.0 -36.4 -14.7 -2.6 -3.0 -4.5  8.1 50.1 49.6 137.1 146.2 -2.0 -54.4  2.7 1.3   6.6  0.4    7.7 -5.0 -3.7 -4.4 -5.7  237.43


Sat May 24 20:16:57 PDT 2008 Kevin Karplus

I gave up on this model and submitted

Model
1	T0387.MQAU1-opt3.pdb	a metaserver model
2	T0387.try3-opt3.pdb	a native SAM/undertaker model
3	T0387.MQAC1-opt3.pdb	a metaserver model (from consensus scoring)
4	alignment T0387-1g9oA-t06-local-str2+near-backbone-11-0.8+0.6+0.8-adpstyle5.a2m
5	alignment T0387-2ozfA-t06-local-str2+near-backbone-11-0.8+0.6+0.8-adpstyle5.a2m


Sun Jun  8 15:52:03 PDT 2008 Kevin Karplus

I just noticed today that the KnownBreak commands in the
dimer/costfcn-init.under and tetramer/costfcn-init.under files were
wrong---they had bare numbers, which are interpreted as atom numbers
rather than as residue numbers.  (I should probably fix this in undertaker!)
Make started Thu Feb 12 13:06:54 PST 2009
Running on peep.cse.ucsc.edu