Mon Jul 12 10:45:51 PDT 2004
T0235

Due 12 Aug

Mon Jul 12 18:34:00 PDT 2004	Kevin Karplus

Two domains? 
fold-recognition hit to d.15.1.1 for 1-90.
comparative model to 1nb8A for 104-484.

It probably won't be necessary to break this up for modeling, since
the two domains each have pretty good alignments.


From learithe@soe.ucsc.edu  Sat Jul 31 15:28:49 2004
Date: Sat, 31 Jul 2004 15:28:48 -0700 (PDT)
From: Jenny Draper <learithe@soe.ucsc.edu>
To: Kevin Karplus <karplus@soe.ucsc.edu>
cc: learithe@soe.ucsc.edu
Subject: Re: T0235

All I've done with T0235 so far is some literature searches, which
I'm still reading through before putting the highlights in the
readme. It's the ubiquitin-removal protease in the 26S proteasome
of yeast! There's some info about the active site & cysteines as
the catalytic residues...

-Jenny


====================================================================
On Sat, 31 Jul 2004, Kevin Karplus wrote:

> According to the status file, you have T0235, Jenny, but I see no
> comments since mine on July 12 for try1.
>
> What is happening with it?
============================================================


Wed  Aug 4   1:00pm                  Jenny Draper

OK, I'm attempting to sum up my T0235 research here.

T0235, aka "UBP6" in the literature, is officially a "Ub-specific
processing protease (UBP)", and NOT a "Ubiquitin C-terminal hydrolase
(UCH)" as it is titled in the CASP6 description (isn't biology
nomenclature FUN?). The only crystal structure known for UBPs is
1nb8/1nbf, for the catalytic domain(s) of the very large protein
HAUSP. 1nbf is HAUSP bound to ubiquitin (Ub), and 1nb8 is HAUSP by
itself, both solved and released in the same paper. Apparently the
catalytic site is split by about 9 Angstroms w/o Ub, but it comes
together upon binding to Ub, while most of the rest of the structure
remains unchanged (luckily ;).

We know that the N-terminal 80-ish residues form a ubiquitin-like
fold, which binds to the 26S proteasome in yeast. This domain is not
essential for UBP6 (T0235) functioning, but _is_ essential for the
proteasome!

The catalytic residues are primarily a Cysteine (C118 on our beast),
a Histidine (H447), and an Aspartic Acid (D219). They come together
at the junction between the "Palm" and "Thumb" regions of the 
structure. (The authors describe the HAUSP structure as a "Hand",
with Finger, Palm, and Thumb regions; Ub nestles in the palm
region, against the base of the finger and thumb regions). 

This is an extended structure, with a large open cleft for 
ubiquitin to sit in; we should avoid folding up the cleft.

The Big Question for this structure is: how to pack the Ub-domain
against the HAUSP domain?

Selected References:
-------------------------------------------------------------------
Hu M, Li P, Li M, Li W, Yao T, Wu JW, Gu W, Cohen RE, Shi Y.
Crystal structure of a UBP-family deubiquitinating enzyme in 
isolation and in complex with ubiquitin aldehyde.
Cell. 2002 Dec 27;111(7):1041-54. 

Wyndham AM, Baker RT, Chelvanayagam G.
The Ubp6 family of deubiquitinating enzymes contains a 
ubiquitin-like domain: SUb.
Protein Sci. 1999 Jun;8(6):1268-75.

Kim JH, Park KC, Chung SS, Bang O, Chung CH. 
Deubiquitinating enzymes as cellular regulators.
J Biochem (Tokyo). 2003 Jul;134(1):9-18. Review. 
-------------------------------------------------------------------


Wed  Aug 4   3:00pm                  Jenny Draper

Domain1, residues 1-80, looks good. It's an almost absolutely 
perfect copy of the structure of ubiquitin (PDB id 1ubq); all it 
needs is for residues 60-64 to be a _little_ more helical. 
Secondary structure predictions for this domain are very weak. 
The structure doesn't match the str2 script very well, but it 
matches stride pretty well. Its 5th, central strand is not
predicted to be strand by stride, but forms a nice strand anyway.
I'd say we're done with this part, as we know from the literature
that this is a ubiquitin-like domain.

 
Our catalytic pocket looks good in the HAUSP domain -- it's got
all the right residue types in the right positions, with the 
same approximate distances as in 1nb8 (w/o Ub). (The H and C only
come close together upon Ub binding).


Superposition of try1-domain2 with HAUSP (1nb8A) shows that
1. The fingertip region between 281-296 needs fixing. I think
   undertaker is trying too hard to make 282-290 helical.
2. The region 352-427 is treated as an insertion within a loop
   region (~441-445 in 1nb8). This places T0235's His-box (~430-450)
   in the right place for the active site. Now... how to pack this
   helical region?
3. We don't have a match to 1nb8's C-terminus (1nb8 res 522-554).
   Perhaps our insertion should follow it... they're in the same
   region, though our insertion is longer.


Aug 4   6:00pm                  Jenny Draper

I'm running a try2 from alignments, using essentially the same 
settings, except I'm including all the alignments to 1nb8A. Also
I've upped the break cost, upped hbond geom, and turned down
predicted secondary-structure and phobic_fit costs.

I'm hoping this will help produce better structures in the
problem regions mentioned above.


Th   Aug 5  12:00pm            Jenny Draper

Try2 doesn't look like it came up with anything better, and the
insertion is still flopping out in space. I'll have to set up
some constraints to hold the basic shape, and then start polishing
the regions that need some help.


Fri Aug 6  4:00pm            Jenny Draper

I really have no idea what direction to take this in... I'm gonna
run a try3 with the sheet & helix constraints from try1, and hope
it can do some loop packing...


Sat Aug  7 13:43:24 PDT 2004 Kevin Karplus

try3 is still running (so it must have too many iterations for such a
large protein).

The rr constraints look pretty good in try2-opt2, except for
F111-Y430 and N116-S259, which suggests a different placement for the
residues up to N116, coming in on the other side of the sheet.

The try3 costfcn favors try1 over try2, and try3 is just a polishing
of try1.  There is no T0235.t04.many.frag.gz file, so I'll create one.


Perhaps the ubiquitin-like domain should be packed where ubiquitin is
in the 1nbf structure?  That is, if we align T0235 to 1nbfA and 1nbfD,
and cut-and-paste the pieces we may get a structure that makes sense.
(Of course, that assumes a monomeric structure---domain swapping could
be happening to get a multimeric structure.)  I'll add 1nbfA and 1nbfD
to the MANUAL_TOP_HITS, so that we can make extra_alignments to get
alignments for them.


Sat  Aug 7  2:10pm            Jenny Draper

I really don't think the ubiquitin-like domain should pack into the
ubiquitin binding site. This protein has to function as a ubiquitin
hydrolase (i.e., it needs that site open) -- the ubiquitin-like
domain should anchor the protein into the proteasome.


Sat Aug  7 14:08:36 PDT 2004 Kevin Karplus

One possibility is that when the protein is alone it binds its
ubiquitin-like domain, but when it is at the proteasome, the
proteasome binds the domain, opening up the binding pocket for
ubiquitin.

It looks like I'll have to add 1nbfA and 1nbfD to the template library
to get any decent alignments.  This might take a while.


Sat  Aug 7  2:15pm            Jenny Draper

True. I had thought of that. I think the protein is functional
in the absence of the proteasome though; I'll check the lit.
It's probably worth having that structure as one or two of
our models. 


Sat  Aug 7  3:15pm            Jenny Draper

Yep, I've found experimental evidence that purified Ubp6 has
ubiquitin-hydrolyzing activity, as does Ubp6 w/o the ubiquitin-
like domain. So if the Ub-like domain does sit in the active
site of Ubp6, it sure is easy to get it out of the way...

Mol Cell. 2002 Sep;10(3):495-507. 
Multiple associated proteins regulate proteasome structure and function.
Leggett DS, Hanna J, Borodovsky A, Crosas B, Schmidt M, Baker RT, 
Walz T, Ploegh H, Finley D.


Sun Aug  8 07:35:24 PDT 2004 Kevin Karplus

I aligned try3-opt2 with 1nbfA and 1nbfD, and took the first 99
residues from the alignment to 1nbfD and the rest from the alignment
with 1nbfA.  The result is in decoys/docked-chimera.pdb
This model scores poorly, because of the bad break at Q99-Q100,
because of clashes, and because of the helix constraint for A96-Q103.
I wonder if those problems can be fixed without undocking the first
domain.  I'll try that for try4.  I added some arbitrary constraints
to hold the docked domain in place, then tweaked the cost function
until the docked-chimera barely scored best.
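
The cut-and-paste step for the docked chimera can be sketched in a
few lines of Python. This is only an illustration, not the script
that was actually used: it assumes both input models are already in
the same coordinate frame (for example, both superimposed onto the
1nbf complex) and that they are plain PDB files with standard ATOM
columns. The names splice_models and fake_atom are made up for the
example, and the coordinates are synthetic.

```python
def residue_number(line):
    """Residue sequence number from columns 23-26 of a PDB ATOM record."""
    return int(line[22:26])


def splice_models(lines_a, lines_b, last_from_a):
    """Residues up to last_from_a come from model A, the rest from model B.

    Assumes the two models already share a coordinate frame; no
    superposition or gap closing is attempted here.
    """
    head = [l for l in lines_a
            if l.startswith("ATOM") and residue_number(l) <= last_from_a]
    tail = [l for l in lines_b
            if l.startswith("ATOM") and residue_number(l) > last_from_a]
    return head + tail


def fake_atom(resseq, x):
    """Build a minimal CA-only ATOM record (synthetic demo data)."""
    return (f"ATOM  {resseq:5d}  CA  ALA A{resseq:4d}    "
            f"{x:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Two toy "models" covering residues 97-102, distinguishable by x.
model_a = [fake_atom(i, 1.0) for i in range(97, 103)]
model_b = [fake_atom(i, 2.0) for i in range(97, 103)]

# First 99 residues from model A, the remainder from model B.
chimera = splice_models(model_a, model_b, 99)
for line in chimera:
    print(line)
```

A splice like this does nothing about the geometry at the junction,
which is why a chimera built this way can show a bad break and
clashes that need later optimization.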


Sun Aug  8 20:02:22 PDT 2004 Kevin Karplus

try4 is STILL running.  We'll have to remember to reduce the number of
iterations for future runs on T0235.

From learithe@soe.ucsc.edu  Mon Aug  9 13:42:36 2004
MIME-Version: 1.0
Date: Mon, 9 Aug 2004 13:42:35 -0700 (PDT)
From: Jenny Draper <learithe@soe.ucsc.edu>
To: Kevin Karplus <karplus@soe.ucsc.edu>
Subject: T0235 inserted domain
In-Reply-To: <200408082132.i78LWZw7000893@cheep.cse.ucsc.edu>


I believe the inserted domain in T0235 is between
Pro347 and Pro246


-Jenny


Mon Aug  9 15:13:14 PDT 2004 Kevin Karplus


That can't be right.  I think Jenny meant P347-P426.

I've set up 347-426 as a subdomain.  I modified the try1.costfcn
before starting, so that the two prolines were constrained to have the
same orientation and spacing as in the try4-opt2 model, which should
make pasting the result back in a bit easier.


Mon Aug  9 17:45:11 PDT 2004 Kevin Karplus

try1 of 347-426 looks ok, but I'm doing another run to see if I can
make it a bit more compact.  Then Jenny should create a chimera by
pasting it into try4-opt2, and reoptimize with a cost function that
has constraints turned way down.  

We probably can't afford to turn constraints off entirely, as the
unconstrained.costfcn barely scores try4-opt2 better than try3-opt2,
and the extra breaks or clashes that the chimera will have will
probably make it look slightly worse.  It will probably be necessary
to tweak the next costfcn in order to make the chimera barely look
better than try4-opt2.

Actually, while we are waiting for the subdomain to finish building, I
might as well try polishing try4 with the unconstrained cost fcn.
Of course, this may move the two prolines that the subdomain is
expecting to link to, but we can either re-optimize the subdomain for
the new position of the prolines, or if they don't move much, just
link in the subdomain and let optimization try to close the gaps.

Mon Aug  9 23:21:42 PDT 2004 Kevin Karplus

try5-opt1 only recently finished---it will probably take the rest of
the night for try5-opt2 to finish.  I might as well pick up new edge
constraints for the subdomain from try5-opt1 though, and apply them
to the subdomain.

P347.N	P426.N		13.372
P347.CG P426.CG		13.780
P347.CA	P426.CA		12.698
P347.O	P426.O		12.820
P347.C	N425.C		11.663
E348.CA	N425.CA		11.540
E348.C	N425.N		10.110

Note: these residues do not seem to have moved significantly between
try4-opt2 and try5-opt1, so I don't expect much motion for try5-opt2 either.
With the extra constraints, the ends should be rigid enough that it
should be easy to superimpose the subdomain and try5-opt2: just gut
try5-opt2 by removing 349-424, then superimpose the two incomplete conformations.
Since the only residues they share are 347-348 and 425-426, the
superposition should put the subdomain precisely where it is wanted.
Then cut-and-paste to remove the extra residues.
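
Edge constraints like the ones listed above are just atom-to-atom
distances read off a model. A minimal sketch of how such lines could
be produced from a PDB file follows. It is not the tool that
generated the table: the function names are invented for the
example, the two demo atoms are synthetic, and the output layout
(label, label, distance) only imitates the table.

```python
import math

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def read_atoms(lines):
    """Map labels such as 'P347.CA' to (x, y, z) from PDB ATOM records."""
    atoms = {}
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        res = THREE_TO_ONE.get(line[17:20], "X")  # 'X' for unknown residues
        label = f"{res}{int(line[22:26])}.{line[12:16].strip()}"
        atoms[label] = (float(line[30:38]),
                        float(line[38:46]),
                        float(line[46:54]))
    return atoms


def constraint_line(atoms, a, b):
    """One 'atom atom distance' line for the pair of labels a and b."""
    return f"{a}\t{b}\t{math.dist(atoms[a], atoms[b]):.3f}"


def fake_atom(resname, resseq, atom, x, y, z):
    """Build a minimal ATOM record (synthetic demo data)."""
    return (f"ATOM  {1:5d} {atom:^4s} {resname} A{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Synthetic coordinates chosen so the distance is exactly 13.
demo = [
    fake_atom("PRO", 347, "CA", 0.0, 0.0, 0.0),
    fake_atom("PRO", 426, "CA", 3.0, 4.0, 12.0),
]
atoms = read_atoms(demo)
print(constraint_line(atoms, "P347.CA", "P426.CA"))
```

Running the same function on two models gives a quick check of
whether the anchor residues have moved between runs.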

Both try3 on the subdomain and try5 should be done in the morning.


Tue Aug 10 11:25:47 PDT 2004 Kevin Karplus

gutted-try5.pdb.gz is try5-opt2 with residues 349-424 removed.
Superimposing the 347-426/decoys/T0235.try3-opt2.pdb with
gutted-try5.pdb.gz looks really terrible--the subdomain is sticking
way into the main body (try5-overlapping-chimera.pdb).

Superimposing 347-426/decoys/T0235.try3-opt2.pdb with
decoys/T0235.try5-opt2.pdb (try5-opt2-plus-sub.pdb)
produces more modest clashes, and some bad breaks, but may be fixable
with some opt and jiggle segment operations and the gap-closing operators.
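
A quick independent check for bad breaks in a pasted-together model
is to look at consecutive CA-CA distances, which are about 3.8
Angstroms for a normal trans peptide. The sketch below is not how
undertaker scores breaks; the 4.2 Angstrom cutoff is an assumed
threshold, the function names are invented, and the demo coordinates
are synthetic.

```python
import math


def ca_coords(lines):
    """Map residue number to CA coordinates from PDB ATOM records."""
    ca = {}
    for line in lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            ca[int(line[22:26])] = (float(line[30:38]),
                                    float(line[38:46]),
                                    float(line[46:54]))
    return ca


def find_breaks(ca, max_dist=4.2):
    """List (i, i+1, distance) where adjacent CA atoms are too far apart.

    max_dist is an assumed cutoff, a little above the ~3.8 Angstrom
    spacing expected for a trans peptide bond.
    """
    breaks = []
    for i in sorted(ca):
        if i + 1 in ca:
            d = math.dist(ca[i], ca[i + 1])
            if d > max_dist:
                breaks.append((i, i + 1, d))
    return breaks


def fake_ca(resseq, x):
    """Build a minimal CA-only ATOM record (synthetic demo data)."""
    return (f"ATOM  {resseq:5d}  CA  GLN A{resseq:4d}    "
            f"{x:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Residues 97-99 are spaced normally; residue 100 sits 10 A past 99.
demo = [fake_ca(97, 0.0), fake_ca(98, 3.8), fake_ca(99, 7.6),
        fake_ca(100, 17.6)]
breaks = find_breaks(ca_coords(demo))
for i, j, d in breaks:
    print(f"break between {i} and {j}: {d:.2f} A")
```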

At some point I'm going to have to give undertaker the ability to
handle frozen atoms, so that it can do optimization of a subdomain
like this in the presence of an unchanging environment.

I cut-and-pasted the model to make decoys/try5-chimera.pdb, which
scores very badly with the try5 costfcn (almost as badly as "docked-chimera").

I'll start a shorter run for try6 to optimize just the try5-chimera.


Tue Aug 10 20:57:03 PDT 2004 Kevin Karplus

try6 is junk---the added helices all scattered.

I think we should submit
	try5-opt2	best with try6 (unconstrained) costfcn
	try3-opt2	best before trying to dock ubiquitin-like domain into binding pocket
	try1-opt2	fully automatic run
	T0235-1nb8A-t2k-local-str2+CB_burial_14_7-0.4+0.4-adpstyle5	
	T0235-1ogwA-t04-local-str2+CB_burial_14_7-0.4+0.4-adpstyle5
The two alignments are for the two domains.


I'll set this up and submit it tonight.  We can resubmit if Jenny
finds something better tomorrow.


Fri Sep 24 21:24:55 PDT 2004 Kevin Karplus

Evaluating with smooth GDT we get
name                   length  missing_atoms     rmsd  rmsd_ca       GDT  smooth_GDT  note
model3.ts-submitted       499         0.0000   8.0346   7.2481  -50.6831    -47.8390  full-auto
model2.ts-submitted       499         0.0000   8.0179   7.2024  -50.6831    -47.5765  try3
model4.ts-submitted       499           1727   4.3967   3.6061  -50.3415    -46.0765  alignment
model1.ts-submitted       499         0.0000   8.5740   7.8166  -48.4973    -45.1164  try5
robetta-model1.pdb.gz     499         0.0000  22.4770  22.1644  -41.0519    -38.8299
robetta-model5.pdb.gz     499         0.0000  21.9268  21.9610  -40.8470    -38.6745
robetta-model3.pdb.gz     499         0.0000  28.1988  28.0913  -40.9153    -38.4555
robetta-model2.pdb.gz     499         0.0000  22.4606  22.4694  -39.5492    -37.2889
robetta-model4.pdb.gz     499         0.0000  24.9812  24.9039  -39.0027    -37.0565
model5.ts-submitted       499           3538   0.0000   0.0000    0.0000      0.0000  alignment

1vjvA is incomplete, and model5 has no overlap with the solved part.
Our best model is the full auto one!
At least we beat robetta.
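
For readers unfamiliar with the GDT column: the usual GDT_TS score
averages the percentage of target residues whose CA atoms lie within
1, 2, 4, and 8 Angstroms of the model after superposition. The
sketch below is a simplification under stated assumptions. It takes
per-residue CA distances from one fixed superposition, whereas a
real GDT calculation searches over many superpositions for each
cutoff. It also does not reproduce the smooth_GDT variant used in
the table above, whose definition is not given in these notes. The
function name and the demo distances are made up.

```python
def gdt_ts(distances, n_target, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS from CA distances under one superposition.

    distances: model-to-target CA distances for the residues present
               in both structures (Angstroms).
    n_target:  number of residues in the target, so that residues
               missing from the model count as misses.
    """
    fractions = [sum(d <= c for d in distances) / n_target
                 for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Synthetic example: five residues with increasing error.
demo_distances = [0.5, 1.5, 3.0, 6.0, 10.0]
score = gdt_ts(demo_distances, n_target=5)
print(f"{score:.1f}")
```

Dividing by the target length rather than the number of aligned
residues is what penalizes an incomplete model.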


Fri Nov 26 11:06:48 PST 2004 Kevin Karplus

Domain : T0235_1 : CM/easy : NT=309 : 107-356,427-499
Domain : T0235_2 : FR/A : NT=43 : 357-415
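
To evaluate a model on one of these domains, the residue ranges have
to be turned into a set of residue numbers. A small sketch, assuming
the ranges are always comma-separated "start-end" spans as in the
two lines above (the NT field is left uninterpreted, and parse_ranges
is a name invented for this example):

```python
def parse_ranges(spec):
    """Turn a spec like '107-356,427-499' into a set of residue numbers."""
    residues = set()
    for part in spec.split(","):
        lo, _, hi = part.strip().partition("-")
        # A bare number such as '42' is treated as a one-residue span.
        residues.update(range(int(lo), int(hi or lo) + 1))
    return residues


domain1 = parse_ranges("107-356,427-499")
domain2 = parse_ranges("357-415")

# Example use: keep only the residues of a model that fall in domain 2.
model_residues = range(1, 500)
in_domain2 = [r for r in model_residues if r in domain2]
print(in_domain2[0], in_domain2[-1])
```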

#Target	best	best	model1	auto	align	robetta	robetta
#	sam-t04	submit				best	1
T0235 	47.8390	47.8390	45.1164	47.8361	46.5762	38.8299	38.8299
T0235_1	55.9401	55.9391	53.3912	55.9401	54.7818	46.7985	46.7985
T0235_2	38.1419	38.1419	38.1419	37.9981	 0.0000	44.8487	41.4253

The crystal structure doesn't resolve all of domain 2.  We have ok
values for it only because we have the first and last helix
constrained by the comparative modeling domain (which we did fairly
well on).  We did not make the last helix of the domain long enough,
though we did have it predicted to be longer than we made it.

I made 347-426/decoys/evaluate_2.rdb to see how well we did on the
inserted domain.  Our best result was for our first alignment, to
2occJ, which we then proceeded to mess up.  Luckily, we did not end up
submitting any of the predictions with the messed up domain 2
prediction.

We can't tell for sure whether Jenny was right about the
ubiquitin-like domain not being in the binding pocket---it was not
part of the crystal.  I don't know if it was cut off, or if it was
flopping around and so not solved.  Since it wasn't consistently in
the binding pocket, I suspect that Jenny was right.