Mon Jul 12 10:45:51 PDT 2004 T0235 Due 12 Aug

Mon Jul 12 18:34:00 PDT 2004 Kevin Karplus
Two domains?
fold-recognition hit to d.15.1.1 for 1-90.
comparative model to 1nb8A for 104-484.

It probably won't be necessary to break this up for modeling, since the two domains each have pretty good alignments.

From learithe@soe.ucsc.edu Sat Jul 31 15:28:49 2004
Date: Sat, 31 Jul 2004 15:28:48 -0700 (PDT)
From: Jenny Draper
To: Kevin Karplus
cc: learithe@soe.ucsc.edu
Subject: Re: T0235

All I've done with T0235 so far is some literature searches, which I'm still reading through before putting the highlights in the readme. It's the ubiquitin-removal protease in the 26S proteasome of yeast! There's some info about the active site & cysteines as the catalytic residues...

-Jenny

====================================================================
On Sat, 31 Jul 2004, Kevin Karplus wrote:

> According to the status file, you have T0235, Jenny, but I see no
> comments since mine on July 12 for try1.
>
> What is happening with it?

============================================================
Wed Aug 4 1:00pm Jenny Draper

OK, I'm attempting to sum up my T0235 research here.

T0235, aka "UBP6" in the literature, is officially a "Ub-specific processing protease (UBP)", and NOT a "Ubiquitin C-terminal hydrolase (UCH)" as it is titled in the CASP6 description (isn't biology nomenclature FUN?).

The only crystal structure known for UBPs is 1nb8/1nbf, for the catalytic domain(s) of the very large protein HAUSP. 1nbf is HAUSP bound to ubiquitin (Ub), and 1nb8 is HAUSP by itself, both solved and released in the same paper. Apparently the catalytic site is split by about 9 Å w/o Ub, but it comes together upon binding to Ub, although the majority of the rest of the structure remains unchanged (luckily ;).

We know that the N-terminal 80-ish residues form a ubiquitin-like fold, which binds to the 26S proteasome in yeast.
This domain is not essential for UBP6 (T0235) functioning, but _is_ essential for the proteasome!

The catalytic residues are primarily a cysteine (C118 on our beast), a histidine (H447), and an aspartic acid (D219). They come together at the junction between the "Palm" and "Thumb" regions of the structure. (The authors describe the HAUSP structure as a "Hand", with Finger, Palm, and Thumb regions; Ub nestles in the palm region, against the base of the finger and thumb regions.) This is an extended structure, with a large open cleft for ubiquitin to sit in; we should avoid folding up the cleft.

The Big Question for this structure is: how to pack the Ub-domain against the HAUSP domain?

Selected References:
-------------------------------------------------------------------
Hu M, Li P, Li M, Li W, Yao T, Wu JW, Gu W, Cohen RE, Shi Y.
Crystal structure of a UBP-family deubiquitinating enzyme in isolation and in complex with ubiquitin aldehyde.
Cell. 2002 Dec 27;111(7):1041-54.

Wyndham AM, Baker RT, Chelvanayagam G.
The Ubp6 family of deubiquitinating enzymes contains a ubiquitin-like domain: SUb.
Protein Sci. 1999 Jun;8(6):1268-75.

Kim JH, Park KC, Chung SS, Bang O, Chung CH.
Deubiquitinating enzymes as cellular regulators.
J Biochem (Tokyo). 2003 Jul;134(1):9-18. Review.
-------------------------------------------------------------------

Wed Aug 4 3:00pm Jenny Draper

Domain1, residues 1-80, looks good. It's an almost absolutely perfect copy of the structure of ubiquitin (PDB id 1ubq); all it needs is for residues 60-64 to be a _little_ more helical. Secondary structure predictions for this domain are very weak. The structure doesn't match the str2 script very well, but it matches stride pretty well. Its 5th, central strand is not predicted to be strand by stride, but forms a nice strand anyway. I'd say we're done with this part, as we know from the literature that this is a ubiquitin-like domain.
Our catalytic pocket looks good in the HAUSP domain -- it's got all the right residue types in the right positions, with the same approximate distances as in 1nb8 (w/o Ub). (The H and C only come close together upon Ub binding.)

Superposition of try1-domain2 with HAUSP (1nb8A) shows that

1. The fingertip region between 281-296 needs fixing. I think undertaker is trying too hard to make 282-290 helical.

2. The region 352-427 is treated as an insertion within a loop region (~441-445 in 1nb8). This places T0235's His-box (~430-450) in the right place for the active site. Now... how to pack this helical region?

3. We don't have a match to 1nb8's C-terminus (1nb8 res 522-554). Perhaps our insertion should follow it... they're in the same region, though our insertion is longer.

Wed Aug 4 6:00pm Jenny Draper

I'm running a try2 from alignments, using essentially the same settings, except I'm including all the alignments to 1nb8A. Also I've upped the break cost, upped hbond geom, and turned down predicted secondary-structure and phobic_fit costs. I'm hoping this will help produce better structures in the problem regions mentioned above.

Thu Aug 5 12:00pm Jenny Draper

Try2 doesn't look like it came up with anything better, and the insertion is still flopping out in space. I'll have to set up some constraints to hold the basic shape, and then start polishing the regions that need some help.

Fri Aug 6 4:00pm Jenny Draper

I really have no idea what direction to take this in... I'm gonna run a try3 with the sheet & helix constraints from try1, and hope it can do some loop packing...

Sat Aug 7 13:43:24 PDT 2004 Kevin Karplus

try3 is still running (so it must have too many iterations for such a large protein).

The rr constraints look pretty good in try2-opt2, except for F111-Y430 and N116-S259, which suggests a different placement for the residues up to N116, coming in on the other side of the sheet.

The try3 costfcn favors try1 over try2, and try3 is just a polishing of try1.
There is no T0235.t04.many.frag.gz file, so I'll create one.

Perhaps the ubiquitin-like domain should be packed where ubiquitin is in the 1nbf structure? That is, if we align T0235 to 1nbfA and 1nbfD, and cut-and-paste the pieces we may get a structure that makes sense. (Of course, that assumes a monomeric structure---domain swapping could be happening to get a multimeric structure.)

I'll add 1nbfA and 1nbfD to the MANUAL_TOP_HITS, so that we can make extra_alignments to get alignments for them.

Sat Aug 7 2:10pm Jenny Draper

I really don't think the ubiquitin-like domain should pack into the ubiquitin binding site. This protein has to function as a ubiquitin hydrolase (i.e., it needs that site open) -- the ubiquitin-like domain should anchor the protein into the proteasome.

Sat Aug 7 14:08:36 PDT 2004 Kevin Karplus

One possibility is that when the protein is alone it binds its ubiquitin-like domain, but when it is at the proteasome, the proteasome binds the domain, opening up the binding pocket for ubiquitin.

It looks like I'll have to add 1nbfA and 1nbfD to the template library to get any decent alignments. This might take a while.

Sat Aug 7 2:15pm Jenny Draper

True. I had thought of that. I think the protein is functional in the absence of the proteasome though; I'll check the lit. It's probably worth having that structure as one or two of our models.

Sat Aug 7 3:15pm Jenny Draper

Yep, I've found experimental evidence that purified Ubp6 has ubiquitin-hydrolyzing activity, as does Ubp6 w/o the ubiquitin-like domain. So if the Ub-like domain does sit in the active site of Ubp6, it sure is easy to get it out of the way...

Leggett DS, Hanna J, Borodovsky A, Crosas B, Schmidt M, Baker RT, Walz T, Ploegh H, Finley D.
Multiple associated proteins regulate proteasome structure and function.
Mol Cell. 2002 Sep;10(3):495-507.
Sun Aug 8 07:35:24 PDT 2004 Kevin Karplus

I aligned try3-opt2 with 1nbfA and 1nbfD, and took the first 99 residues from the alignment to 1nbfD and the rest from the alignment with 1nbfA. The result is in decoys/docked-chimera.pdb.

This model scores poorly, because of the bad break at Q99-Q100, because of clashes, and because of the helix constraint for A96-Q103. I wonder if those problems can be fixed without undocking the first domain. I'll try that for try4.

I added some arbitrary constraints to hold the docked domain in place, then tweaked the cost function until the docked-chimera barely scored best.

Sun Aug 8 20:02:22 PDT 2004 Kevin Karplus

try4 is STILL running. We'll have to remember to reduce the number of iterations for future runs on T0235.

From learithe@soe.ucsc.edu Mon Aug 9 13:42:36 2004
MIME-Version: 1.0
Date: Mon, 9 Aug 2004 13:42:35 -0700 (PDT)
From: Jenny Draper
To: Kevin Karplus
Subject: T0235 inserted domain
In-Reply-To: <200408082132.i78LWZw7000893@cheep.cse.ucsc.edu>

I believe the inserted domain in T0235 is between Pro347 and Pro246

-Jenny

Mon Aug 9 15:13:14 PDT 2004 Kevin Karplus

That can't be right. I think Jenny meant P347-P426.

I've set up 347-426 as a subdomain. I modified the try1.costfcn before starting, so that the two prolines were constrained to have the same orientation and spacing as in the try4-opt2 model, which should make pasting the result back in a bit easier.

Mon Aug 9 17:45:11 PDT 2004 Kevin Karplus

try1 of 347-426 looks ok, but I'm doing another run to see if I can make it a bit more compact. Then Jenny should create a chimera by pasting it into try4-opt2, and reoptimize with a cost function that has constraints turned way down. We probably can't afford to turn constraints off entirely, as the unconstrained.costfcn barely scores try4-opt2 better than try3-opt2, and the extra breaks or clashes that the chimera will have will probably make it look slightly worse.
It will probably be necessary to tweak the next costfcn in order to make the chimera barely look better than try4-opt2.

Actually, while we are waiting for the subdomain to finish building, I might as well try polishing try4 with the unconstrained cost fcn. Of course, this may move the two prolines that the subdomain is expecting to link to, but we can either re-optimize the subdomain for the new position of the prolines, or if they don't move much, just link in the subdomain and let optimization try to close the gaps.

Mon Aug 9 23:21:42 PDT 2004 Kevin Karplus

try5-opt1 only recently finished---it will probably take the rest of the night for try5-opt2 to finish. I might as well pick up new edge constraints for the subdomain from try5-opt1 though, and apply them to the subdomain.

P347.N   P426.N   13.372
P347.CG  P426.CG  13.780
P347.CA  P426.CA  12.698
P347.O   P426.O   12.820
P347.C   N425.C   11.663
E348.CA  N425.CA  11.540
E348.C   N425.N   10.110

Note: these residues do not seem to have moved significantly between try4-opt2 and try5-opt1, so I don't expect much motion for try5-opt2 either.

With the extra constraints, the ends should be rigid enough that it should be easy to superimpose the subdomain and try5-opt2: just gut try5-opt2 by removing 349-424, then superimpose the two incomplete conformations. Since the only residues they share are 347-348 and 425-426, the superposition should put the subdomain precisely where it is wanted. Then cut-and-paste to remove the extra residues.

Both try3 on the subdomain and try5 should be done in the morning.

Tue Aug 10 11:25:47 PDT 2004 Kevin Karplus

gutted-try5.pdb.gz is try5-opt2 with residues 349-424 removed.

Superimposing the 347-426/decoys/T0235.try3-opt2.pdb with gutted-try5.pdb.gz looks really terrible--the subdomain is sticking way into the main body (try5-overlapping-chimera.pdb).
Superimposing 347-426/decoys/T0235.try3-opt2.pdb with decoys/T0235.try5-opt2.pdb (try5-opt2-plus-sub.pdb) produces more modest clashes, and some bad breaks, but may be fixable with some opt and jiggle segment operations and the gap-closing operators.

At some point I'm going to have to give undertaker the ability to handle frozen atoms, so that it can do optimization of a subdomain like this in the presence of an unchanging environment.

I cut-and-pasted the model to make decoys/try5-chimera.pdb, which scores very badly with the try5 costfcn (almost as badly as "docked-chimera").

I'll start a shorter run for try6 to optimize just the try5-chimera.

Tue Aug 10 20:57:03 PDT 2004 Kevin Karplus

try6 is junk---the added helices all scattered.

I think we should submit

try5-opt2   best with try6 (unconstrained) costfcn
try3-opt2   best before trying to dock ubiquitin-like domain into binding pocket
try1-opt2   fully automatic run
T0235-1nb8A-t2k-local-str2+CB_burial_14_7-0.4+0.4-adpstyle5
T0235-1ogwA-t04-local-str2+CB_burial_14_7-0.4+0.4-adpstyle5

The two alignments are for the two domains.

I'll set this up and submit it tonight. We can resubmit if Jenny finds something better tomorrow.
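The gut-and-paste step described in the Aug 9-10 entries can be sketched roughly as below. This is only an illustration under stated assumptions: the helper names (`gut_residues`, `paste_segment`, `atom_distance`) are hypothetical and are not undertaker commands, the input is assumed to be standard fixed-column PDB text, and the donor model is assumed to have been superimposed on the host already (the superposition itself is not shown).

```python
import math

# Hypothetical helpers for illustration only; not part of undertaker.
# PDB ATOM/HETATM records are fixed-column: atom name in columns 13-16,
# residue number in columns 23-26, x/y/z in columns 31-54 (1-based).

def is_atom(line):
    return line.startswith(("ATOM", "HETATM"))

def residue_number(line):
    return int(line[22:26])

def gut_residues(pdb_lines, first, last):
    """Return the lines with all atoms of residues first..last removed."""
    return [line for line in pdb_lines
            if not (is_atom(line) and first <= residue_number(line) <= last)]

def paste_segment(host_lines, donor_lines, first, last):
    """Take residues first..last from the donor, everything else from the host.

    Assumes the donor is already superimposed on the host. Only atom
    records are returned, and atom serial numbers are not renumbered.
    """
    host = [line for line in gut_residues(host_lines, first, last)
            if is_atom(line)]
    donor = [line for line in donor_lines
             if is_atom(line) and first <= residue_number(line) <= last]
    # sorted() is stable, so atom order within each residue is preserved.
    return sorted(host + donor, key=residue_number)

def atom_distance(pdb_lines, res_a, atom_a, res_b, atom_b):
    """Distance between two named atoms, e.g. (347, "CA", 426, "CA")."""
    def coords(res, name):
        for line in pdb_lines:
            if (is_atom(line) and residue_number(line) == res
                    and line[12:16].strip() == name):
                return (float(line[30:38]), float(line[38:46]),
                        float(line[46:54]))
        raise KeyError((res, name))
    return math.dist(coords(res_a, atom_a), coords(res_b, atom_b))
```

With helpers like these, `gut_residues(lines, 349, 424)` would correspond to making gutted-try5.pdb.gz, `paste_segment(host, subdomain, 349, 424)` to the final cut-and-paste, and `atom_distance(lines, 347, "CA", 426, "CA")` to reading off an edge constraint such as the P347.CA P426.CA spacing.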
Fri Sep 24 21:24:55 PDT 2004 Kevin Karplus

Evaluating with smooth GDT we get

name                   length  missing_atoms  rmsd     rmsd_ca  GDT       smooth_GDT  note
model3.ts-submitted    499     0.0000          8.0346   7.2481  -50.6831  -47.8390    full-auto
model2.ts-submitted    499     0.0000          8.0179   7.2024  -50.6831  -47.5765    try3
model4.ts-submitted    499     1727            4.3967   3.6061  -50.3415  -46.0765    alignment
model1.ts-submitted    499     0.0000          8.5740   7.8166  -48.4973  -45.1164    try5
robetta-model1.pdb.gz  499     0.0000         22.4770  22.1644  -41.0519  -38.8299
robetta-model5.pdb.gz  499     0.0000         21.9268  21.9610  -40.8470  -38.6745
robetta-model3.pdb.gz  499     0.0000         28.1988  28.0913  -40.9153  -38.4555
robetta-model2.pdb.gz  499     0.0000         22.4606  22.4694  -39.5492  -37.2889
robetta-model4.pdb.gz  499     0.0000         24.9812  24.9039  -39.0027  -37.0565
model5.ts-submitted    499     3538            0.0000   0.0000    0.0000    0.0000    alignment

1vjvA is incomplete, and model5 has no overlap with the solved part.

Our best model is the full auto one! At least we beat robetta.

Fri Nov 26 11:06:48 PST 2004 Kevin Karplus

Domain : T0235_1 : CM/easy : NT=309 : 107-356,427-499
Domain : T0235_2 : FR/A    : NT=43  : 357-415

#Target   best      best      model1    auto      align     robetta   robetta
#         sam-t04   submit                                  best      1
T0235     47.8390   47.8390   45.1164   47.8361   46.5762   38.8299   38.8299
T0235_1   55.9401   55.9391   53.3912   55.9401   54.7818   46.7985   46.7985
T0235_2   38.1419   38.1419   38.1419   37.9981    0.0000   44.8487   41.4253

The crystal structure doesn't resolve all of domain 2. We have ok values for it only because we have the first and last helix constrained by the comparative modeling domain (which we did fairly well on). We did not make the last helix of the domain long enough, though we did have it predicted to be longer than we made it.

I made 347-426/decoys/evaluate_2.rdb to see how well we did on the inserted domain. Our best result was for our first alignment, to 2occJ, which we then proceeded to mess up. Luckily, we did not end up submitting any of the predictions with the messed up domain 2 prediction.
We can't tell for sure whether Jenny was right about the ubiquitin-like domain not being in the binding pocket---it was not part of the crystal. I don't know if it was cut off, or if it was flopping around and so not solved. Since it wasn't consistently in the binding pocket, I suspect that Jenny was right.
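For reference, here is a rough sketch of how a GDT-style score like those in the Sep 24 and Nov 26 tables can be computed. It assumes the model has already been superimposed on the crystal structure and that the per-residue CA-CA distances are in hand. The function name `gdt_ts` and its arguments are hypothetical. The standard GDT_TS searches over many superpositions to maximize the count at each cutoff, so a single fixed superposition only gives a lower bound. The smooth GDT variant and the sign convention used in the first table are not reproduced here.

```python
def gdt_ts(ca_distances, target_length, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS from CA-CA distances under one fixed superposition.

    ca_distances: distances (in Angstroms) for residues present in both
        the model and the target; residues missing from the model are
        simply left out of this list.
    target_length: number of residues in the target, so that missing
        residues count against the score.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    # Fraction of target residues within each distance cutoff.
    fractions = [
        sum(1 for d in ca_distances if d <= cutoff) / target_length
        for cutoff in cutoffs
    ]
    # GDT_TS is the mean of those fractions, reported as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)
```

Dividing by the target length rather than the model length is what penalizes an incomplete model, which is one reason an alignment-only submission with many missing atoms can have a low RMSD but still not the best GDT.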