Thu Jul 13 09:17:53 PDT 2006 T0375 Make started
Thu Jul 13 09:18:51 PDT 2006 Running on lopez.cse.ucsc.edu

Thu Jul 13 15:30:01 PDT 2006 Kevin Karplus
Blast gets strong full-length hit to 1tyyA (25% over 308 residues,
1.3e-06). HMMs get very strong hits to 1rkd and other c.72.1.1 domains.
We may have to do some tweaking of undertaker scripts to pull out the
best template, as there are many to choose among.

Wed Jul 19 17:22:13 PDT 2006 Chris Wong
Zack and I have decided to do a polishing run on this one before we
tetramerize it. We plan to use 1rk2 or 1gqt to do the tetramerization.
The polishing run, try2, started on lopez at 1725.
( make -k T0375.do2 > & do2.log ; gzip -9f do2.log ) &

Date: Thu, 20 Jul 2006 12:11:13 -0700
From: Kevin Karplus
To: chrisw
CC: jsanborn
Subject: Re: T0375

The top-scoring template with the HMMs is the one that fits the HMM
model best---it is not necessarily the one closest to the target.
We have at least 20 templates with excellent HMM scores, but the
top BLAST hits are not the top HMM hits:

chain    BLAST_rank    HMM_rank
1tyyA     1            12
1v1aA     2             3
1vm7A     3            10
2fv7A     4             2
1rkd      5             1
2absA     6             9
1dgmA     7            18
1lioA     8            31
1liiA     9            13
2afbA    10             4
2dcnA    >95            5
1bx4A    16             6

There are different subfamilies in this list, and getting the right
subfamily probably makes a difference. Most likely, we will want to
use a template that scores well with both methods:
1v1aA, 2fv7A, 1rkd
The c.72.1.1 family looks like a better fit than the other c.72.1.*
families.
------------------------------------------------------------

Fri Jul 21 11:04:29 PDT 2006 Chris Wong
In the process of making a tetramer using 1rk2 as the template.
We ran into the problem where 1rk2 is not in /pce/models.96/pdb.
There's got to be a way to make this alignment manually.

Fri Jul 21 12:54:07 PDT 2006 Kevin Karplus
1rk2 is the identical sequence to 1rkd, which hides it.
Remove the subdirectory 1rk2 and all it contains.
Actually, you won't be able to find 1rk2A in the reduced PDB dataset
either, because it has an identical sequence to 1rkd, which Dunbrack's
algorithm chose as the preferred representation.

If you are just looking for an alignment to make into a multimer
alignment, use the 1rkd alignment---for that matter, build the 1rkd
dimer first, before trying to build a tetramer (assuming that the 1rkd
dimer is compatible with the 1rk2[ABCD] tetramer).

I've fetched 1rkd/1rkd.mmol and 1rkd/1rk2.mmol from PQS for you,
and created
    1rkd/T0375-1rkd.dimer-a2m and
    1rkd/T0375-1rk2.4mer-a2m
from
    T0375-1rkd-t2k-local-str2+near-backbone-11-0.8+0.6+0.8-adpstyle5.a2m

This is *not* a cyclic 4mer, but 2 dimers that stack in an infinite
chain. It is best optimized by optimizing one dimer (say based on
1rkd.mmol), then superimposing the dimer twice on the 1rk2.mmol file.

Fri Jul 21 15:42:54 PDT 2006 Chris Wong
Okay.. I need to check the README more often. Prof. Karplus has come
to the same plan of action for this target that I have. Step one is to
get the dimer built and optimized. Then, we'll stick two of the dimers
end to end.
I think I've managed to set up a dimer. Now I'll try my luck by
starting it on lopez at 1544.
( make -k T0375.do1 > & do1.log ; gzip -9f do1.log ) &
in T0375/dimer/

Fri Jul 21 22:06:40 PDT 2006 Chris Wong
dimer/try1 looks HORRIBLE. I'm pretty sure I did something wrong...
though I'm not exactly sure what it was.

Mon Jul 24 13:58:47 PDT 2006 Chris Wong
(sent an email to Prof. K)

Date: Mon, 24 Jul 2006 13:56:09 -0700 (PDT)
From: Chris Wong
To: Kevin Karplus
Cc: Zack Sanborn
Subject: problem making dimer for T0375

Prof Karplus.

We have been trying to make a dimer for T0375 as the initial step in
producing a tetramer model. We've followed the steps in the
casp7/README. The resulting models for this try1, in
/projects/compbio/experiments/protein-predict/casp7/T0375/dimer/decoys,
look really bad.
In troubleshooting, we found out that I had in the make-dimer.under
file a 4 where a 2 should be. This error was corrected, and we tried
to do
    undertaker < make-dimer.under
to see if it works. Undertaker stopped, with an error:

1rkdA expands to /projects/compbio/data/pdb/1rkd.pdb.gz
1rkdA:undertaker: SpecificFragmentCommands.cc:698: int BurialReadFragmentAlignment(std::istream&, Command*, std::ostream&): Assertion `bad_templ' failed.

The rest of the log is in:
/projects/compbio/experiments/protein-predict/casp7/T0375/make-dimer.log

How can we fix this error?

Chris

Mon Jul 24 15:12:17 PDT 2006 Kevin Karplus
The problem was that 1rkd is only a monomer in the unit cell.
You need to read the dimer from the PQS 1rkd.mmol file, which I had
already downloaded to the 1rkd directory. I fixed the make-dimer.under
file and have made the dimer.

Mon Jul 24 16:00:13 PDT 2006 Zack Sanborn
Thanks Kevin, your explanation makes sense. I've started our first
optimization of the dimer on lopez at 16:00, called try2. Obviously,
all try1-* stuff will have to be ignored, as it was not doing anything
of what we wanted (making a dimer from the initial alignments).
However, I kept it since I felt overwriting the files it created was
the wrong thing to do.

Mon Jul 24 19:06:43 PDT 2006 Zack Sanborn
dimer/try2 has finished and, what do you know, it scores better than
the terrible try1-opt2. I'm going to do that unpack trick (once I
remember how to do it), to compare the monomeric model from this dimer
with the "pure" monomers.

Mon Jul 31 17:15:53 PDT 2006 Chris Wong
Okay, I'm going to try making a tetramer out of one of the dimers.
This tetramer is not cyclic. It basically has four pieces that are
stacked upon each other in a sort of rod configuration.

Mon Jul 31 23:46:48 PDT 2006 Chris Wong
Finished making a tetramer. It is /T0375/4mer/decoys/4mer-1rk2.pdb.
This one is made from 1rk2, which is identical to 1rkd, except that it
is a tetramer.
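
A minimal sketch of the template choice described in the Jul 20 e-mail above: keep the chains that rank well under both BLAST and the HMMs. The ranks are copied from that table. The cutoff of 5, the stand-in value 96 for the ">95" BLAST rank, and the helper name good_in_both are assumptions made only for this illustration; the e-mail does not state a cutoff.

```python
# chain -> (BLAST rank, HMM rank), copied from the Jul 20 e-mail table.
# 2dcnA is listed as ">95" for BLAST; 96 is a stand-in used only here.
RANKS = {
    "1tyyA": (1, 12),
    "1v1aA": (2, 3),
    "1vm7A": (3, 10),
    "2fv7A": (4, 2),
    "1rkd": (5, 1),
    "2absA": (6, 9),
    "1dgmA": (7, 18),
    "1lioA": (8, 31),
    "1liiA": (9, 13),
    "2afbA": (10, 4),
    "2dcnA": (96, 5),
    "1bx4A": (16, 6),
}


def good_in_both(ranks, cutoff=5):
    """Return chains whose worse rank is <= cutoff, ordered by BLAST rank."""
    picked = [chain for chain, (blast, hmm) in ranks.items()
              if max(blast, hmm) <= cutoff]
    return sorted(picked, key=lambda chain: ranks[chain][0])


if __name__ == "__main__":
    print(good_in_both(RANKS))
```

With a cutoff of 5 this picks 1v1aA, 2fv7A, and 1rkd, the same three chains named in the e-mail.
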
Tue Aug 1 09:24:31 PDT 2006 Chris Wong
Not exactly sure whether it is better to make a better monomer and
then turn it into a tetramer, or if it is better to just work on the
tetramer from now on. I have a feeling it's better to work on the
monomer first. We should try to limit work on the tetramer to
polishing runs to get the interfaces right. Monomer runs are shorter,
anyway.

Tue Aug 1 10:20:10 PDT 2006 Chris Wong
I am going to look into building a monomer based on 1v1a. I used the
clustalw website to do some seq alignments of T0375 against each of
1v1aA, 2fv7A, and 1rkd (c.72.1.1). 1v1a has the best alignment score.
I'll use that one.

Tue Aug 1 12:51:41 PDT 2006 Chris Wong
Okay, I've set up try3 to use the three alignments for 1v1aA, 2fv7A,
and 1rkd (c.72.1.1).
( make -k T0375.do3 > & do3.log ; gzip -9f do3.log ) &
started on lopez at 1256.

Tue Aug 1 14:20:10 PDT 2006 Chris Wong
I'm looking at some more of the best hits in T0375.best-scores.html.
There's a tetramer (1vi9) that looks oriented slightly differently
from 1rk2. 1vi9 is in the family c.72.1.5, as is 1td2A. Maybe I'll
make a little table.

family      id      chains in PDB
=============================================
c.72.1.1    1v1a    2
?           2fv7    2
c.72.1.1    1rkd    1
c.72.1.1    1rk2    4
c.72.1.1    1dgm    1
c.72.1.1    1lio    1
=============================================
c.72.1.5    1vi9    4
c.72.1.5    1td2    2
c.72.1.5    1lhp    2
=============================================

Tue Aug 1 15:23:47 PDT 2006 Chris Wong
try4 is going to use just the alignments from c.72.1.5 to try to get a
different monomer.
( make -k T0375.do4 > & do4.log ; gzip -9f do4.log ) &
started on vashon at 1527.

Tue Aug 1 17:15:25 PDT 2006 Chris Wong
Zack and I are going to build a tetramer to optimize. However, after
running the make-4mer.under script on try3-opt2, we noticed a knot at
the interface between chains. We used proteinshop to remove the knot.
Now, we'll polish this tetramer as a try1 and see what happens.
( make -k T0375.mult1 > & mult1.log ; gzip -9f mult1.log ) &
started on lopez at 1730.

Tue Aug 1 18:55:17 PDT 2006 Chris Wong
try5 will be an optimization on try2-opt2.
( make -k T0375.do5 > & do5.log ; gzip -9f do5.log ) &
started on lopez at 1910.

Tue Aug 1 21:46:12 PDT 2006 Chris Wong
The 4mer, try1-opt2, has been destroyed by undertaker. I get the
feeling that undertaker is trying to form a cyclic multimer out of our
tetramer models. Unfortunately, that's not what we want. Prof. Karplus
has mentioned that one way to prevent undertaker from doing this is to
have long range distance constraints. This is not a good thing, as we
have little time to come up with, and optimize for, such constraints.
It's also not clear to me how effective that technique would be.
I think our best bet right now is to come up with good monomers,
rather than have undertaker tear apart our tetramers in optimization
runs.

Just to see, I'll run one more 4mer optimization on
T0375.try4-opt2.4mer-1vi9.pdb. This will be the 4mer try2.
( make -k T0375.mult2 > & mult2.log ; gzip -9f mult2.log ) &
started on orcas at 1026p.

Tue Aug 1 22:38:20 PDT 2006 Chris Wong
try5-opt2 scores the highest of the monomers. However, there are some
breaks and some helices that I think need to be formed. Going to run
again with more constraints and heavier break penalty, as monomer
try6.
( make -k T0375.do6 > & do6.log ; gzip -9f do6.log ) &
started on orcas at 1047p.

Tue Aug 1 22:49:24 PDT 2006 Chris Wong
Decided to also try the gromacs trick to fix breaks. This will be
try7.
( make -k T0375.do7 > & do7.log ; gzip -9f do7.log ) &
started on lopez at 1104p.

Going to polish monomer try3 (as try8) and try4 (as try9).
( make -k T0375.do8 > & do8.log ; gzip -9f do8.log ) &
started on lopez at 1117p.
( make -k T0375.do9 > & do9.log ; gzip -9f do9.log ) &
started on shaw at 1120p.

+++monomers 1+++
alignments -> mono try1 -> mono try2 -> mono try5 -> mono try6

+++monomers 1.5+++
T0375.try5-opt2.gromacs0.repack-nonPC.pdb.gz -> mono try7

+++monomers 2+++
1v1aA, 2fv7A, and 1rkd -> mono try3 -> mono try8

+++monomers 3+++
1vi9A, 1td2A, and 1lhpA -> mono try4 (gromacs) -> mono try9

===4mer 1===
try3 -> T0375.try3-opt2_ps.4mer-1rk2.pdb -> 4mer try1

===4mer 2===
try4 -> T0375.try4-opt2.4mer-1vi9.pdb -> 4mer try2

Wed Aug 2 08:11:11 PDT 2006 Chris Wong
Going to make some tetramers out of the monomers made last night.
mono try6 -> T0375.try6-opt2.4mer-1rk2.pdb (make-4mer_3.under)
mono try7 -> T0375.try7-opt2.4mer-1rk2.pdb (make-4mer_4.under)
mono try8 -> T0375.try8-opt2.4mer-1rk2.pdb (make-4mer_5.under)
mono try9 -> T0375.try9-opt2.4mer-1vi9.pdb (make-4mer_6.under)

Wed Aug 2 10:34:39 PDT 2006 Chris Wong
I want to do a monomer try10 starting from alignments, but using
constraints from try6-opt2. I am hoping to get the same secondary
structure as try6-opt2, but with the helices 120-129 and 168-182
settling in a position that doesn't have breaks. I'm not sure this is
the best way, but it's easy to set up, so I'll do it.
( make -k T0375.do10 > & do10.log ; gzip -9f do10.log ) &
started on shaw at 1043.
try6 -> try10

Wed Aug 2 11:01:27 PDT 2006 Chris Wong
monomer try11 will be an unconstrained optimization of try6-opt2 with
heavy break penalty.
monomer try12 will be a constrained optimization of try6-opt2 with
heavy break penalty.
( make -k T0375.do11 > & do11.log ; gzip -9f do11.log ) &
started on vashon at 1112.
( make -k T0375.do12 > & do12.log ; gzip -9f do12.log ) &
started on camano at 1129.
try6 -> try11 & try12

Wed Aug 2 11:51:22 PDT 2006 Chris Wong
Now to write down some notes about the 4mer's that have been made so
far.

=====
T0375.try3-opt2.4mer-1rk2.pdb
T0375.try3-opt2_ps.4mer-1rk2.pdb
4mer try1-opt2
These 4mers come from the mono try3. The monomer try3 started from
1v1aA, 2fv7A, and 1rkd alignments.
These were the top scoring ones in T0375.best-scores.html. The first
one, not proteinshopped, has a knot between the monomer units.
ProteinShop was used to attempt to remove the knot. It still has a
knot. Tried to optimize it, to maybe get some success. Undertaker
*destroyed* the tetramer. It probably was trying to make a cyclic
tetramer out of it. The tetramer template, 1rk2, is a tetramer with
the units set end-to-end.

=====
T0375.try4-opt2.4mer-1vi9.pdb

=====
T0375.try6-opt2.4mer-1rk2.pdb
T0375.try7-opt2.4mer-1rk2.pdb
T0375.try8-opt2.4mer-1rk2.pdb

=====
T0375.try9-opt2.4mer-1vi9.pdb
This one uses 1vi9 for a tetramer template. It has some serious
clashes between chains. The chains are oriented in a sort of strange
way, too. We'll probably not be using 1vi9.

Wed Aug 2 13:45:40 PDT 2006 Chris Wong
The opt1's for try10, 11, and 12 all have breaks, and they're bad
enough that I don't think opt2 will close them up. So, I'm going to
make a try13 that raises the probabilities of the undertaker commands
that heal breaks. We'll start with
T0375.try7-opt2.gromacs0.repack-nonPC.pdb because it has less severe
breaks, while still looking decent.
( make -k T0375.do13 > & do13.log ; gzip -9f do13.log ) &
started on lopez at 1411

Wed Aug 2 15:55:52 PDT 2006 Chris Wong
Well, it looks like monomer try10, 11, and 12 have not made any real
improvements in breaks. Maybe try10 a little bit... not great, though.
There's some hope with try13. try13-opt1 looks comparatively decent
from a breaks point of view.

Here is the current order of increasing unconstrained score for
monomer opt2's:

model                      break    score
================================================================
T0375.try12-opt2.pdb.gz    5.1      158.58
T0375.try13-opt2.pdb.gz    3.8      158.79
T0375.try11-opt2.pdb.gz    5.0      159.08
T0375.try7-opt2.pdb.gz     4.0      160.88
T0375.try6-opt2.pdb.gz     5.3      161.03
T0375.try8-opt2.pdb.gz     4.8      162.27
T0375.try5-opt2.pdb.gz     5.6      162.98
T0375.try2-opt2.pdb.gz     3.6      163.89
T0375.try10-opt2.pdb.gz    4.3      164.51
T0375.try3-opt2.pdb.gz     5.3      166.25
T0375.try1-opt2.pdb.gz     6.5      170.41
T0375.try9-opt2.pdb.gz     2.7      188.93
T0375.try4-opt2.pdb.gz     6.2      193.23
================================================================

Here is the current order of increasing break score for monomer
opt2's:

model                      break    score
================================================================
T0375.try9-opt2.pdb.gz     2.7      188.93
T0375.try2-opt2.pdb.gz     3.6      163.89
T0375.try13-opt2.pdb.gz    3.8      158.79
T0375.try7-opt2.pdb.gz     4.0      160.88
T0375.try10-opt2.pdb.gz    4.3      164.51
T0375.try8-opt2.pdb.gz     4.8      162.27
T0375.try11-opt2.pdb.gz    5.0      159.08
T0375.try12-opt2.pdb.gz    5.1      158.58
T0375.try3-opt2.pdb.gz     5.3      166.25
T0375.try6-opt2.pdb.gz     5.3      161.03
T0375.try5-opt2.pdb.gz     5.6      162.98
T0375.try4-opt2.pdb.gz     6.2      193.23
T0375.try1-opt2.pdb.gz     6.5      170.41
================================================================

Looking at the all.breaks.gz file, the models that stand out with the
fewest and least severe breaks (in no particular order) are try7, 9,
and 13.

try9 does well enough with breaks, but it is near the bottom in terms
of unconstrained score. Also, it uses the 1vi9 template. Perhaps we
should include this model, for variety in models. The tetramer it
makes with the 1vi9 template is not very good, though. The tetramer it
makes with the 1rk2 template is not any better. It has no clashes or
knots, but the chains don't look like they're fitting tightly AT ALL.

Our top monomers will be:

Model 1 is T0375.try13-opt2.pdb.
It is not the top scoring model with breaks or even the overall
unconstrained cost function, but we believe that it is the best model
that we have. Out of all the models that weren't based on the c.72.1.5
family, it does the best in terms of breaks. It has the second best
unconstrained cost. This model started from the automatic alignments,
and went through several rounds of optimizing and polishing. The third
and fifth rounds of optimization started with the gromacs models to
remove breaks. Also, try5 added some helix constraints that joined
some short, neighboring sections of predicted helices. Not only did
try13 start with a gromacs model, but also the probabilities for the
undertaker methods were increased for minimizing gaps and breaks. This
monomer makes a nice-looking tetramer using 1rk2 for a template.

Model 2 is T0375.try7-opt2.pdb.
It scores decently with breaks as well as overall. It is based on the
automatic alignments, with several rounds of optimizing and polishing.
It also makes a decent-looking tetramer using the 1rk2 template, with
nice interfaces between the chains (especially in the sheets).

Model 3 is T0375.try12-opt2.pdb.
This model is based on the same automatic alignments that
T0375.try13-opt2.pdb started from. However, the optimization runs were
slightly different. This model never used a gromacs starting model.

Model 4 is T0375.try8-opt2.pdb.
This model originated from the top three alignments in the automatic
run (T0375.best-scores.html). The IDs of the chains were 1v1aA, 2fv7A,
and 1rkd. This model underwent one round of polishing.

Model 5 is T0375.try9-opt2.pdb.
This model is different from most of the other ones we've been working
on. It was made from alignments to 1vi9A, 1td2A, and 1lhpA. These are
from the SCOP family called c.72.1.5. This model underwent one round
of polishing from a gromacs starting point.
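
A minimal sketch of how the two orderings in the tables above can be reproduced, assuming (as the "second best unconstrained cost" remark suggests) that lower is better for both columns. The numbers are copied from the tables; the MODELS dictionary and the two helper names are made up for this illustration and are not part of the undertaker tooling.

```python
# model -> (break score, unconstrained score), copied from the tables above.
MODELS = {
    "try12": (5.1, 158.58),
    "try13": (3.8, 158.79),
    "try11": (5.0, 159.08),
    "try7": (4.0, 160.88),
    "try6": (5.3, 161.03),
    "try8": (4.8, 162.27),
    "try5": (5.6, 162.98),
    "try2": (3.6, 163.89),
    "try10": (4.3, 164.51),
    "try3": (5.3, 166.25),
    "try1": (6.5, 170.41),
    "try9": (2.7, 188.93),
    "try4": (6.2, 193.23),
}


def order_by_score(models):
    """Model names sorted by increasing unconstrained score."""
    return sorted(models, key=lambda name: models[name][1])


def order_by_break(models):
    """Model names sorted by increasing break score (ties broken by name)."""
    return sorted(models, key=lambda name: (models[name][0], name))


if __name__ == "__main__":
    by_score = order_by_score(MODELS)
    by_break = order_by_break(MODELS)
    for name in ("try13", "try7", "try12", "try8", "try9"):
        print(name,
              "score rank", by_score.index(name) + 1,
              "break rank", by_break.index(name) + 1)
```

Printing both ranks side by side makes the trade-off behind the model order easier to see: try13 is second by unconstrained score and third by break score, while try9 is first by break score but near the bottom by unconstrained score.
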

Wed Aug 2 23:04:58 PDT 2006 Chris Wong
I did the submissions for monomers and tetramers, except for tetramer
contact predictions... I'm not sure if there is anything to submit for
that. Here's a copy of e-mail I sent:

From: "Chris Wong"
To: "Kevin Karplus", "Zack Sanborn", ...
Subject: submitting T0375
Date: Wed, 2 Aug 2006 22:36:46 -0700

Hi Everybody.

I have done the following submissions for T0375:
1) monomer contact predictions
2) monomer 3D files
3) tetramer 3D files

Do we need to submit contact predictions for the tetramers? I got an
error when I tried to do it.

I have made some logs containing the output of the submissions.
log for monomer:  casp7/T0375/submit.log
log for tetramer: casp7/T0375/4mer/submit.log
log for error:    casp7/T0375/4mer/submit_contacts_error.log

Also, I was not able to check my submissions on the website because I
don't have the group leader code. Somebody that has the code should
probably check to make sure the files have been accepted.

Chris

Thu Aug 3 16:17:17 PDT 2006 Kevin Karplus
Chris's 4mer submission failed, because he specified files like
dimer5.ts:
    $(call modelfullname_to_ts,decoys/T0375.try9-opt2.4mer-1vi9.pdb.gz,5)
but had no models with those names. He had incorrectly renamed them to
have an extra "unpack", which meant that the unpack-multimer script
was never run and so the models did not have 4 chains but only one
long chain.
I renamed without the unpack and ran the
    make dimer_modles
and
    make email_dimers
which should be ok now.

Thu Aug 3 20:45:48 PDT 2006 Kevin Karplus
The scripts still fail, because Chris got the MONOMER_LENGTH wrong.
It should have been 296, the length of the monomer, not 1184.

Fri Aug 4 12:47:39 PDT 2006 Chris Wong
I just checked the model list viewer website, and it appears all our
submissions are on the list.
T0375TS010_1  T0375TS010_2  T0375TS010_3  T0375TS010_4  T0375TS010_5
T0375TS010_1o T0375TS010_2o T0375TS010_3o T0375TS010_4o T0375TS010_5o
T0375RR010_1
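
A minimal sketch of the arithmetic behind the MONOMER_LENGTH mix-up in the Aug 3 entries: 1184 is the length of the whole tetramer (4 x 296), so using it as the monomer length leaves one long chain instead of four. The helper split_chains is hypothetical and only illustrates the idea; it is not the unpack-multimer script, whose internals are not shown in this log.

```python
def split_chains(residues, monomer_length):
    """Cut one long residue list into consecutive chains of equal length."""
    if monomer_length <= 0 or len(residues) % monomer_length != 0:
        raise ValueError("total length must be a multiple of monomer_length")
    return [residues[start:start + monomer_length]
            for start in range(0, len(residues), monomer_length)]


TOTAL_LENGTH = 1184     # one long chain, as in the failed submission
MONOMER_LENGTH = 296    # length of the T0375 monomer, per the Aug 3 entry

# Placeholder residue indices standing in for the real model's residues.
long_chain = list(range(TOTAL_LENGTH))

chains = split_chains(long_chain, MONOMER_LENGTH)
not_split = split_chains(long_chain, TOTAL_LENGTH)

if __name__ == "__main__":
    print(len(chains), "chains of", len(chains[0]), "residues")
    print(len(not_split), "chain of", len(not_split[0]), "residues")
```

With MONOMER_LENGTH set to 296 the long chain splits into four chains; with 1184 it stays a single chain.
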