2 June 1998 Kevin Karplus

T59 is not homologous to known structures, and no PDB files appear in
the target98 alignments.

The top scores searching with the target98 alignments are
    1dbp, 1drj, 1drk, 2dri (all at -4.520)    2dri is FSSP rep
    and 1sip (at -3.260).                     (1fmb is FSSP rep)

Many of the FSSP neighbors of 2dri are sugar-binding proteins, though
there are some other functions included as well; the fold seems to be
used for a lot of different purposes.

Top score with a library model is 1pk4 at -6.640 (fssp rep 5hpgA, which
did NOT score well).

Summing both directions doesn't improve any scores enough to make any
changes.

Top score of -6.640 implies
    domains 7T 12F 63% new (or 31T 50F 61% new)
    fssp 33.03T 50F 60% new

4 June 1998 Kevin Karplus

The 1pk4-t59-global alignment doesn't look too terrible as an alignment
(haven't checked in 3D yet).

19 August 1998 Christian

Remade with newest Makefile.

20 August 1998 Christian

Now there's a clear structural signal for this target, confirming what
was previously hinted at.

t59-sum98.rdb basically reiterates what all of the other rdb files
report:

TARGET  HMM     SCORE     FSSP REP  %IDE
10S     6S      10N
t59     1krn    -10.470   5hpgA     54%
t59     5hpgA   -10.340   5hpgA     100%
t59     1kdu    -10.270   5hpgA     39%
t59     1pmlA   -10.230   5hpgA     45%
t59     1pk4    -10.220   5hpgA     54%
t59     1dxgA   -5.77     1dkgA     100%
t59     1dbp    -5.130    2dri      99%
t59     1drj    -5.130    2dri      99%

Note that while all of blast's top hits are structures in the same FSSP
family,

T0059   1ceaA   -3.912023
T0059   1ceaB   -3.912023
T0059   1cebA   -3.912023
T0059   1cebB   -3.912023
T0059   1hpj    -3.912023
T0059   1hpk    -3.912023
T0059   1pkr    -3.912023

double-blast comes up with nothing.

I'm currently building the missing/incomplete model library directories.
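The summary table above is easier to read when collapsed to one best hit per FSSP representative. The following is a minimal sketch of that grouping, not part of the original pipeline. The rows are copied from the t59-sum98.rdb listing; the function name `best_per_family` is hypothetical, and the code assumes that more negative scores are better, which matches how "top scores" are used in these notes.

```python
# Rows copied from the t59-sum98.rdb summary: (hmm, score, fssp_rep).
HITS = [
    ("1krn", -10.470, "5hpgA"),
    ("5hpgA", -10.340, "5hpgA"),
    ("1kdu", -10.270, "5hpgA"),
    ("1pmlA", -10.230, "5hpgA"),
    ("1pk4", -10.220, "5hpgA"),
    ("1dxgA", -5.77, "1dkgA"),
    ("1dbp", -5.130, "2dri"),
    ("1drj", -5.130, "2dri"),
]


def best_per_family(hits):
    """Return {fssp_rep: (hmm, score)} with the most negative score per rep.

    Assumption: lower (more negative) scores are better. On ties the
    first row seen is kept.
    """
    best = {}
    for hmm, score, rep in hits:
        if rep not in best or score < best[rep][1]:
            best[rep] = (hmm, score)
    return best


if __name__ == "__main__":
    # Print families from best to worst score.
    for rep, (hmm, score) in sorted(best_per_family(HITS).items(),
                                    key=lambda item: item[1][1]):
        print(f"{rep:8s} best hit {hmm:6s} {score:8.3f}")
```

Grouping this way makes it plain that the signal comes from a single FSSP family (the 5hpgA homologs), with the 2dri family a distant second.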
Thu Aug 20 11:40:22 PDT 1998 Kevin Karplus

Here are the top current alignments (may need rechecking once
constrained alignments are built):

5hpgA/5hpgA-t59-global         T0059   75    -9.49  -11.87
5hpgA/5hpgA-t59-post           T0059   75    -9.49  -11.87
5hpgA/5hpgA-t59-fssp-global    T0059   75    -9.37  -11.50
1kdu/1kdu-t59-global           T0059   75    -8.75  -11.34
1kdu/1kdu-t59-post             T0059   75    -8.75  -11.34
1pk4/1pk4-t59-global           T0059   75    -9.54  -11.28
1pk4/1pk4-t59-post             T0059   75    -9.54  -11.28
5hpgA/5hpgA-t59-vit            T0059   75   -13.23  -11.23
1pmlA/1pmlA-t59-global         T0059   75    -8.43  -10.95
1pmlA/1pmlA-t59-post           T0059   75    -8.43  -10.95
1krn/1krn-t59-vit              T0059   75   -12.24  -10.83
1ceaA/1ceaA-t59-vit            T0059   75   -12.24  -10.80
1ceaA/1ceaA-t59-global         T0059   75    -8.55  -10.73
1ceaA/1ceaA-t59-post           T0059   75    -8.55  -10.73
1krn/1krn-t59-global           T0059   75    -9.09  -10.71
1krn/1krn-t59-post             T0059   75    -9.09  -10.71
1pmlA/1pmlA-t59-vit            T0059   75   -12.26  -10.64
1kdu/1kdu-t59-vit              T0059   75   -12.71  -10.59
1pk4/1pk4-t59-vit              T0059   75   -12.47  -10.25
1ceaA/t59-1ceaA-vit            1ceaA   80    -3.56   -4.77
1ceaA/t59-1ceaA-global         1ceaA   80     0.70   -4.65
1ceaA/t59-1ceaA-post           1ceaA   80     0.70   -4.65
1pmlA/t59-1pmlA-vit            1pmlA   86    -1.86   -2.51
1krn/t59-1krn-vit              1krn    79    -1.77   -2.27

I modified 5hpgA-t59-fssp-global slightly to get 5hpgA-t59-hand1, which
looks pretty good.

20 August 1998 Christian

I have been looking at the various alignments and have noticed that t59
does not align its first 19 residues. It also does not align to the last
strand and a half of any of the structures.

I suspect that the first 19 residues are the cellular targeting signal
and may not be part of the mature protein. My recollection is that most
such signals are cleaved off once the protein has been taken to its
proper cellular location.

That the last strand and a half of the structure is left unaligned is
not so discouraging, since we know that t59 is the N-terminal portion of
SMD3_HUMAN. So I think it is probably the case that the following
portions of SMD3_HUMAN fill in the critical beta structure.
(It would be interesting to do a secondary structure prediction for all
of SMD3_HUMAN and see if the following portion is indeed predicted to be
beta.) If this conjecture is true, I doubt that the t59 sequence
maintains the same shape alone as it does when part of the whole
protein.

I have an alignment that requires less manual adjustment than
5hpgA-t59-hand1: 5hpgA-t59.cbarrett1.a2m.

But better, I think, than either of the 5hpgA alignments is
1kdu-t59.cbarrett1.a2m.

A competing alignment that conserves essentially the same residues but
does differ in residue composition is t59-1pmlA.cbarrett1.a2m.

Then there's 1krn-t59.cbarrett1.a2m and 1ceaA-t59.cbarrett1.a2m. These
alignments are quite similar.

The differences between the alignments are subtle, as is expected since
the structures are similar:

STRID1  STRID2     Z  RMSD  LALI  LSEQ2  %IDE
5hpgA   1ceaA   16.6   0.7    79     80    58
5hpgA   1pk4    16.3   0.9    79     79    54
5hpgA   1krn    16.2   0.9    79     79    54
5hpgA   5hpgA   15.7   0.0    72     84   100
5hpgA   1pmlA   12.7   1.5    77     86    45
5hpgA   1kdu     7.7   2.6    76     85    39

Why 5hpgA shouldn't be the most similar to itself, I don't know.

Fri Aug 21 13:54:16 PDT 1998 Kevin Karplus

Doing search with smd3_human as target in subdirectory smd3_human:

wu-blast weakly finds 1hp[jk] -3.58
    (why not 1ceaA, which is nearly identical?)
double-blast finds nothing.

The predicted secondary structure is

>smd3_human
CCCCCCHHHHHHHCCCEEEEEEECCCEEEEEEECCCHCCEEEEEEEEECCC
CCCCCECCEEEEEECEEEEEEECCHCCCCHHHHHCCCCCCCCCCCCCCEE
CCCCCCCCCCCCCCCCCCCCCCCCC

which does not seem to have much beta structure where Christian wants
it.
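To check where the predicted strands actually fall, the prediction string can be scanned for runs of E and reported as residue ranges. This is only a sketch: the function name `strand_segments` and the `min_len` cutoff are assumptions for illustration, and the string is simply the three prediction lines above joined together.

```python
import re

# Predicted secondary structure for smd3_human, copied from the log
# (three lines joined; C = coil, H = helix, E = strand).
PREDICTED = (
    "CCCCCCHHHHHHHCCCEEEEEEECCCEEEEEEECCCHCCEEEEEEEEECCC"
    "CCCCCECCEEEEEECEEEEEEECCHCCCCHHHHHCCCCCCCCCCCCCCEE"
    "CCCCCCCCCCCCCCCCCCCCCCCCC"
)


def strand_segments(ss, min_len=3):
    """Return (start, end, length) for each run of 'E' in ss.

    Positions are 1-based and inclusive. Runs shorter than min_len are
    dropped; min_len is an arbitrary illustrative cutoff.
    """
    segments = []
    for match in re.finditer(r"E+", ss):
        length = match.end() - match.start()
        if length >= min_len:
            segments.append((match.start() + 1, match.end(), length))
    return segments


if __name__ == "__main__":
    for start, end, length in strand_segments(PREDICTED):
        print(f"strand {start:3d}-{end:3d}  ({length} residues)")
```

Comparing the printed ranges with the end of the t59 region shows directly whether any predicted strand lies in the portion of SMD3_HUMAN that follows t59.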
Here are the best alignments, based on the templates we were looking at
before:

5hpgA/5hpgA-smd3_human-global       smd3_human  126  -10.55  -12.82
1kdu/1kdu-smd3_human-global         smd3_human  126   -9.38  -11.81
5hpgA/5hpgA-smd3_human-fssp-global  smd3_human  126  -10.05  -11.76
1pk4/1pk4-smd3_human-global         smd3_human  126   -9.90  -11.50
1ceaA/1ceaA-smd3_human-global       smd3_human  126   -9.30  -11.32
1krn/1krn-smd3_human-global         smd3_human  126   -9.77  -11.30
1pmlA/1pmlA-smd3_human-global       smd3_human  126   -8.96  -11.26
5hpgA/5hpgA-smd3_human-vit          smd3_human  126  -12.71  -11.23
1krn/1krn-smd3_human-vit            smd3_human  126  -11.72  -10.83
1ceaA/1ceaA-smd3_human-vit          smd3_human  126  -11.72  -10.80
1pmlA/1pmlA-smd3_human-vit          smd3_human  126  -11.74  -10.64
1kdu/1kdu-smd3_human-vit            smd3_human  126  -12.19  -10.59
1pk4/1pk4-smd3_human-vit            smd3_human  126  -11.96  -10.25

There are now
    smd3_human/5hpgA/5hpgA-smd3_human-hand1
    smd3_human/1kdu/1kdu-smd3_human-hand1
    smd3_human/1pk4/1pk4-smd3_human-hand1
    smd3_human/1ceaA/1ceaA-smd3_human-hand1

Of these, I like 5hpgA best, though 1kdu is possible.

One problem: there are only 2 cys residues in smd3_human, and they are
conserved in the various alignments, but in the templates they both have
sulphur bridges to other cys residues, which are not present in
smd3_human. It seems unlikely that this rather contorted structure could
be stable when missing the three sulphur bridges that hold it together.

The top-scoring sequences for smd3_human.t98_6 are

smd3_human  1dbp        -5.900  2dri
smd3_human  1drk        -5.330  2dri
smd3_human  1urp[ABCD]  -5.320  ?
smd3_human  2dri        -5.320  2dri
smd3_human  1drj        -5.290  2dri
smd3_human  1sip        -4.370  1fmb
smd3_human  1ba2[AB]    -4.360  ?

2dri does not do too badly:

2dri/smd3_human-2dri-global         2dri        271   -9.83  -11.91

This is a very different alignment than the 5hpgA (and homologs) ones.
It does have a nice conservation pattern at one end of the beta sheet.
I fussed around to create smd3_human-2dri-hand1, but I'm not sure it's
really any better than the automatic alignment.
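To see where the 2dri alignment falls among the 5hpgA-family alignments, the listing lines can be parsed and re-sorted. This is a rough sketch. The field names below are hypothetical, since the log does not label the columns; it only assumes five whitespace-separated fields per line (alignment path, sequence id, length, and two scores) and that more negative scores rank first.

```python
def parse_alignment_line(line):
    """Split one listing line into a dict.

    The keys are illustrative names, not labels from the original tools.
    """
    path, seq_id, length, score_a, score_b = line.split()
    return {
        "path": path,
        "seq": seq_id,
        "length": int(length),
        "score_a": float(score_a),
        "score_b": float(score_b),
    }


def rank(lines, key="score_b"):
    """Parse non-blank lines and sort them by the chosen score column,
    most negative first."""
    rows = [parse_alignment_line(line) for line in lines if line.strip()]
    return sorted(rows, key=lambda row: row[key])


if __name__ == "__main__":
    # Three lines copied from the listings in this entry.
    sample = [
        "1kdu/1kdu-smd3_human-global smd3_human 126 -9.38 -11.81",
        "2dri/smd3_human-2dri-global 2dri 271 -9.83 -11.91",
        "5hpgA/5hpgA-smd3_human-global smd3_human 126 -10.55 -12.82",
    ]
    for row in rank(sample):
        print(f"{row['path']:36s} {row['score_a']:7.2f} {row['score_b']:7.2f}")
```

Sorting on the last column puts the 2dri global alignment between the 5hpgA and 1kdu global alignments.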
Best with the target98 library are still the 5hpgA homologs:

1krn        -10.190  5hpgA
5hpgA       -10.020  5hpgA
1kdu         -9.910  5hpgA
1pk4         -9.910  5hpgA
1pmlA        -9.900  5hpgA
1dxgA        -4.130  1dxgA
1lxdA        -3.950  1lxdA
1pjr         -3.760  1pjr

Summing both ways:

1krn        -10.190  5hpgA
5hpgA       -10.020  5hpgA
1kdu         -9.910  5hpgA
1pk4         -9.910  5hpgA
1pmlA        -9.900  5hpgA
1dbp         -5.900  2dri
1drk         -5.330  2dri
1urp[ABCD]   -5.320  ?
1drj         -5.290  2dri
1dxgA        -5.2    1dxgA
2dri         -5.05   2dri
1sip         -4.370  1fmb
1ba2[AB]     -4.360  ?
1az5         -4.170  ?
1yt[ij]A     -4.170  1fmb

Fssp gets a few good hits:

5hpgA       -11.790
1pdnC        -7.170
1dhpA        -6.030
1pov1        -5.530
1iceB        -5.420
1ife         -4.810
1ctf         -4.690

Here are the top alignments:

5hpgA/5hpgA-smd3_human-global       smd3_human  126  -10.55  -12.82
2dri/smd3_human-2dri-global         2dri        271   -9.83  -11.91
1kdu/1kdu-smd3_human-global         smd3_human  126   -9.38  -11.81
5hpgA/5hpgA-smd3_human-fssp-global  smd3_human  126  -10.05  -11.76
1pk4/1pk4-smd3_human-global         smd3_human  126   -9.90  -11.50
1ceaA/1ceaA-smd3_human-global       smd3_human  126   -9.30  -11.32
1krn/1krn-smd3_human-global         smd3_human  126   -9.77  -11.30
1pmlA/1pmlA-smd3_human-global       smd3_human  126   -8.96  -11.26
5hpgA/5hpgA-smd3_human-vit          smd3_human  126  -12.71  -11.23
1krn/1krn-smd3_human-vit            smd3_human  126  -11.72  -10.83
1ceaA/1ceaA-smd3_human-vit          smd3_human  126  -11.72  -10.80
1pmlA/1pmlA-smd3_human-vit          smd3_human  126  -11.74  -10.64
1kdu/1kdu-smd3_human-vit            smd3_human  126  -12.19  -10.59
1pk4/1pk4-smd3_human-vit            smd3_human  126  -11.96  -10.25
1dxgA/1dxgA-smd3_human-vit          smd3_human  126   -5.45   -6.21
1dxgA/1dxgA-smd3_human-fssp-global  smd3_human  126   -6.80   -5.36
1dxgA/1dxgA-smd3_human-global       smd3_human  126   -5.82   -4.86
2dri/smd3_human-2dri-vit            2dri        271   -4.73   -4.86
1pdnC/1pdnC-smd3_human-fssp-global  smd3_human  126   -4.68   -4.76
1ceaA/smd3_human-1ceaA-global       1ceaA        80   -1.63   -4.63
1fmb/1fmb-smd3_human-fssp-global    smd3_human  126   -1.63   -4.08
1krn/smd3_human-1krn-vit            1krn         79   -1.71   -3.08
1pk4/smd3_human-1pk4-vit            1pk4         79   -1.71   -3.08
1dxgA/smd3_human-1dxgA-vit          1dxgA        36   -2.12   -3.02

23 August 1998 Christian

Creating
t59.remote_4-t98-mixed.rdb, since it contains some remote homologs that
are also mentioned in literature.

24 August 1998 Christian

PMID:9417867 finds that sm-D3 forms a complex with sm-B. The directory
smB contains a target search for this protein. As a long shot, maybe it
will be similar to the portion of 3dri that we are not aligning this
target to.

double-blast
smB  1thm  -4.13517  2.3e-05  0.016  gi|553653_30:110
    Zscore of 7.6 to 2dri and 13 %IDE over 129 residues

While I could find nothing that smB liked in the 2dri FSSP file, I could
find a few things that came up in the 5hpgA FSSP file:

smB-t98.rdb:5hpgA  smB  varh50  0  2  -1.620
smB-t98.rdb:1pk4   smB  varh50  0  2  -1.610
smB-t98.rdb:1krn   smB  varh50  0  2  -1.690
smB-t98.rdb:1pmlA  smB  varh50  0  2  -1.550
smB.t98_6-varh50-pdb.rdb:smB  2hpqH  varh50  0  2  -1.690
smB-t98.rdb:1kdu   smB  varh50  0  2  -1.620

From karplus@cse.ucsc.edu Tue Aug 25 09:25:10 1998
Return-Path: karplus@cse.ucsc.edu
Date: Tue, 25 Aug 1998 09:25:09 -0700
From: Kevin Karplus
To: cbarrett@cse.ucsc.edu
CC: karplus@cse.ucsc.edu
Subject: 5hpgA and cystines

I looked at 5hpgA.target98-pdb to see if there were any homologs that
did not have the 6 cystines that I thought were essential to stabilizing
the kringle. In the approximately 400 sequences, not one had a
substitution in any of the 6 places. A few had gaps, but these seemed to
be either fragments or misalignments (the first C is very near the
beginning, and sometimes was eaten by the initial FIM).

So, I'm still convinced that those cystines are essential to the fold.

Kevin
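The check described in the message above (is any of the six positions ever substituted, as opposed to gapped?) can be sketched as a per-column tally. This is an illustrative sketch, not the procedure actually used. It assumes the alignment has already been reduced to equal-length rows of match columns, and that the column indices of the six positions are known. The function name and the toy data in the usage block are hypothetical.

```python
def classify_columns(aligned_seqs, columns, expected="C", gap_chars="-."):
    """Tally, for each column, how many rows are conserved, gapped, or
    substituted relative to the expected residue.

    Assumptions: all rows have the same length, and columns are 0-based
    indices into those rows.
    """
    report = {}
    for col in columns:
        counts = {"conserved": 0, "gap": 0, "substituted": 0}
        for seq in aligned_seqs:
            residue = seq[col]
            if residue.upper() == expected:
                counts["conserved"] += 1
            elif residue in gap_chars:
                counts["gap"] += 1
            else:
                counts["substituted"] += 1
        report[col] = counts
    return report


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    toy = ["ACDC", "AC-C", "ASDC"]
    for col, counts in classify_columns(toy, [1, 3]).items():
        print(col, counts)
```

Separating gaps from substitutions matters here, because a gapped row may only be a fragment or a misalignment, while a substitution would be real evidence against the residue being essential.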