14 May 1998 Kevin Karplus
t53 is CBIK_SALTY, and target98 finds NOTHING else in NRP as a reasonably close homolog. However, the remote homolog search of PDB gets a moderate hit (-4.670) on 20 sequences (FSSP representative 1djxB). 1djxB has some structure homologs that are not sequence homologs, so an fssp alignment for 1djxB would be worth building.

Using the structure models, there are two stronger hits: 1pfkA (-7.370) and 1iceA (-4.830). Summing in both directions doesn't pull anything up higher than these. The best hit (-7.370) is in a range with 22 true positives and 50 false positives (69% new fold) for SCOP domains, and 33.75 true with 50 false (60% new fold).

28 July 1998 Christian
Moved everything into the "old" directory and remade with newest Makefile.

18 August 1998 Kevin Karplus
Moved everything into the "old2" directory and remade with newest Makefile.

wu-blast finds nothing (top hit 1erd +1.8).
double-blast finds nothing.

t53.t98_6 finds 1djxB (and the identical sequences) quite strongly (-27.37). Next highest hit is weak (1div -5.83). The -27.37 score is in the range where fewer than 1% of the hits are false positives; -5.83 is in a range where 2/3 of the hits are false.

Including hits from "old" and "old2" adds 1pfkA (found by old target98) and 1erd (included in remote_4 of old2).
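
The "new fold" percentages quoted in the 14 May entry look like the share of hits in that score range that are false positives. A minimal Python sketch of that arithmetic follows; the interpretation and the function name false_fraction are assumptions, not taken from the scoring scripts.

```python
def false_fraction(true_hits, false_hits):
    """Fraction of hits in a score range that are false positives.

    Assumption: the "% new fold" figures in the notes are false / (true + false).
    """
    total = true_hits + false_hits
    if total <= 0:
        raise ValueError("score range contains no hits")
    return false_hits / total


# Counts quoted in the 14 May entry for the range containing the -7.370 hit.
scop_fraction = false_fraction(22, 50)
second_fraction = false_fraction(33.75, 50)
print(f"{scop_fraction:.0%} {second_fraction:.0%}")
```

Under this reading, the two ranges give roughly 69% and 60%, matching the figures in the entry.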
The best hits with library target98 models are
    1pfkA       -8.050  3pfk
    3pfk        -7.970  3pfk
    4pfk        -7.910  3pfk
    1pauA       -6.150  1pauA
    1fvkA       -5.720  1fvkA
    1iceA       -5.530  1pauA
    1lci        -5.180  1lci
    1minB       -4.900  3minB
    3minB       -4.850  3minB
    3pgm        -4.790  4pgmA
    1dsbA       -4.380  1fvkA
    1ice_1a1    -4.300  1pauA
    1cdg_4      -4.240  1pamA
    1afwA       -4.220  1afwA
    1deaA       -4.170  1deaA
    1amy        -4.160  1amy
    1yasA       -4.150  1yasA
    1ppi_2      -4.130  1smd
    1dsbA_2     -4.090  1fvkA
    1amy_2      -4.070  1amy
    1erd        -4.070  1erd
    1hpm_2      -4.050  1kay

Summing both ways leaves 1djxB at the top, but moves up 1pauA:
    1djxB       -27.03  1djxB
    1pauA       -9.21   1pauA
    1pfkA       -8.050  3pfk
    [34]pfk     -7.970  3pfk
    1div        -7.22   1div
    1erd        -6.77   1erd
    1fvkA       -5.720  1fvkA
    1iceA       -5.530  1pauA
    1ak1        -5.51   1ak1
    1ipsA       -5.5    1ipsA

FSSP Z-scores for a cluster of hits:
    1pauA  3pfk   3.9
    1pauA  1ak1   3.6
    1pauA  1djxB  2.7
    3pfk   1ak1   2.3

Since t53 is a cobalt chelatase, we should find a metal-binding site in it. Note 1ak1 is a ferrochelatase.

Top scores from t53-t98-mixed library (using w0.5):
    [34]pfk     -8.370  3pfk
    1pamA       -7.580  1pamA
    1lci        -6.460  1lci
    2taaA       -6.070  2aaa
    1pauA       -6.040  1pauA
    2pfkD       -5.680  3pfk
    1cdg        -5.670  1pamA
    1jdbK       -5.340  ?
    1cgt        -5.240  1pamA

The top viterbi scores in the target98 library are
    1smd        -7.670  1smd
    1amy        -7.290  1amy
    2aaa_2      -6.900  2aaa
    2taaA       -6.820  2aaa
    1ppi_2      -6.810  1smd
    1cdg_4      -6.610  1pamA
    2aaa        -6.600  2aaa
    3pgm        -6.600  4pgmA
    1amg_2      -6.570  ?
    1amy_2      -6.430  1amy
    1lci        -6.360  1lci
    1gpc        -6.320  1gpc
    1vjs        -6.170  1vjs
    1pamA       -6.080  1pamA
    1poxA       -5.920  1poxA
    1pauA       -5.880  1pauA

t53.remote_4 finds
    1djxB        -49.030  (and other copies)
    1seb[BF]     -37.170  1iakB
    1dlh[BE]     -37.150  1iakB
    1aqd[BEHK]   -37.130  1iakB
    1iakB        -35.680  1iakB
    1ie[ab][BD]  -34.110  1iakB
    2sebB        -33.200  1iakB?
    1a6aB        -25.390  ?
    2cr[st]      -13.240  1tgxA

Here are the best non-self alignment scores:
    1djxB/t53-1djxB-vit            1djxB  561  -30.17  -30.58
    1djxB/t53-1djxB-global         1djxB  561  -27.84  -27.98
    1djxB/t53-1djxB-post           1djxB  561  -27.84  -27.98
    1ak1/1ak1-t53-global           T0053  264   -5.48  -13.15
    1ak1/1ak1-t53-post             T0053  264   -5.48  -13.15
    1fvkA/1fvkA-t53-global         T0053  264  -10.43  -12.50
    1fvkA/1fvkA-t53-post           T0053  264  -10.43  -12.50
    3pfk/3pfk-t53-global           T0053  264   -6.19  -11.86
    3pfk/3pfk-t53-post             T0053  264   -6.19  -11.86
    1pauA/1pauA-t53-global         T0053  264   -8.98  -11.69
    1pauA/1pauA-t53-post           T0053  264   -8.98  -11.69
    1pauA/1pauA-t53-fssp-global    T0053  264  -10.43  -11.14
    3pfk/3pfk-t53-const-global     T0053  264   -7.62  -10.04
    1ak1/t53-1ak1-global           1ak1   308   -8.33   -9.02
    1ak1/t53-1ak1-post             1ak1   308   -8.33   -9.02
    1fvkA/1fvkA-t53-fssp-global    T0053  264   -7.22   -8.09
    2aaa/2aaa-t53-vit              T0053  264   -7.09   -7.72
    1amy/1amy-t53-vit              T0053  264   -6.63   -7.25
    1div/t53-1div-vit              1div   149   -5.77   -6.73
    1pamA/1pamA-t53-vit            T0053  264   -5.39   -6.22
    1pauA/1pauA-t53-vit            T0053  264   -4.12   -6.00
    1div/1div-t53-fssp-global      T0053  264   -2.67   -5.64
    1div/t53-1div-global           1div   149    0.33   -5.60
    1div/t53-1div-post             1div   149    0.33   -5.60
    1djxB/1djxB-t53-const-global   T0053  264   -2.91   -5.20
    3pfk/3pfk-t53-vit              T0053  264   -4.02   -5.16
    1erd/1erd-t53-vit              T0053  264   -5.52   -4.88
    1erd/t53-1erd-vit              1erd    40   -5.07   -4.88
    1erd/1erd-t53-fssp-global      T0053  264   -7.88   -4.74
    1erd/1erd-t53-global           T0053  264   -7.05   -4.57
    1erd/1erd-t53-post             T0053  264   -7.05   -4.57
    1fvkA/1fvkA-t53-vit            T0053  264   -4.46   -4.39
    1pauA/t53-1pauA-vit            1pauA  140   -3.94   -4.23
    1ak1/1ak1-t53-fssp-global      T0053  264    0.96   -4.11
    1div/1div-t53-vit              T0053  264   -3.19   -3.46
    3pfk/3pfk-t53-fssp-global      T0053  264    0.56   -3.00
    1pfkA/t53-1pfkA-global         1pfkA  320   -0.02   -2.70
    1pfkA/t53-1pfkA-post           1pfkA  320   -0.02   -2.70
    1ak1/t53-1ak1-vit              1ak1   308   -4.71   -2.62
    1smd/1smd-t53-vit              T0053  264   -1.75   -2.21
    1iakB/t53-1iakB-vit            1iakB  185   -1.87   -2.01

t53-1djxB-global:

T0053 .....MKKALLVVSFGTSYHDTCEKNIVACERDLAASCPDRDLFRAFTSGMIIRKLRQRDGIDI
                                                    F      R L
1djxB n326a----------------------------------------SFSESRALRLLQESGNGFV

T0053 DTPLQALQKLAAQGYQDVAIQSLHIINGDEYEKIVREVQLLRPLFTRLTLGVPLLSSHNDYVQL
            L      G                   IV
1djxB RHNVSCLSRIYPAGWRTDSSNYSPVEMWNGGCQIV-----------------------------

T0053 MQALRQQMPSLRQTEKVVFMGHGASHHAFAAYACLDHMMTAQRFPARVGAVESYPEVDILIDSL
        AL  Q P                                             PE D
1djxB --ALNFQTPG--------------------------------------------PEMDVYLGCF

T0053 RDEGVTGVHLMPLMLVAGDHAINDMASDDGDSWKMRFNAAGIPATPWLSGLGENPAIRAMFVAH
       D G  G  L P  L       N  A   G  W        I              I
1djxB QDNGGCGYVLKPAFLRDPNTTFNSRALTQGPWWRPERLRVRI--------------ISGQQLPK

T0053 LHQALNMAVEEAA-......
           N  V
1djxB VNKNKNSIVDPKV-i100d.

The part in the middle PEVD...GDSWK is also found by 1djxB-constr-t98 and 1djxB-fssp-global. The fssp-global alignment extends the alignment back quite a ways:

T0053 m40frAFTSGMIIRKLRQRDGIDID---------------------------------------
                           I  D
1djxB .....NKMNFKELKDFLKELNIQVDDGYARKIFRECDHSQTDSLEDEEIETFYKMLTQRAEIDR

T0053 ----------------------------------------------------------------
1djxB AFEEAAGSAETLSVERLVTFLQHQQREEEAGPALALSLIERYEPSETAKAQRQMTKDGFLMYLL

T0053 -------------------------------------------TPLQALQKLAAQGYQDVAI--
                                                     A       G
1djxB SADGNAFSLAHRRVYQDMDQPLSHYLVSSSHNTYLLEDQLTGPSSTEAYIRALCKGCRCLELDC

T0053 ----------------------------------------QSLHIINGDEYEKIVREVQLLRPL
                                                L   N    E        LR
1djxB WDGPNQEPIIYHGYTFTSKILFCDVLRAIRDYAFKASPYPVILSLENHCSLEQQRVMARHLRAI

T0053 F---------------------------------------------------------------
1djxB LGPILLDQPLDGVTTSLPSPEQLKGKILLKGKKLGGLDKLKLVPELSDMIIYCKSVHFGGFSSP

T0053 --TRLTLGVPLLSSHNDYVQLMQALRQQMPslRQTEKVVFMGHGASHHAFAAYACLDHMMTAQR
                   S      L Q         R           A             M
1djxB GTSGQAFYEMASFSESRALRLLQESGNGFV..RHNVSCLSRIYPAGWRTDSSNYSPVEMWNGGC

T0053 FPARVGAVESYPEVDILIDSLRDEGVTGVHLMPLMLVAGDHAINDMASDDGDSWK---------
                 PE D       D G  G  L P  L       N  A   G  W
1djxB QIVALNFQTPGPEMDVYLGCFQDNGGCGYVLKPAFLRDPNTTFNSRALTQGPWWRPERLRVRII

T0053 ----------------------------------------------------------------
1djxB SGQQLPKVNKNKNSIVDPKVIVEIHGVGRDTGSRQTAVITNNGFNPRWDMEFEFEVTVPDLALV

T0053 ----------------------MRFNAAGIPATPWLSGLGENPAIRAMFVAHLHQA-ln9aa.
                                  G      LS  G        FV    Q
1djxB RFMVEDYDSSSKNDFIGQSTIPWNSLKQGYRHVHLLSKNGDQHPSATLFVKISIQD-......

Need to examine these (and the other high-scoring alignments), to see if a metal-binding site is aligned to.

Wed Aug 19 12:07:56 PDT 1998 Kevin Karplus
The PEVD...GDSW alignment is NOT to either of the metal-binding sites in 1djxB, but is to the connection between two domains.

I modified the 1ak1-t53-global alignment to get 1ak1-t53-hand1, but I'm not completely convinced by it. It would be nice if the PDB file contained the metal ion, so I could check the binding pocket for conservation.

The 1fvkA-t53-global alignment has good residue conservation, and the deletion of a pair of helices leaves a very small gap in 3D, but there is considerable disagreement between the predicted and template secondary structures. Realigning it by hand (1fvkA-t53-hand1) creates an alignment with very good residue conservation and gap structure, and reasonable secondary structure match.

20 August 1998 Christian
The active site for 1fvkA is around Cys30, His32, Cys33. Cobalt, like most/all of the other positive ions, usually has cysteine and histidine ligands. I looked to see where these residues occur in 3D using 1fvkA-t53-hand1 and was encouraged to find that they occur in two fairly tight groups. By bringing the last chunk of 1fvkA that t53 was aligned to back a few residues, I lose an identity but find that three histidines (the two in "AHLHQ" and "DHA" near the very end of the alignment) cluster together tightly. Note that the N and C termini of 1fvkA are very near to each other and that we have not aligned the first 50 residues of t53 to the structure. Also note that there are 3 cysteines and 1 histidine in this stretch of residues. It's conceivable that those 50 residues form a structure that acts in concert with the cluster of three histidines. The other cluster occurs fairly close to the 1fvkA active site.
The alignment conserves His32. I made a 1 or 2 residue shift in the alignment to get what may be better placement. My alignment is 1fvkA-t53.cbarrett1.a2m. For me to "improve" it would involve maybe a 1-residue shift here or there, which I don't think is really that important and would involve some second-guessing that would more likely than not be wrong.

21 August 1998 Christian
I cut the first 60 residues from t53 in an effort to try to find a clearer structural signal for it. See directory t53-first/README. The top hits for this piece are to a metal-binding protein which, if tacked onto the front of t53/1hvkA/1fvkA-t53.cbarrett1.a2m, looks like it would make a nice metal-binding pocket.

Fri Aug 21 17:00:43 PDT 1998 Kevin Karplus
Christian's reasoning about 1fvkA looks good to me. I did modify the alignment slightly around VTGVHLM (about 140 residues from the beginning) in order to bring another histidine up to the cluster.

I don't much care for the 1ctl alignment he proposed for t53-first, but the t53-first-1ash.cbarrett1 alignment (up to RKLR) looks good: it could easily stack the helices so that the histidine/cysteine-rich areas at the possible metal-binding site in 1fvkA could be supplemented with the Cys at the turn between the helices of 1ash.
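
The 20 August count of 3 cysteines and 1 histidine in the unaligned first 50 residues of t53 can be checked against the sequence. A minimal Python sketch follows; the sequence is copied from the T0053 row of the t53-1djxB-global alignment in this README, positions are counted from the first residue shown there (an assumption about numbering), and the helper names are hypothetical.

```python
from collections import Counter

# First 59 residues of t53, copied from the T0053 row of the
# t53-1djxB-global alignment in this README (leading '.' padding removed).
T53_START = "MKKALLVVSFGTSYHDTCEKNIVACERDLAASCPDRDLFRAFTSGMIIRKLRQRDGIDI"


def ligand_counts(seq, n=50, ligands="CH"):
    """Count candidate metal-ligand residues (Cys, His) in the first n residues."""
    counts = Counter(seq[:n].upper())
    return {aa: counts[aa] for aa in ligands}


def ligand_positions(seq, n=50, ligands="CH"):
    """List (residue, 1-based position) for candidate ligands in the first n residues.

    Assumption: position 1 is the first residue of the sequence as given.
    """
    window = seq[:n].upper()
    return [(aa, i) for i, aa in enumerate(window, start=1) if aa in ligands]


counts = ligand_counts(T53_START)
positions = ligand_positions(T53_START)
print(counts)
print(positions)
```

On this fragment the counts agree with the note: three Cys (positions 18, 25, 33) and one His (position 15), all within a span of under 20 residues.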