Mon Aug 17 09:56:38 PDT 1998

wu-blast finds no obvious homologs (best 2bca=2bcb=1cdn=1clb 0.09531018)

double-blast finds possible:

    TARGET  HMM    SCORE     INTERE   FINALE  intermediate
    T0083   1lmb4  -4.07454  1.2e-16  0.017   gi|2982931_8:149
    T0083   1lrp   -4.07454  1.2e-16  0.017   gi|2982931_8:149
    T0083   1lmb3  -4.07454  1.2e-16  0.017   gi|2982931_8:149

There are no PDB chains in t83.t98_6.a2m.

Top hits with t83.t98_6 are

    1lliA   -14.180  1lmb3
    1lliB   -14.180  1lmb3
    1lmb3   -14.070  1lmb3
    1lmb4   -14.070  1lmb3
    1lrp    -14.070  1lmb3
    1ego     -6.980  1grx
    1egr     -6.980  1grx
    1grx     -6.910  1grx
    1yrnA    -6.560  1yrnA
    1neq     -6.220  1neq
    1ner     -6.220  1neq
    1akhA    -6.210  ?
    1ain     -6.020  1avc

The top hits (except 1ain) are all repressor-like DNA-binding domains.
This is strange, since the protein is not DNA-binding, but an enzyme catalyzing

    CC   -!- CATALYTIC ACTIVITY: CYANATE + BICARBONATE = CO(2) + CARBAMATE.
    CC   -!- SUBUNIT: THIS ENZYME MAY BE A COMPLEX OF FOUR OR FIVE SUBUNITS,
    CC       WHICH ARE DIMERS COVALENTLY LINKED THROUGH CYS-83.

(The unit is actually a homodecamer, not just a dimer.)

The top hits with the library models are

    1lliA    -6.330  1lmb3
    1lmb3    -6.110  1lmb3
    1nif_1   -5.240  1nif
    1cc5     -4.960  1cc5
    2or1L    -4.570  1r69
    1r69     -4.240  1r69
    1hgxA    -3.980  1hgxA

Summing both ways moves 1lliA and 1lmb3 to the top.

    t83  1lliA   -20.51
    t83  1lmb3   -20.18
    t83  1lliB   -14.180
    t83  1lmb4   -14.070
    t83  1lrp    -14.070
    t83  2or1L    -9.62
    t83  1r69     -9.29
    t83  1adr     -8.4

The fssp alignments choose a different top hit:

    1nulA    -8.630
    1hgxA    -6.780  (Z=12.8 1nulA)
    1roo     -6.740
    1dbrA    -6.580  (Z=19.4 1hgxA, Z=8.9 1nulA)
    1ecfB    -5.980  (Z=8.3 1nulA, Z=8.5 1hgxA, Z=7.7 1dbrA)
    2end     -5.330

The target98-mixed alignments get

    2yhx     -3.540  2yhx
    1uae     -3.530  1uae
    1hgxA    -3.420  1hgxA
    1lylA    -3.120  1lylA
    1rmi     -3.070  1rmi
    1kr[st]  -3.010  1krs
    1adr     -2.890  1r69

The alignment of residues 5-68 to 1lmb3 (1lmb3/t83-1lmb3-hand1.a2m) is excellent (17 conserved residues with gapless alignment, good 2ry match).
We may need to split off the remaining 88 for separate prediction.
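The "summing both ways" step for the whole target can be sketched in a few lines of Python. This is only an illustration, not the script actually used: the function name `sum_both_ways` is hypothetical, the scores are copied from the t83.t98_6 and library-model lists above, and a template missing from one list is assumed to contribute 0. Only templates whose scores are fully listed above are included; rows such as 2or1L and 1r69 in the summed table evidently drew on scores that are not shown, so they are left out.

```python
# Scores copied from the two hit lists above (more negative is better).
t98_6_hits = {
    "1lliA": -14.180,
    "1lliB": -14.180,
    "1lmb3": -14.070,
    "1lmb4": -14.070,
    "1lrp": -14.070,
}
library_hits = {
    "1lliA": -6.330,
    "1lmb3": -6.110,
}


def sum_both_ways(scores_a, scores_b):
    """Add per-template scores from two searches and rank the result.

    Assumption: a template absent from one list contributes 0.0.
    Returns (template, summed score) pairs, best (most negative) first.
    """
    combined = {}
    for name in set(scores_a) | set(scores_b):
        total = scores_a.get(name, 0.0) + scores_b.get(name, 0.0)
        combined[name] = round(total, 3)
    # Sort by score, then by name so that ties come out in a fixed order.
    return sorted(combined.items(), key=lambda item: (item[1], item[0]))


ranked = sum_both_ways(t98_6_hits, library_hits)
for name, score in ranked:
    print(f"{name:8s}{score:9.3f}")
```

With these inputs 1lliA and 1lmb3 come out on top, matching the first two rows of the summed table.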
PRODOM doesn't see this as a 2-domain protein, and gives only the 3 homologs as sharing the domain: cyns_ecoli, cyns_synp7, cyns_syny3.

The 1lliA alignments are essentially identical to the 1lmb3 ones, and I like the slightly higher residue ID on 1lmb3.

The 1nulA-t83-fssp-global alignment doesn't look great, but it is possible to get an adequate alignment (t83-1nulA-hand1).

Looking at just the tail raises wu-blast's opinion of 2bca-2bcb=1cdn=1clb to -1.31, but double-blast still finds nothing.

t83tail.t98_6 finds nothing very strong:

    1bgw           -4.730
    1pytD          -4.520
    1agj[AB]       -4.470
    1exfA          -4.470
    1mhl[CD]       -4.010
    1myp[CD]       -4.010
    1xas           -3.840
    1hrb           -3.700
    1hc[123456y]   -3.480
    1bmf[ABC]      -3.420
    1cow[ABC]      -3.420
    1efr[ABC]      -3.420

The target98 library models don't find t83tail great either:

    1nif_1   -5.680
    2end     -4.490
    1nulA    -4.440
    1hmpA    -3.750
    1opr     -3.360
    1tap     -2.900
    1tcp     -2.900
    1urnA    -2.850
    1sxl     -2.560
    1kde     -2.420
    1r69     -2.300

Summing both ways moves 2end up:

    2end       -7.64
    1nif_1     -5.680
    1opr       -5.42
    1t[ac]p    -4.75   1tcp
    1bgw       -4.730
    1hcd       -4.64   1hce
    1pytD      -4.520  5ptp
    1agj[AB]   -4.470  1agjA
    1exfA      -4.470  1agjA
    1nulA      -4.440
    1mhl[CD]   -4.010  1mhlC
    1myp[CD]   -4.010  1mhlC
    1nif       -3.9

With the fssp alignments, the best for t83tail are

    1nulA    -9.300
    1hgxA    -7.700  lists 1nulA with Z=12.8 over 126 aa with 21 %IDE
    2end     -5.490
    1poiB    -4.880  lists 1hgxA with Z=2.6 over 94 aa with 4 %IDE
    1opr     -4.440  lists 1hgxA with Z=6.3 over 114 aa with 11% IDE
                     lists 1nulA with Z=6.2 over 117 aa with 10% IDE
    1gln     -3.450
    1bgc     -2.950

With target98 and viterbi scoring, the best are

    1hmpA   t83tail  -7.580  similar to 1hgxA,1nulA through 1dbrA FSSP
    2end    t83tail  -7.290
    1opr    t83tail  -6.490  similar to 1hgxA,1nulA as mentioned above
    1hgxA   t83tail  -6.360  *
    1nif_1  t83tail  -6.110
    1nulA   t83tail  -6.100  *
    1dbrA   t83tail  -5.830  *
    2u1a    t83tail  -5.470

Note, one of the close homologs of t83 is SINR_BACLI---a very close homolog of t64!
This homology is only evident in the first part of the sequence---the part that looks like a DNA-binding site.
In the t83tail.remote_4, some PDB files come in: 1ppt and 5icb (and 5icb homologs).

25 August 1998 Christian

T0083 forms a dimer that is stabilized by an interchain disulfide at position 83 in the Swissprot sequence, "gCi". Five of these dimers then form the homodecamer. But while the disulfide occurs in the dimer, it does not in the decamer. Substituting for the single H and C residues still gives an active enzyme; however, H113N, H113Y, and C83G were unstable. Mutating C to N, V, L, Y, or S preserved stability.

There is a clustering around the structures 1nulA, 1hmpA, 1hgxA, 1opr, and 1dbrA.

Circular dichroism shows that T83 has significant amounts of both alpha and beta structure. If the 2ary structure prediction for T83 is to be believed, then it is probably an alpha+beta protein.

The 1lmb3 alignment is promising because it has lots of residue identity with few gaps and matches the helix 2ary structure prediction for the 1st half of t83 fairly well. Its downside is that 1lmb3 is a DNA-binding protein with no beta.

1nulA is an alpha/beta protein. The cysteine doesn't occur in a place where t83 would dimerize as 1nul does, and indeed the alignment buries it.

1hgxA: the alignment 1hgxA-t83.cbarrett1.a2m has 29 identities with 5 indel positions, but it buries the cysteine.

1opr-t83.cbarrett1.a2m is another remote possibility, but it probably would need a bit more fiddling.

1ecfB-t83.cbarrett1.a2m aligns to one domain of chain B. The alignment minimizes the protein but preserves the core, with conservation interesting enough to make the lack of further information frustrating. The cysteine and histidine residues, though, are probably a little too buried.

1ain is an all-alpha protein for which the alignment is not very compact.

I feel that this is probably a new fold. If we are going to submit, I think it should be 1lmb3 or one of 1nulA/1hgxA/1opr.

29 August 1998 Kevin Karplus

I favor a 2-domain submission, with 1lmb3 for the first domain.
Here are the top-scoring alignments for the second domain:

    1ppt/1ppt-t83tail-fssp-global    t83tail   92  -10.32  -6.98
    1nif/t83tail-1nif-global         1nif     333   -7.81  -6.06
    1nif/t83tail-1nif-post           1nif     333   -7.81  -6.06
    1dbrA/1dbrA-t83tail-vit          t83tail   92   -5.22  -5.91
    1poiB/t83tail-1poiB-global       1poiB    260   -5.12  -5.88
    1poiB/t83tail-1poiB-post         1poiB    260   -5.12  -5.88
    1hgxA/1hgxA-t83tail-vit          t83tail   92   -5.65  -5.64
    1nulA/1nulA-t83tail-fssp-global  t83tail   92   -2.35  -5.54
    1opr/1opr-t83tail-vit            t83tail   92   -4.30  -5.39
    1opr/t83tail-1opr-global         1opr     213   -6.33  -5.19
    1opr/t83tail-1opr-post           1opr     213   -6.33  -5.19
    1hgxA/1hgxA-t83tail-fssp-global  t83tail   92    0.74  -5.00
    1poiB/t83tail-1poiB-vit          1poiB    260   -3.16  -3.98
    2end/t83tail-2end-vit            2end     137   -4.59  -3.53
    1poiB/1poiB-t83tail-vit          t83tail   92   -4.41  -3.24
    1hgxA/t83tail-1hgxA-global       1hgxA    164   -2.51  -2.99
    1hgxA/t83tail-1hgxA-post         1hgxA    164   -2.51  -2.99
    1ppt/t83tail-1ppt-vit            1ppt      36   -3.93  -2.53
    2end/t83tail-2end-global         2end     137   -2.33  -2.37
    2end/t83tail-2end-post           2end     137   -2.33  -2.37
    1nif/t83tail-1nif-vit            1nif     333   -2.48  -2.09
    1dbrA/t83tail-1dbrA-vit          1dbrA    227   -1.77  -2.01

    5icb-t83-fssp-globa              T0083    156   -5.85  -0.99
    t83-5icb-vit.pw                  5icb      75   -4.09  -1.55

30 August 1998 Christian

1ppt is a 36-residue hormone (peptide).

1dbrA-t83tail.cbarrett1.a2m is an alignment with 17 identities & 4 deletions in a stretch of 64 amino acids. I don't really see any reason that it couldn't pack on top of 1lmb3.

1poiB is probably too distended.

5icb-t83tail-fssp-global.pw.a2m looks to be a pretty good hit to an EFhand motif. This really only takes care of the last 30 residues of t83tail and is all alpha. I don't know if cyanase requires calcium, but if so this is probably where it would bind it.

2end is just a fragment.

Current thinking is new fold. If we submit 1lmb3 and a second structure, that second structure must have some beta in it as we know that t83 is an alpha, beta mix and 1lmb3 is all alpha.
Fri Sep 4 12:32:51 PDT 1998 Kevin Karplus

Rechecking the joint models for possible hits to t83tail (and dropping the post models), the best-scoring ones are

    2end/2end-t83tail-vit            t83tail   92   -6.85  -7.31
    1ppt/1ppt-t83tail-fssp-global    t83tail   92  -10.32  -6.98
    1nulA/1nulA-t83tail-vit          t83tail   92   -5.52  -6.15
    1nif/t83tail-1nif-global         1nif     333   -7.81  -6.06
    1dbrA/1dbrA-t83tail-vit          t83tail   92   -5.22  -5.91
    1poiB/t83tail-1poiB-global       1poiB    260   -5.12  -5.88
    1nif/1nif-t83tail-vit            t83tail   92   -5.82  -5.78
    1hgxA/1hgxA-t83tail-vit          t83tail   92   -5.65  -5.64
    1nulA/1nulA-t83tail-fssp-global  t83tail   92   -2.35  -5.54
    1nulA/1nulA-t83tail-global       t83tail   92   -3.21  -5.44
    1opr/1opr-t83tail-vit            t83tail   92   -4.30  -5.39
    1opr/t83tail-1opr-global         1opr     213   -6.33  -5.19
    1hgxA/1hgxA-t83tail-fssp-global  t83tail   92    0.74  -5.00
    1poiB/t83tail-1poiB-vit          1poiB    260   -3.16  -3.98
    2end/t83tail-2end-vit            2end     137   -4.59  -3.53
    1poiB/1poiB-t83tail-vit          t83tail   92   -4.41  -3.24

2end-t83tail-vit gets 6 identical residues out of 15, and the secondary match is not great. Nothing significant here.

1ppt-t83tail-fssp-global gets 9 identical residues out of 5+16=21 aligned, nicely matching secondary structure, but the alignment is so short as to be almost useless (other than confirming that MYRFYEMLQVYGTTL is most likely a helix).

1nulA-t83tail-vit gets 7 identical residues out of 22 aligned--not enough to mean much. 1nulA-t83tail-fssp-global attempts to extend it, but the result doesn't seem much better than a random alignment. 1nulA-t83tail-global does worse.

t83tail-1nif-global gets 6+8=14 identical residues out of 29+27=56 aligned. The secondary structure matches well for the second part (a beta hairpin at DVKKVADPEGGERAVITL), but the overall alignment is not compact. 1nif-t83tail-vit gets 8 identical residues out of 20 (the beta hairpin again).

1dbrA-t83tail-vit gets 7 identical residues out of 22 aligned---this looks like the same piece 1nulA aligned.

t83tail-1poiB-global gets 6+10=16 identical residues out of 30+39=69.
Although the residue conservation is pretty good, the secondary structure match is poor and the final helix is not close to the rest in 3D.

1hgxA-t83tail-vit gets 8 identical residues out of 22 (same residues as 1nulA and 1dbrA---not too surprising, since all three of these have similar sequences and structures here [see 1dbrA.fssp.a2m]). This is a binding site for all three, though 1hgxA has a phosphate, 1dbrA has magnesium, and 1nulA has magnesium oxide (Mg O5) and sulfate. It may be worth reporting the sequence IDDRIPTDPTMYRFYEMLQVYGTTL as a probable binding site. The Ts of IPTDPTMY appear to be the residues holding the phosphate in 1hgxA.

The 1opr alignment picks out the same region (binding magnesium and 3 phosphates), with 8 identical residues out of 28.

Looking at the best alignments for the whole target again:

    1lliA/t83-1lliA-global           1lliA     89  -13.01  -16.17
    1lmb3/t83-1lmb3-global           1lmb3     87  -12.73  -15.72
    1lliA/t83-1lliA-vit              1lliA     89  -14.91  -14.43
    1lmb3/t83-1lmb3-vit              1lmb3     87  -14.67  -13.89
    2end/2end-t83-fsspt98-global     T0083    156   -8.40  -11.58
    2end/2end-t83-global             T0083    156   -7.54  -10.82
    2end/2end-t83-const-global       T0083    156   -7.95  -10.52
    1r69/1r69-t83-global             T0083    156  -11.51  -10.45
    1lmb3/1lmb3-t83-global           T0083    156  -11.95   -9.36
    1nulA/1nulA-t83-fssp-global      T0083    156   -8.34   -9.03
    1lmb3/1lmb3-t83-const-global     T0083    156  -10.93   -9.02
    1lmb3/1lmb3-t83-fsspt98-global   T0083    156  -10.54   -8.69
    1ain/t83-1ain-global             1ain     314  -10.92   -8.17
    1opr/1opr-t83-fssp-global        T0083    156   -4.74   -8.04
    1lmb3/1lmb3-t83-vit              T0083    156   -7.81   -7.63
    1hgxA/1hgxA-t83-fssp-global      T0083    156   -7.67   -7.34
    2end/2end-t83-fssp-global        T0083    156   -6.69   -7.27
    1ppt/1ppt-t83-fssp-global        T0083    156  -10.34   -6.91
    1r69/1r69-t83-vit                T0083    156   -7.26   -6.78
    1hgxA/1hgxA-t83-global           T0083    156   -3.91   -6.75
    1lmb3/1lmb3-t83-fssp-global      T0083    156  -10.32   -6.62
    2u1a/2u1a-t83-global             T0083    156   -3.10   -6.38
    1opr/1opr-t83-global             T0083    156   -2.21   -5.89
    2end/2end-t83-vit                T0083    156   -6.33   -5.88
    1roo/1roo-t83-vit                T0083    156   -3.66   -5.80
    1nif/1nif-t83-vit                T0083    156   -5.29   -5.78
    1r69/t83-1r69-global             1r69      63   -2.64   -5.77
    1hgxA/1hgxA-t83-vit              T0083    156   -5.12   -5.64
    2u1a/2u1a-t83-vit                T0083    156   -4.37   -5.59
    1dbrA/1dbrA-t83-global           T0083    156    0.93   -5.47

The 1lliA and 1lmb3 alignments are still pretty good. The t83-1lmb3-hand1 alignment trims a little off the end of the global alignment and closes a gap in the first helix (getting 17 identical residues out of 68), and t83-1lmb3-hand2 aligns the end helix with a little more residue identity than the global alignment (getting 24 identical residues out of 11+60+6+8=85).

The 2end-t83-fsspt98-global alignment has decent residue conservation, but lots of unbridgeable gaps. The 2end-t83-global and 2end-t83-const-global are similar, but the 2end-t83-const-global seems to have the easiest gaps to bridge. It comes in three pieces with 6+6+6=18 identical residues out of 29+48+24=91. I'm not very impressed with this alignment.

The 1r69-t83-global alignment is much less convincing than the 1lmb3 alignments. If I'm going to predict a repressor structure, I'll stick with 1lmb3.

The 1nulA-t83-fssp-global alignment is a not-very-convincing extension of the short active-site alignment found by t83tail.

The 1ain/t83-1ain-global alignment is interesting, with 26 identical residues out of 129 aligned (with 3 inserts). The second chunk DGTGLA...DEDSILLL looks particularly good. It is a problem that 1ain is all-alpha, when we know that T0083 is not. This annexin has the calcium/phospholipid-binding repeat that is characteristic of annexins, and one of the binding sites (around GAKLDL) is in the conserved part of the alignment. Another one is right at the beginning of the conserved part (around DGTGLA). By fussing with the alignment, and unaligning the part at the end that I expect to be strands, I get an alignment with 7+12+5=24 identical residues out of 90 aligned (t83-1ain-hand1.a2m). This is about as good as the 1lmb3 alignment.
Note: 1ain is a c-alpha-only pdb file, so I used 1aeiD instead to look at the secondary structure (as far as I can tell from 1ain.target98-pdb.tree, 1aeiD is the closest other PDB file). Unfortunately, all it really gives us is a strong prediction of a single helix: FLAEAFVTAAL. Though I can modify the alignment to get more conservation, it never gets as good as 1ain.

I think I'll submit two models, one for 1lmb3 and one for 1ain, both with the caveat that functionally neither makes much sense. I'll also mention IDDRIPTDPTMYRFYEMLQVYGTTL as a possible binding site (from alignments with 1nulA, 1dbrA, 1hgxA, and 1opr).

Fri Sep 4 16:09:36 PDT 1998

The r83.remote_4 model finds some very different hits (1gukA and 1gukB). 1gukA/t83-1gukA-vit gets 16 identical residues on a gapless alignment of 53 residues. By adding to the beginning, I can get up to 23 identical residues, with one 2-residue gap and one 7-residue insert (t83-1gukA-hand1). This looks about as good as the 1lmb3 and 1ain alignments. The one advantage 1gukA has is that it is a metabolic enzyme.
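
To put the "about as good as" comparisons on one scale, the identity counts quoted in these notes can be turned into percent identity. The following is a small Python sketch, not part of the original analysis: the helper `percent_identity` is hypothetical, and the (identical, aligned) pairs are copied from the text. t83-1gukA-hand1 is left out because the notes give its 23 identities but not its aligned length.

```python
# (identical residues, aligned residues) as quoted in the notes.
alignments = {
    "t83-1lmb3-hand1": (17, 68),
    "t83-1lmb3-hand2": (24, 11 + 60 + 6 + 8),
    "t83-1ain-global": (26, 129),
    "t83-1ain-hand1": (7 + 12 + 5, 90),
    "t83-1gukA-vit": (16, 53),
}


def percent_identity(identical, aligned):
    """Percent of aligned positions that carry the same residue."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * identical / aligned


identity = {
    name: percent_identity(*counts) for name, counts in alignments.items()
}

# Print from highest to lowest percent identity.
for name, pct in sorted(identity.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s}{pct:5.1f}%")
```

Percent identity ignores alignment length, gaps, and secondary-structure agreement, so it is only a rough cross-check on the judgments above, not a replacement for them.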