12 May 2000 Kevin Karplus

No good hits. Top two with target models are 1agnA and 2ohxA, which are
70% identical (2ohxA is FSSP representative).

    1agnA is human sigma alcohol dehydrogenase (sigma adh)
    2ohxA is alcohol dehydrogenase    2.33.1, 3.2.1

What is a PPase anyway? The homologs seem to be labeled
"exopolyphosphatase" or just "hypothetical protein". Should we be looking
for a phosphatase domain? Perhaps one of Saira's models would help here.

There are a lot of homologs found by t2k (around 40), with what looks like
good alignment, so the 2ry structure predictions should be pretty good.

15 May 2000 Kevin Karplus

Saira suggests looking at the RecJ family of single-strand DNAses, which
may match both T87 and the DNA helicase 1qhgB (FSSP rep=1pjr).
See saira/README.

Since I have a separate DNA-repair project, I'll build the RecJ alignments
in ~/pce/protein-predict/DNA-repair/recj, starting from recj_ecoli.

Note: the RecJ--1qhgB similarity is not close enough to find with blast,
unless thresholds are set very loose.

Double-blast finds 1gotG from RecJ---1gotG is a transducin gamma subunit.
Its fssp representative is now 1tbgE, which has identical residues, but is
10 residues longer.

I could try forcing a T87-1pjr match, though 1pjr is longer (1qhgB is a
subunit of 1pjr, and they agree to 0.2 Angstroms over all 261 residues).
I could also do a basic-setup for 1qhgB, though we won't have an fssp
alignment for it (we could use the 1pjr.fssp.a2m file).

16 May 2000 Kevin Karplus

I did alignments and searches for RecJ using t99, and found a few possible
hits, but none that worked in both directions (template model and target
model).
The best hits were
    1ecl           target model    (1ecl is fssp template)    5.10.1
    1ubpC          template model  (1ubpC is fssp template)   2.86.1,3.1.8
    1cy[0124678]A  target model    (1ecl is fssp template)
    1zymA          template model  (1zymA is fssp template)   1.60.11,3.7.1
    1dik_2         template model  (SCOP domain) (1dik is fssp template)  3.7.1
    2fok[AB]       target model    (2fokA is fssp template)   1.4.4,3.47.1

All the target hits (1ecl, 1cy*A, 2fok[AB]) are similar structures.
The 1ecl and 1cy*A hits are DNA topoisomerase I, and 2fok[AB] is FokI
restriction endonuclease fragment.

The 1ubpC and 1dik hits are similar (assuming that the FSSP similarity is
to the 1dik_2 domain). 1ubpC is urease (urea aminohydrolase) and 1dik is
pyruvate phosphate dikinase.

The 1zymA hit is for enzyme I fragment of the phosphotransferase system.
It has considerable similarity to 1dik and very weak similarity to 1ecl.

Is the phosphotransferase an interesting similarity to T0087?

Tue Jun 6 13:35:29 PDT 2000
Redid 2ry prediction with new neural net.

8 June 2000
Looked at summary of CAFASP servers. There seems to be strong consensus
on two hits:
    3.2.1    (24 hits)  NAD(P)-binding Rossmann-fold domains
    4.140.1  (14 hits)  Lactate & malate dehydrogenases, C-terminal domain

Note: the alcohol dehydrogenases that were the top target-model hits have
3.2.1 domains. Lactate and malate dehydrogenases have 3.2.1 and 4.140.1
domains.

The 1pjr (1qhgB) suggestion is superfamily P-loop containing nucleotide
triphosphate hydrolases (3.31.1).

Main question: do we go with the Rossmann fold, like almost everyone else,
or do we try something more speculative?
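Tallies like the CAFASP summary above (24 hits for 3.2.1, 14 for 4.140.1) come down to counting SCOP codes across a hit list. A minimal Python sketch, assuming each hit carries a comma-separated SCOP field as in these notes. The function name `tally_scop` is made up for illustration, and the example input is the RecJ hit list from the 16 May entry, not the CAFASP server output (which is not recorded here).

```python
from collections import Counter

def tally_scop(hits):
    """Count how many hits carry each SCOP code.

    `hits` maps a hit ID to its SCOP field, e.g. "2.86.1,3.1.8".
    Hits with no SCOP assignment ("" or "?") are skipped.
    """
    counts = Counter()
    for scop_field in hits.values():
        if scop_field in ("", "?"):
            continue
        # A multi-domain chain may list a code twice; count it once per hit.
        counts.update(set(scop_field.split(",")))
    return counts

# RecJ best hits from the 16 May entry (SCOP codes as given in the notes).
recj_hits = {
    "1ecl": "5.10.1",
    "1ubpC": "2.86.1,3.1.8",
    "1cy[0124678]A": "",  # no SCOP code listed in the notes
    "1zymA": "1.60.11,3.7.1",
    "1dik_2": "3.7.1",
    "2fok[AB]": "1.4.4,3.47.1",
}
counts = tally_scop(recj_hits)
print(counts.most_common(1))  # [('3.7.1', 2)]
```

Counting each code once per hit keeps a chain with two copies of the same domain (such as 3.3.1,3.3.1) from inflating the tally.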
Mon Jun 26 09:48:54 PDT 2000
Remade 2ry predictions

Mon Jun 26 09:52:02 PDT 2000
Remade 2ry predictions

Sat Aug 26 15:25:26 PDT 2000 Kevin Karplus
Remade 2track predictions

Top hits (2track) are
    % Sequence ID  Length  Simple  Reverse  E-value  SCOP
    2yhx           457     -40.07  -21.59   2.6e-06  9.8.1.1.8
    1tgoA          773     -41.23  -20.68   7.1e-06  3.50.3,5.8.1
    1mrj           247     -31.14  -15.26   1.0e-03  4.143.1
    1qh8A          478     -37.22  -15.15   1.0e-03  3.81.1
    1bykA          255     -35.78  -15.08   1.0e-03  3.88.1
    1abrA          251     -29.19  -14.53   2.8e-03  4.143.1
    1cl2A          395     -32.26  -14.38   2.8e-03  3.62.1
    1qguA          478     -37.08  -14.24   2.8e-03  3.81.1
    1tdj           514     -34.74  -14.05   2.8e-03  3.73.1,4.48.17

Nothing looks particularly helpful here---no further confirmation of a
previous hit, just a lot of new weak predictions.

Fri Sep 1 10:11:05 PDT 2000 Kevin Karplus
Looked at the SWISSPROT homologs found in the t2k alignment.

PPAC_STRMU is top hit
DE   PROBABLE MANGANESE-DEPENDENT INORGANIC PYROPHOSPHATASE (EC 3.6.1.1)
DE   (PYROPHOSPHATE PHOSPHO-HYDROLASE) (PPASE).
CC   -!- CATALYTIC ACTIVITY: PYROPHOSPHATE + H(2)O = 2 ORTHOPHOSPHATE.
CC   -!- COFACTOR: REQUIRES MANGANESE FOR ITS ACTIVITY (BY SIMILARITY).
CC   -!- SUBCELLULAR LOCATION: CYTOPLASMIC (BY SIMILARITY).
CC   -!- SIMILARITY: BELONGS TO THE PPASE CLASS C FAMILY.

Y371_MYCPN
CC   -!- SIMILARITY: BELONGS TO THE MGPA / MG371 FAMILY.
(In Pfam DHH family)

PPAC_BACSU
DE   MANGANESE-DEPENDENT INORGANIC PYROPHOSPHATASE (EC 3.6.1.1)
DE   (PYROPHOSPHATE PHOSPHO-HYDROLASE) (PPASE).
CC   -!- CATALYTIC ACTIVITY: PYROPHOSPHATE + H(2)O = 2 ORTHOPHOSPHATE.
CC   -!- COFACTOR: REQUIRES MANGANESE FOR ITS ACTIVITY.
CC   -!- SUBCELLULAR LOCATION: CYTOPLASMIC.
CC   -!- MASS SPECTROMETRY: MW=34019; METHOD=MALDI.
CC   -!- SIMILARITY: BELONGS TO THE PPASE CLASS C FAMILY.

Y371_MYCGE
CC   -!- SIMILARITY: BELONGS TO THE MGPA / MG371 FAMILY.
(In Pfam DHH family)

MGPA_MYCIN
DE   MGPA PROTEIN.
CC   -!- SIMILARITY: BELONGS TO THE MGPA / MG371 FAMILY.
(In Pfam DHH family)

MGPA_MYCGE
DE   MGPA PROTEIN.
CC   -!- SIMILARITY: BELONGS TO THE MGPA / MG371 FAMILY.
(In Pfam DHH family)

PPAC_METJA
DE   MANGANESE-DEPENDENT INORGANIC PYROPHOSPHATASE (EC 3.6.1.1)
DE   (PYROPHOSPHATE PHOSPHO-HYDROLASE) (PPASE).
CC   -!- CATALYTIC ACTIVITY: PYROPHOSPHATE + H(2)O = 2 ORTHOPHOSPHATE.
CC   -!- COFACTOR: REQUIRES MANGANESE OR COBALT AGAINST FLUORIDE
CC       INHIBITION.
CC   -!- MASS SPECTROMETRY: MW=34169; METHOD=ELECTROSPRAY.
CC   -!- SUBCELLULAR LOCATION: CYTOPLASMIC.
CC   -!- SIMILARITY: BELONGS TO THE PPASE CLASS C FAMILY.

YG33_METJA
CC   -!- SIMILARITY: SOME, TO M.JANNASCHII MJ0988.
(In Pfam DHH family)

PPX1_YEAST
DE   EXOPOLYPHOSPHATASE (EC 3.6.1.11) (METAPHOSPHATASE).
CC   -!- FUNCTION: DEGRADATION OF INORGANIC POLYPHOSPHATES.
CC   -!- CATALYTIC ACTIVITY: (POLYPHOSPHATE)(N) + H(2)O =
CC       (POLYPHOSPHATE)(N-1) + ORTHOPHOSPHATE.
CC   -!- SIMILARITY: BELONGS TO THE PPASE CLASS C FAMILY.

YYBT_BACSU
DE   HYPOTHETICAL 74.3 KDA PROTEIN IN RPLI-COTF INTERGENIC REGION.
(In Pfam DHH family)

Y988_METJA
DE   HYPOTHETICAL PROTEIN MJ0988.
CC   -!- SIMILARITY: SOME, TO M.JANNASCHII MJ1633.
(In Pfam DHH family)

Pfam family PF01368 (DHH family) includes RecJ. The family is predicted to
have phosphoesterase function.

Interestingly, the t99 and t2k alignments for T0087 and RecJ don't seem to
overlap, though Pfam (which is usually more specific) claims both have the
DHH domain.

Note: the PPAC proteins themselves are not in the Pfam DHH family---just
the close hits we found for MGPA...

The DHH domain is often paired with the DHHA1 domain (PF02272), and is so
paired in both the RecJ and MGPA proteins.

Doing searches with both DHH and DHHA family models (after retuning seed
alignments to make rare sequences use insert states).

Blast hits for DHH look very weak, and do not contain the distinctive DHH
sequence that characterizes the domain.

Blast hits for DHHA have stronger signals, but may not be recognizing the
motif (top hit is a designed helix). The others may be finding just the
GGxxxA signal in a helix.
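Checking whether a blast hit actually contains the DHH or GGxxxA signal discussed above is a simple pattern scan. A minimal Python sketch, assuming plain ungapped one-letter sequences; the function name `find_motifs` and the test string are invented for illustration, and the two patterns are only the literal motifs named in these notes, not the Pfam HMMs.

```python
import re

# Motifs named in the notes; "x" in GGxxxA stands for any residue.
MOTIFS = {
    "DHH": r"DHH",
    "GGxxxA": r"GG...A",
}

def find_motifs(sequence, motifs=MOTIFS):
    """Return {motif name: [1-based start positions]} for an ungapped sequence."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in motifs.items():
        # Zero-width lookahead so overlapping occurrences are all reported.
        scanner = re.compile("(?=" + pattern + ")")
        hits[name] = [m.start() + 1 for m in scanner.finditer(seq)]
    return hits

# Made-up test string, not a real protein.
print(find_motifs("AADHHKGGLVPAGG"))  # {'DHH': [3], 'GGxxxA': [7]}
```

A literal scan like this only says whether the residues are present; it says nothing about whether they sit in the right structural context, which is the concern raised about the GGxxxA hits.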
01 Sep 2000 Rachel Karchin

best scoring alignments are:
    1mrj/T0087-1mrj-2track-global.pw.dist:1mrj 247 -17.81 -27.54 5.6e-12
    1bykA/T0087-1bykA-2track-global.pw.dist:1bykA 255 -28.24 -23.44 3.1e-10
    1qh8A/T0087-1qh8A-2track-global.pw.dist:1qh8A 478 -9.81 -22.92 8.4e-10
    2yhx/T0087-2yhx-2track-local.pw.dist:2yhx 457 -40.07 -21.59 2.3e-09
    1tgoA/T0087-1tgoA-2track-local.pw.dist:1tgoA 773 -41.23 -20.68 6.2e-09
    2yhx/T0087-2yhx-2track-global.pw.dist:2yhx 457 -19.78 -19.14 1.7e-08
    1bykA/T0087-1bykA-2track-local.pw.dist:1bykA 255 -35.78 -15.08 9.2e-07
    1mrj/T0087-1mrj-2track-local.pw.dist:1mrj 247 -31.14 -15.26 9.2e-07
    1qh8A/T0087-1qh8A-2track-local.pw.dist:1qh8A 478 -37.22 -15.15 9.2e-07
    2yhx/2yhx-T0087-fssp-global.pw.dist:T0087 310 -2.68 -10.90 1.4e-04
    2ohxA/2ohxA-T0087-local.pw.dist:T0087 310 -18.03 -7.20 2.2e-03
    2ohxA/2ohxA-T0087-fssp-global.pw.dist:T0087 310 -20.81 -6.80 3.3e-03
    2ohxA/2ohxA-T0087-vit.pw.dist:T0087 310 -9.55 -5.17 1.7e-02
    2ohxA/2ohxA-T0087-global.pw.dist:T0087 310 -110.62 -4.96 2.1e-02
    1tbgE/1tbgE-T0087-global.pw.dist:T0087 310 -67.83 -4.65 2.8e-02
    1tbgE/1tbgE-T0087-fssp-global.pw.dist:T0087 310 -96.76 -3.93 5.8e-02
    1pjr/1pjr-T0087-fssp-global.pw.dist:T0087 310 -58.32 -3.30 1.1e-01
    2ohxA/T0087-2ohxA-global.pw.dist:2ohxA 374 -137.92 -3.23 1.1e-01
    1pjr/T0087-1pjr-2track-local.pw.dist:1pjr 623 -26.74 -3.71 1.4e-01
    1qhgB/T0087-1qhgB-2track-local.pw.dist:1qhgB 261 -17.99 -3.04 1.4e-01
    2ohxA/T0087-2ohxA-vit.pw.dist:2ohxA 374 -4.14 -2.79 1.7e-01
    1pjr/1pjr-T0087-vit.pw.dist:T0087 310 -6.22 -2.55 2.2e-01
    1qhgB/1qhgB-T0087-vit.pw.dist:T0087 310 -6.43 -2.47 2.3e-01
    1pjr/1pjr-T0087-local.pw.dist:T0087 310 -14.77 -2.40 2.5e-01
    1qh8A/1qh8A-T0087-vit.pw.dist:T0087 310 -5.84 -2.96 3.6e-01
    1qhgB/T0087-1qhgB-local.pw.dist:1qhgB 261 -13.61 -1.96 3.7e-01

Comments:

T0087-1mrj-2track-global
Pretty good conservation and 2ry structure match.
Unaligned regions are a loop, a strand and most of a second strand.
This looks believable.
Unfortunately no conservation around the active site.
[Kevin: DHH, DHHA split looks plausible. D of DHH preserved.]

T0087-1bykA-2track-global
Good 2ry match. Pretty good conservation.
Deletions are in reasonable positions, at the end of a sheet, end of a
helix and on loops that could conceivably be shorter.
Three unaligned sheets.
[Kevin: Conserved DHH of DHH domain is in insert. Domain split doesn't
seem to match DHH, DHHA organization.]

Sun Sep 3 10:41:13 PDT 2000 Kevin Karplus

On Friday I ran searches for the pfam DHH and DHHA families, after
retuning with t2k to remove long inserts.

For DHH (pf01368),
blast gets weak hits
    1huuC  1.56.1
    1tle   7.3.11
double-blast gets hits
    1cby      4.86.1
    1e19[AB]  ?
    1b8wA     7.9.1
    1uroA     3.1.21
target model gets weak hits
    1c07A         ?
    1a69[ABC]     3.51.1
    1ecp[ABCDEF]  3.51.1
template model hits are stronger:
    1zymA  -14.22  0.0058  1.60.11,3.7.1
    1cec   -13.47  0.0123  3.1.7
    1ubpC  -13.19  0.0163  2.86.1,3.1.8
    1oasA  -12.53  0.0315  3.73.1
    1gca   -12.40  0.0359  3.88.1
    5tmpA  -12.35  0.0377  3.31.1
    1tph1  -12.10  0.0485  3.1.1
    1ceo   -12.02  0.0525  3.1.7
2track hits are weak (lambda is wrong for 2-track, so E-value too big)
    % Sequence ID  Length  Simple  Reverse  E-value  SCOP
    1moq           368     -32.82  -15.61   1.0e-03  3.74.1
    1tyfA          193     -33.17  -15.50   1.0e-03  3.11.1
    1bxkA          355     -33.11  -13.50   7.7e-03  3.2.1
    1dmr           823     -27.64  -12.43   2.1e-02  2.49.2,3.75.1
    1vsd           152     -25.16  -11.21   5.7e-02  3.50.3
    1bvh           157     -24.99  -11.08   5.7e-02  3.39.1
    1cl2A          395     -34.41  -11.06   5.7e-02  3.62.1

There doesn't seem to be any consistent prediction here.
Weak prediction for 3.1 (1cec,1ubpC,1tph1,1ceo templates)

For DHHA (pf02272),
blast gets modest hits:
    4hb1      11.7.1
    1qfnA     3.42.1
    1grx      3.42.1
    1ego      3.42.1
    1egr      3.42.1
    1qd6[CD]  6.4.2
    1qd5A     6.4.2
    1sro      2.38.4
double-blast gets hits:
    1c53   1.3.1
    4hb1   11.7.1
    1xzi   3.16.8
    1c2y[ABCDEFGHIJKLMNOPQRST]  ?
    1ebd[AB]  3.3.1,3.3.1,4.72.1
target model gets NO hits
template models get very weak hits
    1nsj     -8.55  1.689  3.1.2
    1dar     -7.63  4.237  2.41.3,3.31.1,4.12.4,4.48.11
    1pkp     -7.47  4.972  4.12.1,4.42.1
    1drqA    -7.35  5.605  4.91.2
    1bjt     -7.22  6.383  5.11.1
    1drmA    -7.11  7.124  4.91.2
    2ohxA    -7.05  7.565  2.33.1,3.2.1
    2bbvA    -6.97  8.194  2.9.1
    1gpmA    -6.84  9.330  3.19.2,3.63.1,4.44.2
    1lvl_3   -6.84  9.330  4.72.1
    8abp     -6.84  9.330  3.88.1
    1ab4     -6.75  10.20  5.11.1
    1aipE    -6.70  10.73  2.41.3,2.42.1,3.31.1
    1a2vA    -6.64  11.39  2.29.2,4.15.2,4.15.2
    1el6B    -6.61  11.74  ?
    1qb2A    -6.58  12.09  1.38.1
    1dulA    -6.45  13.77  ?
    1gpmA_2  -6.40  14.47  3.63.1
    2cuaA    -6.37  14.91  2.5.1
    1fipA    -6.32  15.68  1.100.1
2-track gets weak hits
    % Sequence ID  Length  Simple  Reverse  E-value  SCOP
    1lvl           458     -19.87  -7.79    3.1e+00  3.3.1,3.3.1,4.72.1
    1shcA          195     -17.91  -7.24    3.1e+00  2.52.1
    1d1rA          116     -20.54  -6.89    8.5e+00  4.51.1
    1quqA          129     -18.71  -6.58    8.5e+00  2.38.4
    2rgf           97      -21.88  -6.50    8.5e+00  4.13.3
    3ladA          476     -18.61  -6.37    8.5e+00  3.3.1,3.3.1,4.72.1
    1qh4A          380     -17.84  -6.28    8.5e+00  1.79.1,4.107.1
    1awsA          164     -16.87  -6.06    8.5e+00  2.58.1
    4blmA          265     -18.75  -5.87    2.3e+01  5.3.1
    1bco           327     -16.11  -5.78    2.3e+01  2.45.1,3.50.3
    1bg0           357     -16.97  -5.77    2.3e+01  1.79.1,4.107.1
    1c53           79      -15.36  -5.57    2.3e+01  1.3.1
    1vcbA          118     -15.41  -5.51    2.3e+01  4.13.2

Weakly corroborated hits:
    1ebd[AB],1lvl,3ladA (double-blast, template, 2track)  4.72.1
    1c53 (double-blast and 2 track)
    4hb1 (blast and double blast)
    1sro, 1quqA (blast and 2track)
    1dar (templates, with 1aipE and weakly 1pkp)

Hits shared with DHH:
    1gca,8abp                     3.88.1
    5tmpA,1dar,1aipE              3.31.1
    1cec,1ubpC,1tph1,1ceo,1nsj    3.1*
    1bxkA,2ohxA                   3.2.1
    1vsd,1bco                     3.50.3

Sun Sep 3 11:54:11 PDT 2000 Kevin Karplus

Top alignments now:
    1mrj/T0087-1mrj-2track-global 1mrj 247 -17.81 -27.54 5.6e-12
    1bykA/T0087-1bykA-2track-global 1bykA 255 -28.24 -23.44 3.1e-10
    1qh8A/T0087-1qh8A-2track-global 1qh8A 478 -9.81 -22.92 8.4e-10
    2yhx/T0087-2yhx-2track-local 2yhx 457 -40.07 -21.59 2.3e-09
    8abp/T0087-8abp-2track-global 8abp 305 -21.32 -21.21 2.3e-09
    1tgoA/T0087-1tgoA-2track-local 1tgoA 773 -41.23 -20.68 6.2e-09
    8abp/T0087-8abp-2track-global 8abp 306 -20.22 -20.17 6.2e-09
    2yhx/T0087-2yhx-2track-global 2yhx 457 -19.78 -19.14 1.7e-08
    1gca/T0087-1gca-2track-global 1gca 309 -12.13 -16.57 3.4e-07
    1bykA/T0087-1bykA-2track-local 1bykA 255 -35.78 -15.08 9.2e-07
    1mrj/T0087-1mrj-2track-local 1mrj 247 -31.14 -15.26 9.2e-07
    1qh8A/T0087-1qh8A-2track-local 1qh8A 478 -37.22 -15.15 9.2e-07
    2yhx/2yhx-T0087-fssp-global T0087 310 -2.68 -10.90 1.4e-04
    2ohxA/2ohxA-T0087-local T0087 310 -18.03 -7.20 2.2e-03
    1uroA/T0087-1uroA-2track-global 1uroA 367 -3.50 -7.80 2.7e-03
    1uroA/T0087-1uroA-2track-local 1uroA 367 -27.47 -7.01 2.7e-03
    1uroA/T0087-1uroA-2track-local 1uroA 357 -27.49 -7.01 2.7e-03
    2ohxA/2ohxA-T0087-fssp-global T0087 310 -20.81 -6.80 3.3e-03
    2ohxA/2ohxA-T0087-vit T0087 310 -9.55 -5.17 1.7e-02
    1uroA/T0087-1uroA-2track-global 1uroA 357 -2.94 -5.74 2.0e-02
    3ladA/3ladA-T0087-global T0087 310 -5.04 -5.85 2.0e-02
    2ohxA/2ohxA-T0087-global T0087 310 -110.62 -4.96 2.1e-02
    1tbgE/1tbgE-T0087-global T0087 310 -67.83 -4.65 2.8e-02
    1gca/1gca-T0087-fssp-global T0087 310 -30.60 -4.34 5.4e-02
    8abp/T0087-8abp-2track-local 8abp 306 -25.23 -4.23 5.4e-02
    8abp/T0087-8abp-2track-local 8abp 305 -25.19 -4.22 5.4e-02
    1tbgE/1tbgE-T0087-fssp-global T0087 310 -96.76 -3.93 5.8e-02
    1pjr/1pjr-T0087-fssp-global T0087 310 -58.32 -3.30 1.1e-01
    2ohxA/T0087-2ohxA-global 2ohxA 374 -137.92 -3.23 1.1e-01
    1c53/1c53-T0087-vit T0087 310 -3.54 -3.17 1.4e-01
    1lvl/1lvl-T0087-global T0087 310 -2.62 -3.42 1.4e-01
    1pjr/T0087-1pjr-2track-local 1pjr 623 -26.74 -3.71 1.4e-01
    1pjr/T0087-1pjr-2track-local 1pjr 724 -26.33 -3.52 1.4e-01
    1qhgB/T0087-1qhgB-2track-local 1qhgB 261 -17.99 -3.04 1.4e-01
    3ladA/T0087-3ladA-2track-global 3ladA 472 -2.10 -3.89 1.4e-01
    4hb1/4hb1-T0087-global T0087 310 -1.25 -3.28 1.4e-01
    8abp/8abp-T0087-fssp-global T0087 310 -28.64 -3.30 1.4e-01
    2ohxA/T0087-2ohxA-vit 2ohxA 374 -4.14 -2.79 1.7e-01
    1pjr/1pjr-T0087-vit T0087 310 -6.22 -2.55 2.2e-01
    1qhgB/1qhgB-T0087-vit T0087 310 -6.43 -2.47 2.3e-01
    1pjr/1pjr-T0087-local T0087 310 -14.77 -2.40 2.5e-01
    1gca/T0087-1gca-2track-local 1gca 309 -27.06 -2.14 3.6e-01
    1qh8A/1qh8A-T0087-vit T0087 310 -5.84 -2.96 3.6e-01
    1sro/T0087-1sro-2track-local 1sro 76 -12.48 -2.11 3.6e-01

I looked at the 1mrj/T0087-1mrj-2track-global alignment already---see
notes above. Plausible domain break and 2ry, but no conservation of
active site. NNH (or GGH) of DHHA domain off end of alignment.

I looked at 1bykA/T0087-1bykA-2track-global already---see notes above.
Unlikely domain boundaries and doesn't align DHH of the DHH domain.

1qh8A/T0087-1qh8A-2track-global has an impossible-to-close gap near the
beginning. DHH not conserved, and NNH or GGH of DHHA in insert region.

2yhx/T0087-2yhx-2track-local
This hexokinase has a lot of unknown residues, making it a little hard to
check the alignment. In fact, when compared to the yeast hexokinase B from
genomic data, it can be seen that the residue assignment by the
crystallographers is not very good.
The D of DHH is conserved, but most of the DHHA domain is unaligned.
Although secondary structure matches are good, the very low residue
identity makes this an unsatisfactory match.

8abp/T0087-8abp-2track-global actually looks pretty good for the DHH
domain, though not the DHHA domain. I hand-edited to get
    8abp/T0087-8abp-karplus1.a2m
which has 38 identical residues.
I didn't like the matching to the DHHA domain, so I changed the alignment
to get the DHH domain as a single domain, without the large insertion that
8abp has:
    8abp/T0087-8abp-karplus2.a2m

Note FSSP has 8abp similar to
    2dhqA (3.16.12), 1dyx (3.2.1, 3.16.11), 1qczA (3.16.7), 1rvv1 (3.12.1),
    1ak1 (3.87.1), 1gdhA (3.2.1,3.16.11), 1drw (3.2.1,4.66.1),
    1psdA (3.2.1,3.16.11,4.48.17), 1iibA (3.39.2), 1cex (3.16.8),
    1reqA (3.1.18,3.16.5), ...
I interpret this to mean that the 3.16* folds have something significant
in common with 3.88.1 and maybe the 3.2.1 folds do also.
(We got a 3.16.8 fold before---1xzi, and we have some 3.2.1 hits (1bxkA
and 2ohxA).)

The 8abp/T0087-8abp-karplus2.a2m for the first domain is currently my best
guess.

Sun Sep 3 13:15:37 PDT 2000 Kevin Karplus

I tried splitting T0087 into two domains: 1-180 and 181-end.

The 1-180 domain still looks like a DHH domain, but the 181-end does not
look much like a DHHA domain---that signal must have come from the
homologs of the target.

Indeed, PFAM_SWISSPROT on PPX1_STRMU (O68579) lists the domains as
DHH family and Pfam-B_8987.

The Pfam-B_8987 family is quite small:
O29502/192-319      FGVEIKAKLSAVDDLTAMDIIKRDYKDFDMSGKKVGVGQIELVDLSLIESRIDEIYEAMKKMKEEGGYAGIFLMLTDIMKEGTELLVVTDYPEVVEKAFGKKLEGKSVWLDGVMSRKKQVVPPLEKAF.
Y608_METJA/179-306  FGMEILKAKSVVGKLKPEEIINMDFKNFDFNGKKVGIGQVEVIDVSEVESKKEDIYKLLEEKLKNEGYDLIVFLITDIMKEGSEALVVGN-KEMFEKAFNVKVEGNSVFLEGVMSRKKQVVPPLERAYN
O68579/181-309      YGLAMLKAGTNLASKTAAQLVDIDAKTFELNGSQVRVAQVNTVDINEVLERQNEIEEAIKASQAANGYSDFVLMITDILNSNSEILALGNNTDKVEAAFNFTLKNNHAFLAGAVSRKKQVVPQLTESFN
P95765/182-310      YGLAMLKAGTNLASKSAEELIDIDAKTFELNGNNVRVAQVNTVDIAEVLERQAEIEAAIEKAIADNGYSDFVLMITDIINSNSEILAIGSNMDKVEAAFNFVLENNHAFLAGAVSRKKQVVPQLTESFN
YYBQ_BACSU/180-306  YGLNMLKAGADLSKKTVEELISLDAKEFTLGSKKVEIAQVNTVDIEDVKKRQAELEAVISKVVAEKNLDLFLLVITDILENDSLALAIGNEAAKVEKAFNVTLENNTALLKGVVSRKKQVVPVLTDA..
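The two-domain split tried in this entry (residues 1-180 and 181-end) is just a cut at a 1-based residue boundary. A minimal Python sketch, assuming the target is available as a plain string; the helper names `split_domains` and `to_fasta` are hypothetical, and the sequence below is a 310-residue placeholder, not the real T0087 sequence.

```python
def split_domains(sequence, boundary=180):
    """Split at a 1-based residue boundary.

    Returns (residues 1..boundary, residues boundary+1..end).
    """
    if not 0 < boundary < len(sequence):
        raise ValueError("boundary must fall inside the sequence")
    return sequence[:boundary], sequence[boundary:]

def to_fasta(name, sequence, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [">" + name]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"

# Placeholder only: 310 residues, the length reported for T0087 in the notes.
placeholder = "ACDEFGHIKL" * 31
first, second = split_domains(placeholder, 180)
print(len(first), len(second))  # 180 130
print(to_fasta("T0087-1-180", first), end="")
print(to_fasta("T0087-181-end", second), end="")
```

The lengths agree with the score tables in these notes, where T0087-1-180 is listed with length 180 and T0087-181-end with length 130.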
t87-1-180
blast hits (very weak)
    1dm1A     1.36.1
    2cua[AB]  2.5.1
no double-blast hits
weak target model hits
    1hur[AB]          3.31.1
    1rrf              3.31.1
    1rrg[AB]          3.31.1
    1cz1A             3.1.7
    [123]tmk[ABCDEF]  3.31.1
template hits
    1bw9A    -8.62  1.5749  3.2.1,3.53.1
    1agnA    -8.33  2.1046  2.33.1,3.2.1
    2ohxA_2  -8.32  2.1258  3.2.1
    2ohxA    -8.17  2.4697  2.33.1,3.2.1
    1gln_2   -6.94  8.4438  3.19.1
    1ykfA    -6.89  8.8763  2.33.1,3.2.1
weak 2track hits
    % Sequence ID  Length  Simple  Reverse  E-value  SCOP
    1mrp           309     -25.24  -13.01   7.7e-03  3.89.1
    1cs1A          386     -31.56  -12.22   2.1e-02  3.62.1
    1cl2A          395     -30.61  -11.76   5.7e-02  3.62.1
    1bykA          255     -28.53  -11.25   5.7e-02  3.88.1
    1sbp           310     -23.58  -9.93    4.2e-01  3.89.1
    1cl1A          395     -28.76  -9.32    4.2e-01  3.62.1
    2vil           126     -21.22  -9.14    4.2e-01  4.90.1
    2nsyA          271     -25.01  -9.05    4.2e-01  ?

Top alignments:
1gca/T0087-1-180-1gca-2track-global 1gca 309 -13.80 -17.74 1.2e-07
    Seems to have grabbed pieces from two adjacent domains.
    Big chunks of active site missing.
1mrp/T0087-1-180-1mrp-2track-global 1mrp 309 -9.84 -17.18 1.2e-07
    Some good conservation on helices and strands, but rather fragmented.
    I can move the gaps around to get a more compact domain
        1mrp/T0087-1-180-1mrp-karplus.a2m
    but I'm not very impressed with the result.
5tmpA/T0087-1-180-5tmpA-2track-global 5tmpA 210 -18.64 -17.84 1.2e-07
    This alignment has good 2ry matches, but almost no identical residues.
    Chunks of fold are missing.
1agnA/1agnA-T0087-1-180-global T0087-1-180 180 0.80 -16.84 3.4e-07
    2ry match not great, and missing big chunks of fold.
1bxkA/T0087-1-180-1bxkA-2track-global 1bxkA 341 -6.71 -16.85 3.4e-07
    The first 3 strands look ok here, but the rest seem incorrect.
    Realigning by hand got lots of alignments with good residue identity,
    but none were really convincing.
8abp/T0087-1-180-8abp-2track-global 8abp 305 -11.40 -14.09 2.5e-06
    Looks promising, and when edited to
        8abp/T0087-1-180-8abp-karplus.a2m,
    it looks quite good. It is quite similar to the earlier 8abp
    prediction.
    THIS IS CURRENTLY MY FAVORITE PREDICTION.
1mrp/T0087-1-180-1mrp-2track-local 1mrp 309 -25.24 -13.01 6.8e-06
    not compact, and not clear how to change alignment to make it compact.
1sbp/T0087-1-180-1sbp-2track-global 1sbp 309 -0.31 -13.25 6.8e-06
    not compact, and not clear how to change alignment to make it compact.
1bxkA/1bxkA-T0087-1-180-fssp-global T0087-1-180 180 -14.73 -12.98 1.8e-05
    compact, though somewhat gappy alignment. Gets good first sheet.
    This is fairly consistent with the 8abp prediction.
1cs1A/T0087-1-180-1cs1A-2track-local 1cs1A 384 -31.56 -12.22 1.8e-05
    low residue id and missing interior beta strand.

I'm running out of time. Let's say
    8abp/T0087-1-180-8abp-karplus.a2m as model 1
    1bxkA/1bxkA-T0087-1-180-fssp-global as model 2

t87-181-end
weak blast hits
    1fz[abcdefg][AD]  8.1.8
    1gab              1.8.1
no double-blast hits
one weak target hit
    1be1  3.16.5
weak target hits in T0087-181-end-t2k-t99.rdb
    1ai1H  -6.67  11.0578  2.1.1,2.1.1
    1b1a   -6.27  16.4860  3.16.5
    1uch   -6.22  17.3296  4.3.1
    1rlaA  -6.19  17.8563  3.36.1
    2rlaA  -6.19  17.8563  3.36.1
    3rlaA  -6.18  18.0354  3.36.1
weak template hits
    1aisB  -5.76  27.4196  1.72.1,1.72.1
    1nwpA  -5.47  36.6056  2.5.1
weak 2-track hits
    % Sequence ID  Length  Simple  Reverse  E-value  SCOP
    1gdoA          240     -22.74  -10.82   1.6e-01  4.132.1
    1gdoB          240     -21.27  -9.17    4.2e-01  4.132.1
    3bamB          213     -22.03  -8.86    1.1e+00  3.47.1
    1b74A          254     -22.11  -8.69    1.1e+00  3.72.2,3.72.2
    1bam           213     -23.22  -8.65    1.1e+00  3.47.1
    1c0aA          585     -21.03  -7.46    3.1e+00  2.38.4,4.59.4,4.87.1
    487dL          133     -18.09  -7.44    3.1e+00  ?

Top alignments:
1gdoA/T0087-181-end-1gdoA-2track-local 1gdoA 240 -22.74 -10.82 1.4e-04
    11 identical residues in short gapless alignment, with pretty good
    secondary match to 5 elements HEEHE.
1be1/T0087-181-end-1be1-global 1be1 137 0.71 -10.07 1.4e-04
    23 identical residues, with good 2ry match.
    Hand-edited to 1be1/T0087-181-end-karplus.a2m to cover first beta
    strand, has 25 identical residues covering almost entire fold.
    With a little more editing we get 27 identical residues:
    1be1/T0087-181-end-karplus2.a2m.
    A little more dinking around raises it to 28:
    1be1/T0087-181-end-karplus3.a2m.
    Note: all helices on same side of 4-strand sheet.
    MAY BE ABLE TO PREDICT WITH THIS.
1gdoB/T0087-181-end-1gdoB-2track-local 1gdoB 240 -21.27 -9.17 3.7e-04
    same as 1gdoA
3bamB/T0087-181-end-3bamB-2track-local 3bamB 213 -22.03 -8.86 1.0e-03
    12 identical residues with 1 insertion, good 2ry match: EHEHHE
1b74A/T0087-181-end-1b74A-2track-local 1b74A 254 -22.11 -8.69 1.0e-03
    18 identical residues with 1 insertion, pretty good 2ry match.
    Can be extended to 24 identical residues with 1 more gap.
    This looks good, but not as good as 1be1.
    Note helices on both sides of 4-strand sheet, plus extra strand from
    different sheet.
1uch/T0087-181-end-1uch-vit 1uch 230 -11.60 -7.83 2.7e-03
    Only 9 identical residues and missing strand.
1aisB/1aisB-T0087-181-end-vit T0087-181-end 130 -10.60 -7.43 2.7e-03
    13 identical residues with 2-residue gap.
    Poor 2ry match (all-helical template)
1rlaA/T0087-181-end-1rlaA-vit 1rlaA 323 -9.47 -7.31 2.7e-03
    5 identical residues in very short match.
1be1/T0087-181-end-1be1-local 1be1 137 -16.17 -7.23 2.7e-03
    See 1be1 editing above.
1be1/T0087-181-end-1be1-vit 1be1 137 -8.76 -6.58 7.4e-03
    See 1be1 editing above.
1rlaA/T0087-181-end-1rlaA-local 1rlaA 314 -15.21 -6.24 7.4e-03
1nwpA/1nwpA-T0087-181-end-vit T0087-181-end 130 -9.40 -6.22 7.4e-03
1uch/T0087-181-end-1uch-local 1uch 230 -15.24 -6.21 7.4e-03
1rlaA/T0087-181-end-1rlaA-local 1rlaA 323 -15.18 -6.20 7.4e-03
1uch/T0087-181-end-1uch-local 1uch 206 -15.37 -6.18 7.4e-03
1aisB/1aisB-T0087-181-end-local T0087-181-end 130 -14.67 -5.77 2.0e-02

Out of time; let's go with
    8abp/T0087-1-180-8abp-karplus.a2m as model 1
    1bxkA/1bxkA-T0087-1-180-fssp-global as model 2
with 1be1/T0087-181-end-karplus3.a2m for second domain on both models.

Tue Sep 5 10:35:27 PDT 2000
remaking 2track
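The identical-residue counts quoted for the hand-edited alignments (for example 28 for 1be1/T0087-181-end-karplus3.a2m) could be checked with a few lines of Python. This is only a sketch under stated assumptions: both rows come from the same a2m file and have been padded to equal length, uppercase letters are match columns, lowercase letters and "." are insert columns, and "-" is a deletion. The function name `count_identical` and the toy rows are invented for illustration; this is not how the counts in these notes were actually produced.

```python
def count_identical(row_a, row_b):
    """Count match columns where two padded a2m rows have the same residue.

    Only uppercase letters count; inserts (lowercase, '.') and
    deletions ('-') never contribute.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows must be padded to the same length")
    return sum(1 for a, b in zip(row_a, row_b) if a.isupper() and a == b)

# Toy rows, not taken from any of the alignment files named in the notes.
template = "ACDE-Fgh."
target = "ACQE-F..k"
print(count_identical(template, target))  # 4
```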