Wed Jul 26 12:21:55 PDT 2000 Kevin Karplus

T0118 Endodeoxyribonuclease I, Bacteriophage T7
Dimer, residues 17-145 visible in structure.

wu-blast: weak hits to 1bx[uv]A
          very weak to 1hcrA, [13]egf, 1ep[ghij], 1tgsI, ...
double-blast: no hits

Only 4 sequences (almost identical) in T2k alignment, none in PDB.
Predicted structure mixed helical and beta.

The target model gets only weak hits:
    ID          E-value  FSSP    SCOP
    1a5[ab]A    21.      2tysA   3.1.2.2.5
    1beuA       21.      2tysA   3.1.2.2.5
    1b78[AB]    22.      1b78A   3.46.3.1.1
    2mjp[AB]    22.      1b78A   3.46.3.1.1
    2tsyA       22.      2tysA   3.1.2.2.5
    1c9dA       23.      2tysA   3.1.2.2.5
    1ubsA       23.      2tysA   3.1.2.2.5
    1bksA       23.      2tysA   3.1.2.2.5
    2trsA       23.      2tysA   3.1.2.2.5
    1tt[pq]A    23.      2tysA   3.1.2.2.5
    1c29A       23.      2tysA   3.1.2.2.5
    2tysA       23.      2tysA   3.1.2.2.5
    1a5sA       23.      2tysA   3.1.2.2.5
    1c8vA       23.      2tysA   3.1.2.2.5
    2wsyA       23.      2tysA   3.1.2.2.5
    1cw2A       23.      2tysA   3.1.2.2.5
    1cx9A       23.      2tysA   3.1.2.2.5
    1a50A       23.      2tysA   3.1.2.2.5

There are some higher scoring template models:
    1anf        6.5      4mbp    3.89.1.1.6
    4mbp        13.5     4mbp    3.89.1.1.6
    1euu        24.9     1eut    2.1.1.5.16, 2.17.1.1.2, 2.63.1.1.4
    1mwpA       33.2     1mwpA   ?

No 2-way hits, even at the fold level.

Sat Aug 26 15:10:30 PDT 2000 Kevin Karplus

Remade 2track predictions.
Still no good hits:
    % Sequence ID   Length  Simple  Reverse  E-value  SCOP
    1qo7A           394     -18.10  -7.75    3.1e+00  3.64.1
    3grs            478     -17.81  -6.87    8.5e+00  3.3.1,4.72.1
    1b12A           248     -18.59  -6.06    8.5e+00  2.82.1
    1opr            213     -16.14  -6.04    8.5e+00  3.56.1
    1avqA           228     -16.84  -5.95    2.3e+01  3.47.1

Tue Sep 5 10:35:23 PDT 2000
remaking 2track

7 Sep 2000 Rachel Karchin

Functional info:
Parkinson et al. Nucleic Acids Res 1999 Jan 15;27(2):682-9

Abstract:
"Endonuclease I is a 149 amino acid protein of bacteriophage T7 that is
a Holliday junction-resolving enzyme, i.e. a four-way junction-selective
nuclease. We have performed a systematic mutagenesis study of this
protein, whereby all acidic amino acids have been individually replaced
by other residues, mainly alanine. Out of 21 acidic residues, five
(Glu20, Glu35, Glu65, Asp55 and Asp74) are essential.
Replacement of these residues by other amino acids leads to a protein
that is inactive in the cleavage of DNA junctions, but which
nevertheless binds selectively to DNA junctions. The remaining 16 acidic
residues can be replaced without loss of activity. The five critical
amino acids are located within one section of the primary sequence. It
is rather likely that their function is to bind one or more metal ions
that coordinate the water molecule that brings about hydrolysis of the
phosphodiester bond. We have also constructed a mutant of endonuclease I
that lacks nine amino acids (six of which are arginine or lysine) at the
C-terminus. Unlike the acidic point mutants, the C-terminal truncation
is unable to bind to DNA junctions. It is therefore likely that the
basic C-terminus is an important element in binding to the DNA
junction."

This paper shows the location of the catalytically essential acidic
residues in our target and includes a pair-wise alignment of
endonuclease I in phages T7 and T3 with functionally important residues
highlighted.

The paper is available at:
http://nar.oupjournals.org/cgi/content/full/27/2/682

Sun Sep 10 14:09:08 PDT 2000 Kevin Karplus

I looked at the paper, and saw that they had identified 5 catalytically
essential residues E20,E35,D55,E65,D74 and also observed that removing
the last 9 residues from the protein eliminated DNA binding.

Since we have few homologs, maybe we can increase the specificity of the
model by adding an XXX... chain with these important residues.

The t2k alignment is the same, but the extra sequence with the active
site residues does increase (slightly) the conservation expected in
those columns. The target model still finds nothing, but the 2ry
prediction of a helix after D74 is now cleaner---so we should at least
use this one for the 2ry prediction.
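
The notes above do not record exactly how the XXX... chain was built, so
the following is only a minimal sketch of one way to make such a
sequence: an all-X chain of the target length with the five essential
residues written in at their positions. The function names, the record
name "T0118-active-site", and the assumption that the paper's residue
numbering matches the target numbering are all hypothetical; the length
149 and the residue list come from the abstract and the entry above.

```python
# Hypothetical helpers for illustration; not part of any alignment tool
# and not necessarily how the original extra sequence was produced.

# Catalytically essential residues reported by Parkinson et al. (1-based).
ACTIVE_SITE = {20: "E", 35: "E", 55: "D", 65: "E", 74: "D"}
TARGET_LENGTH = 149  # protein length stated in the abstract


def build_masked_chain(length, residues, mask="X"):
    """Return an all-`mask` chain with the given residues filled in."""
    chain = [mask] * length
    for position, amino_acid in residues.items():
        if not 1 <= position <= length:
            raise ValueError(f"position {position} outside 1..{length}")
        chain[position - 1] = amino_acid  # convert 1-based to 0-based
    return "".join(chain)


def as_fasta(name, sequence, width=60):
    """Format a sequence as a FASTA record wrapped at `width` columns."""
    lines = [f">{name}"]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines)


chain = build_masked_chain(TARGET_LENGTH, ACTIVE_SITE)
print(as_fasta("T0118-active-site", chain))
```

The resulting record could then be appended to the alignment as an extra
sequence. The basic C-terminal residues are not included here, since the
notes only mention adding the active site residues.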
The 2-track predictions from t118-act are
    % Sequence ID   Length  Simple  Reverse  E-value  SCOP
    1opr            213     -17.87  -8.74    1.2e+00  3.56.1
    1avqA           228     -17.19  -7.02    3.2e+00  3.47.1
    1ystH           260     -14.85  -5.55    2.3e+01  2.39.1,6.2.1
    1cgt            684     -15.48  -5.28    2.3e+01  2.1.1,2.3.1,2.66.1,3.1.7
    1a48            306     -14.95  -5.22    2.3e+01  4.122.1

combined target-template predictions
    1anf      -8.15  2.56467315846984  --     --       -8.15  3.0e+00  3.89.1
    4mbp      -7.42  5.32025253313539  --     --       -7.42  8.1e+00  3.89.1
    1d2fB     -7.00  8.09468986224974  -7.00  8.1e+00  --     --       3.62.1
    1euu      -6.82  9.68936231489074  --     --       -6.82  2.2e+01  1.2.1,2.17.1
    1mwpA     -6.54  12.8157680841938  --     --       -6.54  2.2e+01  ?
    1pyaB     -6.48  13.6070372667242  --     --       -6.48  2.2e+01  4.134.1
    1omp      -6.42  14.4470813170324  --     --       -6.42  2.2e+01  2.12.1 (now 1ompA)
    1tgoA     -6.28  16.6140395281418  --     --       -6.28  2.2e+01  3.50.3,5.8.1
    1trb      -6.10  19.8832841115402  --     --       -6.10  2.2e+01  3.3.1,3.3.1
    2pviA     -6.08  20.2840361033747  -6.08  2.2e+01  --     --       3.47.1
    2gar      -6.05  20.9003238481408  --     --       -6.05  2.2e+01  3.60.1
    1pya_1a1  -5.99  22.1895000325223  --     --       -5.99  5.9e+01  4.134.1

Hmm, 3.47.1 comes up twice in target models.

Best alignments:

1opr/T0118-act-1opr-2track-local    1opr   213  -17.87  -8.74  1.0e-03
    long gapless alignment with good 2ry match, but missing first
    (interior) strand of sheet.

1avqA/T0118-act-1avqA-2track-local  1avqA  228  -17.19  -7.02  2.7e-03
    short gapless alignment with good residue identity and good 2ry
    match.
    Can plausibly be extended to almost full-length match
        1avqA/T0118-act-1avqA-karplus.a2m
    By rearranging a bit, can even get clustering of 4 of the 5 active
    site residues:
        1avqA/T0118-act-1avqA-karplus2.a2m
    I'm running out of time, so let's predict this.

1anf/1anf-T0118-act-local           T0118  149  -19.39  -7.96  3.6e-03
1anf/1anf-T0118-act-vit             T0118  149  -10.36  -7.17  3.6e-03
2pviA/T0118-act-2pviA-vit           2pviA  156  -10.52  -7.21  3.6e-03
2pviA/T0118-act-2pviA-vit           2pviA  157  -10.51  -7.21  3.6e-03
4mbp/4mbp-T0118-act-local           T0118  149  -18.70  -7.33  3.6e-03
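
The remark in this entry that 3.47.1 comes up twice in the target
models can be checked mechanically. The sketch below tallies SCOP codes
over the target-model hits listed in this entry (the five t118-act
2-track hits plus the two combined-list rows that carry a target-model
score). The hit list is transcribed by hand from those tables, and the
function name and the threshold parameter are hypothetical.

```python
from collections import defaultdict

# Target-model hits transcribed from the tables in this entry:
# (PDB chain ID, comma-separated SCOP codes as printed).
TARGET_MODEL_HITS = [
    ("1opr", "3.56.1"),
    ("1avqA", "3.47.1"),
    ("1ystH", "2.39.1,6.2.1"),
    ("1cgt", "2.1.1,2.3.1,2.66.1,3.1.7"),
    ("1a48", "4.122.1"),
    ("1d2fB", "3.62.1"),
    ("2pviA", "3.47.1"),
]


def recurring_scop_codes(hits, min_hits=2):
    """Map each SCOP code seen in at least `min_hits` hits to those hits."""
    by_code = defaultdict(list)
    for pdb_id, scop_field in hits:
        # A set avoids counting a code twice within one hit,
        # as in a field like "3.3.1,3.3.1".
        for code in sorted(set(scop_field.split(","))):
            by_code[code].append(pdb_id)
    return {code: ids for code, ids in by_code.items() if len(ids) >= min_hits}


print(recurring_scop_codes(TARGET_MODEL_HITS))
```

With the hits as transcribed, only 3.47.1 recurs (1avqA and 2pviA),
which matches the observation that motivated looking at the 1avqA
alignment more closely.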