Wed Jul 5 11:52:47 PDT 2000 Kevin Karplus

T0105 appears to be one domain of SP10_HUMAN (the so-called SAND domain, or KDWK domain, from Sp100b, according to the submitters). The submitters predict it to be DNA binding.

Best WU-BLAST and only double-blast hits are to 1itbB and 1iraY (E-value about 0.015).

SAM-T2K alignment has 55 sequences, but many seem to be identical copies.

The secondary-structure prediction starts with strands and ends with helices.

Target model finds
    ID      E-value  FSSP    SCOP
    1a3s    3.9      1u9aA   4.18.1.1.5
    1u9aA   3.9      1u9aA   4.18.1.1.5
    1u9b    3.9      1u9aA   4.18.1.1.5
    1tie    15       1wba    2.40.4.1.2

Template models find
    ID      E-value  FSSP    SCOP
    1scmA   33       1wdcA   ?

Tue Jul 11 15:49:29 PDT 2000 Kevin Karplus

1u9aA/T0105-1u9aA-vit
    Compact, has an insertion at the point of a loop, and a small gap in a helix. The initial beta strand is part of a sheet that is not aligned. Secondary-structure prediction doesn't match (we're aligning to helices here).
1u9aA/1u9aA-T0105-global
    Compact, has a deletion in a loop (but is also missing an edge strand). Secondary-structure prediction doesn't match at all.
1u9aA/1u9aA-T0105-local
    Similar to 1u9aA/1u9aA-T0105-global.
1u9aA/1u9aA-T0105-fssp-global
    Not compact.

1u9aA does not look very promising right now, but it is possible.

1tie/T0105-1tie-vit
    Has good conservation (including cross-sheet striping), but does not cover all of the 1tie fold. There is an insertion at the point of a beta hairpin. If T0105 dimerizes, it may be able to make the 1tie structure??? This looks more interesting than the 1u9aA prediction.
1tie/T0105-1tie-local
    Does not look nearly as good.
1tie/1tie-T0105-vit
    Too short to be useful.
1tie/1tie-T0105-local
    Better, but still too short.
1tie/1tie-T0105-global
    Looks moderately ok, but T0105 isn't big enough to cover all of 1tie.
1wba/1wba-T0105-vit
    Does not have a long enough alignment to be useful.
1wba/1wba-T0105-local
    Still too small.
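
A quick way to check the Jul 5 observation that many of the 55 SAM-T2K sequences seem to be identical copies is to count distinct sequences after removing alignment columns. The sketch below is only an illustration: it assumes the alignment is stored as FASTA-like text (as in a2m, with '-' and '.' as gap characters and lowercase insertions), and the function names and the toy sequences are hypothetical, not taken from the actual T0105 alignment.

```python
from collections import Counter


def read_fasta_like(text):
    """Yield (name, sequence) pairs from FASTA/a2m-style text (assumed format)."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the name (empty if missing).
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def ungapped(seq):
    """Remove gap characters and fold case so only the residues are compared."""
    return seq.replace("-", "").replace(".", "").upper()


def count_distinct(text):
    """Return (number of distinct ungapped sequences, total sequences)."""
    counts = Counter(ungapped(seq) for _, seq in read_fasta_like(text))
    return len(counts), sum(counts.values())


# Toy data for illustration only; these are not real SAND-domain sequences.
example = """>seqA
KDWK-acd.EF
>seqB
KDWKACDEF
>seqC
KDWK-----EF
"""

distinct, total = count_distinct(example)
print(distinct, "distinct of", total)
```

Comparing ungapped, uppercased sequences treats two rows as copies only when every residue matches; near-identical fragments would still be counted separately.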
Tue Jul 18 11:59:20 PDT 2000 Kevin Karplus

CAFASP Folds
    SCOP    num     num     num
            server  found   first
    2.1     9       15      4
    2.47    5       8       2
    7.14    5       5       2
    4.77    2       5       1

    10  2.1.1
     8  2.47.1
     5  7.14.1
     5  4.77.1
     5  3.31.1

We did not find any of these popular hits, but since our hits don't look great, we may want to examine at least some 2.1.1 hits.

Tue Aug 8 12:47:03 PDT 2000 Kevin Karplus

Top 2-track hits are
    Sequence  length  Simple   Reverse  E-value  FSSP    SCOP
    1pyaB     228     -17.52   -6.93    8.5e+00  1pyaB   ?
    1aa8A     347     -18.08   -6.68    8.5e+00  1an9A   3.4.1, 4.14.1
    1gpeA     587     -18.67   -6.60    8.5e+00  1gpeA   3.3.1, 4.14.1
    1dmr      823     -16.50   -5.86    2.3e+01  1dmr    2.49.2, 3.75.1
    3chbD     104     -16.12   -5.36    2.3e+01  3chbD   2.38.2
    1extA     162     -15.64   -5.24    2.3e+01  1extA   7.24.1
    1fbnA     231     -14.63   -5.13    2.3e+01  1fbnA   ?
    1an9A     340     -15.69   -5.07    2.3e+01  1an9A   3.4.1, 4.14.1
(E-value computation is wrong)

Our strong strand and helix predictions make 1.* and 2.* folds look unlikely.

The SAND domain was named in
    Gibson TJ, Ramu C, Gemund C, Aasland R.
    The APECED polyglandular autoimmune syndrome protein, AIRE-1, contains the SAND domain and is probably a transcription factor.
    Trends Biochem Sci 1998 Jul;23(7):242-4.
I was unable to print this from Acrobat version 3.

They predict it to be all beta (using PhD on a smaller alignment). They mention that all-beta DNA-binding domains are unusual, but not unheard of, giving NFAT and Nf-kappa-b domains as examples. The Nf-kappa-b domain is represented by 1a3qA in FSSP (2.1.1, 2.2.5), the NFAT ones by 1a02N (2.1.1, 2.2.5) (but 1a02F and 1a02J are long helices in the major groove). So now we have an idea why the 2.1 fold hits may be relevant.

There is a Pfam HMM for the SAND domain, with 31 sequences (we have 55, but many are not full-length matches, so Pfam would reject them).

1a02N/1a02N-T0105-vit
    A tiny fragment of 9 residues, of which 6 are identical: half of a loop connecting 2 adjacent beta strands of the beta sandwich, next to the DNA.
1a02N/1a02N-T0105-local
    Extends this to 14 residues with 7 identical.
1a02N/T0105-1a02N-2track-local
    Has only 3 identical residues, and is in the domain away from the DNA.
1a02N/T0105-1a02N-local
    An empty alignment.
1a3qA/1a3qA-T0105-local
    An empty alignment.
1a3qA/T0105-1a3qA-2track-local
    Aligns two adjacent beta strands with 4 identical residues (nicely conserved across the sheet).

The fragmentary match for 1a02N is very interesting, in that it seems to interact with DNA, but as a fold prediction it is poor. We can extend it in the N-terminal direction to get a strand of the beta sandwich, but we need some helices in the C-terminal direction which are not part of the template. I suspect that the helices also interact with the DNA, perhaps in the major groove.

I suspect that T0105 will turn out to be a new fold, but that the 1a02N-T0105-local alignment will have a similar relationship to the DNA.

Sat Aug 26 00:24:56 PDT 2000

Remade 2track predictions.

Sat Aug 26 14:24:29 PDT 2000 Kevin Karplus

Remade 2track predictions. No strong hits:
    % Sequence ID  Length  Simple   Reverse  E-value  X count
    1pyaB          228     -18.36   -7.48    3.1e+00
    1gpeA          587     -19.47   -7.24    3.1e+00
    1dmr           823     -17.52   -6.71    8.5e+00
    1aa8A          347     -17.99   -6.32    8.5e+00

Nothing new here, and predictions still weak.
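
For rechecking hit tables like the one above after each remake, a small parser is enough. This is a minimal sketch, not the SAM tooling itself: the `Hit` type, the `parse_hits` function, and the E-value cutoff of 1.0 for "weak" are assumptions made for illustration, and it assumes that a more negative reverse score is better, as the ordering of the table suggests.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    seq_id: str
    length: int
    simple: float
    reverse: float
    e_value: float


def parse_hits(text):
    """Parse whitespace-separated hit rows, skipping '%' headers and bad rows."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5 or line.lstrip().startswith("%"):
            continue
        try:
            hits.append(Hit(fields[0], int(fields[1]), float(fields[2]),
                            float(fields[3]), float(fields[4])))
        except ValueError:
            # Not a data row (for example a stray comment); ignore it.
            continue
    return hits


# Rows copied from the Aug 26 table above.
table = """\
% Sequence ID  Length  Simple  Reverse  E-value  X count
1pyaB  228  -18.36  -7.48  3.1e+00
1gpeA  587  -19.47  -7.24  3.1e+00
1dmr   823  -17.52  -6.71  8.5e+00
1aa8A  347  -17.99  -6.32  8.5e+00
"""

hits = parse_hits(table)
# Assumption: more negative reverse score means a better hit.
best_first = sorted(hits, key=lambda h: h.reverse)
# Assumption: an E-value above 1.0 counts as a weak hit.
weak = [h.seq_id for h in hits if h.e_value > 1.0]
print([h.seq_id for h in best_first])
print("weak:", weak)
```

With this cutoff every row in the table counts as weak, which agrees with the "no strong hits" note.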