# These are sequences that one or another of the neural nets had
# trouble with, in some cases even though the chain was in the
# training set. For some of them I have tried to work out why the
# nets had trouble, in particular whether the DSSP file or the
# sequence file is bad, so that the t99 alignment needs to be
# re-created or the chain dropped from the training and test sets.
#
# Particularly interesting are cases where one chain of a PDB file is
# highly predictable and another is highly unpredictable:
#	predictable	unpredictable
#	1kzuB		1kzuA
#	1lghB		1lghA
#	1ytfB		1ytfC
# Here are the chains that were rejected from the training set:
# 1by0 2.5216 0.2593 0.1852 2.0290 1.0000 1.0000 0.1852 DSSP wrong, multiple models
# 1octC ? two chains, not properly separated
# 1psm 1.8694 0.3158 0.1496 1.3226 1.0000 0.1873 0.1379 crystallographers say helix, DSSP says turn
# Here are chains that caused problems in some network, together with the
# scores from t99-2877-IDaa13-9-6-11-9-5-8-9-ehl-seeded2-trained
#ID bits Q3 SOV objective SOV(E) SOV(H) SOV(L) comment
1ba4 2.1126 0.3250 0.4350 1.5701 1.0000 0.0455 0.9111
3monB 1.5868 0.3800 0.3535 1.0301 0.2657 0.1765 0.7611 strand part of sheet with other chains
1ktx 1.2385 0.4054 0.4189 0.6236 0.5000 0.0000 0.4536
3ezmA 1.7663 0.4257 0.4345 1.1233 0.4897 0.1200 0.4330 ? not compact
1rip 1.7575 0.4321 0.1963 1.2272 0.3095 1.0000 0.1943 unusual knotted structure (real?)
1cfh 1.5987 0.4468 0.1702 1.0668 1.0000 0.8571 0.1534 ?
1ba6 1.6488 0.4500 0.4515 0.9730 1.0000 0.0909 0.5417
1pij 1.7114 0.4521 0.3568 1.0809 0.1667 0.2500 0.4100 ?
1molA 1.5834 0.4574 0.4195 0.9162 0.3949 0.1765 0.7483 ? dimerization interface
1bg8A 1.3948 0.4605 0.4641 0.7022 1.0000 0.4933 0.4336
1lpbA 1.5455 0.4706 0.3460 0.9018 0.4035 0.0000 0.3698 ?
1fzgD 1.8652 0.4717 0.3530 1.2170 1.0000 0.3263 0.3934
1bev4 1.3929 0.4750 0.3817 0.7270 1.0000 0.1714 0.4176
1aml 1.7385 0.4750 0.6056 0.9607 1.0000 0.0667 0.9289
1aa0 1.6165 0.4779 0.2714 1.0029 1.0000 0.1128 0.9493 ?
1tvt 1.2896 0.4800 0.1787 0.7202 1.0000 0.0000 0.1812
2dynA 1.5135 0.4865 0.4980 0.7780 0.4792 1.0000 0.4042
1nubA 1.7249 0.4867 0.3009 1.0878 0.4720 0.1032 0.4727
1gpt 1.4420 0.4894 0.3802 0.7626 0.3088 0.2727 0.4628
2hipA 1.7833 0.4930 0.3121 1.1343 0.1667 0.1846 0.3519
2erl 2.0403 0.5000 0.3809 1.3499 1.0000 0.2941 0.6077 helices joined by disulphides
1erp 1.7417 0.5000 0.4349 1.0242 1.0000 0.2967 0.6250
1sso 1.3602 0.5000 0.4469 0.6368 0.4580 0.6000 0.4286 ? hairpins not quite seen by DSSP
1pfsA 1.5011 0.5000 0.5746 0.7137 0.5504 1.0000 0.6155 ? dimerization interface
2mev4 1.7384 0.5000 0.7069 0.8849 1.0000 1.0000 0.6852
1qckA 1.4676 0.5056 0.4531 0.7354 1.0000 0.3276 0.6880
2ezxA 1.4676 0.5056 0.4531 0.7354 1.0000 0.3276 0.6880
2sn3 1.1950 0.5077 0.3711 0.5017 0.3048 0.0000 0.4188 lots of disulphide bridges
1aho 1.4163 0.5156 0.4355 0.6829 0.7460 0.0000 0.3988 lots of disulphide bridges
1bl1 2.2384 0.5161 0.5753 1.4347 1.0000 0.5217 0.7292
1pht 1.4147 0.5181 0.3510 0.7211 0.4931 0.2857 0.3015
1lepA 1.4910 0.5185 0.3152 0.8149 0.4433 0.0000 0.3022
1kveA 1.6320 0.5238 0.4433 0.8865 0.6000 0.7308 0.2467 edge strands of sheet aren't connected in 1kveA (middle is 1kveB)
2std 1.4760 0.5247 0.4804 0.7111 0.3491 0.8038 0.4745
1erd 1.6545 0.5250 0.5275 0.8657 1.0000 0.3542 0.7875
1vib 1.4600 0.5273 0.3303 0.7675 1.0000 0.1874 0.7473
1qa7B 1.7403 0.5274 0.4904 0.9677 0.4359 0.3813 0.5733
1b9wA 1.5374 0.5281 0.3864 0.8161 0.3167 0.0000 0.5080
1a73A 1.4990 0.5309 0.4674 0.7345 0.3467 0.2333 0.6043
3stdA 1.5230 0.5309 0.5004 0.7420 0.3821 0.7698 0.4920
1efm 1.9110 0.5316 0.2805 1.2391 0.5143 0.7441 0.2284
1b2vA 1.5101 0.5318 0.5032 0.7267 0.2054 0.6607 0.6495
1gps 1.3805 0.5319 0.4091 0.6440 0.3529 0.3000 0.4773 ?
1ltsC 1.1531 0.5366 0.3762 0.4284 1.0000 0.3906 0.2991 ?
1b4r 1.4791 0.5375 0.4509 0.7161 0.4102 1.0000 0.5212
6rlxA 1.3279 0.5417 0.5496 0.5115 0.0000 0.6875 0.3651 disulphides to other chains
1bm4 1.7538 0.5486 0.2375 1.0865 0.0252 1.0000 0.2504
1vfyA 1.4003 0.5522 0.5647 0.5658 0.2500 0.5556 0.7222 lots of disulphide bridges
1hvc 1.8915 0.5616 0.4992 1.0803 0.3892 0.6364 0.6284
3vub 1.2660 0.5644 0.5904 0.4064 0.4842 0.7273 0.6106 ?
1ytfC 1.2606 0.5652 0.4810 0.4549 0.4760 1.0000 0.5208 ? dimerization interface
1gatA 1.2580 0.5667 0.4198 0.4814 0.3750 0.1884 0.5009 ?
1vqb 1.3550 0.5698 0.5059 0.5323 0.5120 0.0000 0.5756 ? helix interrupting beta hairpin?
2pde 1.4084 0.5714 0.2222 0.7259 1.0000 1.0000 0.1942
1mctI 1.4816 0.5714 0.2645 0.7780 0.0000 0.0000 0.3897 lots of disulphide bridges
1lghA 2.2683 0.5714 0.5970 1.3984 1.0000 0.6201 0.4268
2mprA 1.5894 0.5748 0.4902 0.7695 0.4852 0.0000 0.5111
1vpu 1.4613 0.5778 0.4317 0.6676 1.0000 0.5116 0.3318
1bh7 1.5453 0.5806 0.4727 0.7283 1.0000 0.0000 0.5200
1kp6A 1.2670 0.5823 0.5418 0.4138 0.5000 0.5556 0.5564 lots of disulphide bridges
2a0b 1.2510 0.5847 0.5049 0.4138 1.0000 0.4841 0.6427 ? bent helices
1cv8 1.2378 0.5954 0.6275 0.3287 0.6650 0.4681 0.6893 ?
1xtcC 1.0812 0.6000 0.2285 0.3670 1.0000 0.2266 0.3000
1b0y 1.2260 0.6000 0.3109 0.4705 0.2500 0.2809 0.3253 ?
9wgaA 1.2164 0.6000 0.3165 0.4582 0.0000 0.1122 0.4092 lots of disulphide bridges
1arb 1.3400 0.6008 0.5021 0.4882 0.5273 0.1385 0.5934 ?
1bnx 1.3117 0.6061 0.6623 0.3745 1.0000 0.4737 0.9184
1koe 1.2721 0.6105 0.4776 0.4228 0.4276 0.6377 0.4563 ? many 1-long strands
1tca 1.2468 0.6151 0.4835 0.3899 0.4615 0.3858 0.5390 ?
2por 1.3305 0.6179 0.5648 0.4301 0.6629 0.1944 0.5153 porin (membrane protein)
1ab3 1.5650 0.6250 0.6241 0.6279 1.0000 0.6391 0.6156
6rlxB 1.3149 0.6296 0.6698 0.3504 0.0000 0.8676 0.4167 peptide joined to others with disulphides
1g3p 1.2732 0.6302 0.3740 0.4560 0.2485 0.3846 0.5005 ?
1fglB 1.2654 0.6667 0.4931 0.3522 1.0000 0.0000 0.7396 short peptide--DSSP helix not really there
1qqp2 1.0363 0.6667 0.6030 0.0682 0.5556 0.3200 0.7220 ?
1pnbA 1.6560 0.6774 0.5113 0.7229 1.0000 0.4906 0.5333
1spf 1.1262 0.6857 0.5276 0.1767 1.0000 0.4331 1.0000 lipoprotein---valine-rich helix
1kzuA 2.0233 0.6939 0.8503 0.9042 1.0000 0.8649 0.8056
2bbkL
1tyv
1isuA
1qcxA
1pidA
6cel
153l
1bfg
1sluA
2bbkH
1sbwI
1orc
1vie
1gof