Thu Jul 21 22:05:00 PDT 2005 Kevin Karplus
I had an interesting, but perhaps useless idea.  We might be able to improve
neural nets (at least for design purposes) by training not just on correct
input/output pairs, but also on random input sequences with the background
distribution at the outputs.  This would train the neural net to recognize
protein-like sequences as well as classifying them.  There may be enough
parameters in the neural net to take on this additional role.  Another
possibility is just to train a single neural net to recognize protein-like
sequences (that is, with a constant 1 output for real sequences and a constant
0 output for random inputs).  (See the decoy-training sketch after these
entries.)

Sun Jul 17 14:28:08 PDT 2005 Kevin Karplus
It would be good to have a more general mechanism than FreezeDesign for
constraining the design algorithm: for example, prohibiting certain residues
at all (or specific) positions, or prohibiting the native residue at certain
positions.  (See the constraint-mask sketch after these entries.)

Sun Jul 17 14:29:36 PDT 2005 Kevin Karplus
It would be nice for the design algorithm to be able to start from a multiple
alignment (perhaps of previously generated designs).

Sat Jul 9 08:51:00 PDT 2005 Kevin Karplus
Add optional weights to the different tracks when designing to multiple
networks.  Add the ability to choose the most probable sequence in Design1st.

Sat Jul 9 04:23:25 PDT 2005 Kevin Karplus
It might be interesting to try batch update of the weights, with an
OptimizeOnLine optimization to choose the step size.

Sat Jul 9 04:25:41 PDT 2005 Kevin Karplus
It might be interesting to try defining correct answers not by having a known
right labeling, but by having pairwise (or multiple) alignments of inputs, and
to score the correctness by the co-emission probability or by a symmetrized
cross-entropy (sum over positions of p_i log(q_i) + q_i log(p_i)).  This would
allow learning a labeling at the same time as learning how to predict it.
(See the scoring sketch after these entries.)

8 Feb 2004 Kevin Karplus
It would be good to have the ability to have multiple interface descriptions
at any layer, with arbitrary inputs from earlier layers, so that our neural
net could be a DAG.  This could be useful for generating predictions for
multiple alphabets from some common recoding, for example, or for including
the primary inputs in later layers of the neural net.  I'd like to have a
multiple-alphabet output from a network (made by running several networks in
parallel), so that I can use back-propagation from desired local structure
properties to do protein design.

------------------------------------------------------------

I want to modify InterfaceDescription so that we can have a guide
sequence+profile for input.  These can come from a single a2m file, if we use
the convention that the first sequence is the guide sequence (optionally,
allow providing the name of the guide sequence).
[DONE BY SOL KATZMAN spring 2004]

Mon Jun 13 21:37:30 PDT 2005 Kevin Karplus
Bug in QualityRecord: the way "bits gained" is reported is bogus!  It
currently reports the difference between the cost of the letter under the
prediction and the *average* cost of letters, rather than the background cost
of the particular letter.  (See the bits-gained sketch after these entries.)
[DONE 18 June 2005]

For Design1st, try collecting all predictions (or best predictions) and doing
a second round, starting from a profile based on the predictions.
[DONE 15 June 2005]

Fri May 27 17:49:56 PDT 2005 Kevin Karplus
Add commands to read background probabilities, rather than taking them from
the training set.  This will allow measurement of information gain for
predictions of single proteins.
[Done 18 June 2005, improved 7 July 2005]
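Decoy-training sketch for the Thu Jul 21 entry: real windows are pushed toward
their one-hot structure labels while random decoy windows are pushed toward
the background distribution.  This is only a toy numpy illustration under my
own assumptions (a single softmax layer; names like train_step, background,
and the alphabet sizes are invented), not predict-2nd code.

    import numpy as np

    rng = np.random.default_rng(0)
    N_IN, N_OUT = 20, 11          # e.g. amino-acid inputs, 11-letter local-structure alphabet
    W = 0.01 * rng.standard_normal((N_IN, N_OUT))
    background = np.full(N_OUT, 1.0 / N_OUT)   # would really come from the training set

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def train_step(x, target, lr=0.1):
        """One cross-entropy gradient step; target is one-hot for real data,
        the background distribution for random decoys."""
        global W
        p = softmax(x @ W)
        W -= lr * np.outer(x, p - target)   # gradient of -sum(target * log(p))

    # real window: push toward the observed structure letter
    x_real = rng.random(N_IN)
    onehot = np.zeros(N_OUT)
    onehot[3] = 1.0
    train_step(x_real, onehot)

    # random decoy window: push toward the background distribution
    x_decoy = rng.random(N_IN)
    train_step(x_decoy, background)

The "single recognizer net" variant from the same entry would instead have a
single real-vs-random output trained toward constant 1 or 0.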
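Constraint-mask sketch for the first Sun Jul 17 entry, assuming the
constraints are expressed as per-position sets of prohibited residues.  The
function apply_constraints and the dictionary layout are invented for
illustration; they are not FreezeDesign's actual interface.

    def apply_constraints(probs, position, prohibited, native=None, forbid_native=False):
        """Zero out prohibited residues (and optionally the native one) at this
        position and renormalize.  probs: dict residue -> probability."""
        banned = set(prohibited.get(position, set())) | set(prohibited.get("all", set()))
        if forbid_native and native is not None:
            banned.add(native)
        kept = {aa: p for aa, p in probs.items() if aa not in banned}
        total = sum(kept.values())
        return {aa: p / total for aa, p in kept.items()}

    # e.g. never allow Cys anywhere, and never allow Pro at position 12:
    prohibited = {"all": {"C"}, 12: {"P"}}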
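Scoring sketch for the second Sat Jul 9 entry, written directly from the
formula given there (sum of p_i log(q_i) + q_i log(p_i) over aligned columns).
The function names are invented, and the co-emission-probability alternative
is not shown.

    import math

    def symmetrized_cross_entropy(p, q, eps=1e-12):
        """p, q: probability distributions over the same alphabet.
        Returns sum_i p_i*log(q_i) + q_i*log(p_i); larger (less negative)
        means the two distributions agree better."""
        return sum(pi * math.log(qi + eps) + qi * math.log(pi + eps)
                   for pi, qi in zip(p, q))

    def alignment_score(pred_a, pred_b, aligned_pairs):
        """Score an alignment of two inputs by summing the symmetrized
        cross-entropy of their predicted distributions over aligned columns.
        aligned_pairs: list of (i, j) column pairs."""
        return sum(symmetrized_cross_entropy(pred_a[i], pred_b[j])
                   for i, j in aligned_pairs)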
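Bits-gained sketch for the Mon Jun 13 QualityRecord entry: compare the cost of
the observed letter under the prediction with its cost under the background,
rather than with the average cost of letters.  The function name bits_gained
is invented for illustration; QualityRecord itself is not shown.

    import math

    def bits_gained(pred_prob, background_prob):
        """Bits saved by the prediction for the observed letter:
        -log2(background_prob) - (-log2(pred_prob)) = log2(pred_prob / background_prob)."""
        return math.log2(pred_prob / background_prob)

    # e.g. predicted P=0.40 for the true letter whose background frequency is 0.05:
    # bits_gained(0.40, 0.05) == 3.0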
8 Feb 2004 Kevin Karplus
We need to check that this version of predict-2nd runs correctly after the
minor modifications made for the new C++ compiler.
[DONE]

I want to train neural networks for several alphabets, using the guide+profile
input.  I suspect that the profile could be generalized further than what we
currently use, since we'd have the guide sequence available to characterize
the close homologs.
[mostly DONE, need to redo train-test validation for more alphabets July 2005]

Fri May 27 17:49:56 PDT 2005 Kevin Karplus
Add the ability to have multiple networks read in at once, with a common input
interface, so that back-propagation can be done from multiple local-structure
alphabets.  Note: this is simpler than the more general approach of having
multi-output neural nets.  (See the multi-network sketch at the end of this
file.)
[DONE for the Design1st algorithm, 9 July 2005]
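Multi-network sketch for the last entry above: several networks share one
input interface, back-propagation from each local-structure target produces a
gradient on the shared profile, and the gradients are combined with optional
per-track weights (as wished for in the Sat Jul 9 entry), ending with a
most-probable-sequence readout as in Design1st.  Everything here (ToyNet, the
track names, the squared-error objective, the alphabet sizes) is an invented
stand-in, not the actual predict-2nd networks or algorithm.

    import numpy as np

    rng = np.random.default_rng(1)
    L, N_AA = 50, 20                            # sequence length, amino-acid alphabet
    profile = np.full((L, N_AA), 1.0 / N_AA)    # shared input profile being designed

    class ToyNet:
        """Stand-in for one trained network over one local-structure alphabet."""
        def __init__(self, n_out):
            self.W = 0.01 * rng.standard_normal((N_AA, n_out))
        def grad_wrt_input(self, profile, target):
            # gradient (w.r.t. the input profile) of a squared-error objective
            pred = profile @ self.W
            return 2.0 * (pred - target) @ self.W.T

    nets = {"str2": ToyNet(11), "burial": ToyNet(7)}
    targets = {name: np.zeros((L, net.W.shape[1])) for name, net in nets.items()}
    weights = {"str2": 1.0, "burial": 0.5}      # optional per-track weights

    # one design step: combine weighted gradients from all networks, update the profile
    grad = sum(weights[name] * net.grad_wrt_input(profile, targets[name])
               for name, net in nets.items())
    profile = np.clip(profile - 0.1 * grad, 1e-6, None)
    profile /= profile.sum(axis=1, keepdims=True)   # renormalize each position

    # most-probable-sequence readout, as in Design1st
    AMINO = "ACDEFGHIKLMNPQRSTVWY"
    sequence = "".join(AMINO[i] for i in profile.argmax(axis=1))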