Proteins are the macromolecules that perform almost all of the cell's work. They are used for energy, communication, enzymatic activity, transport, and many other things. Proteins are complex, versatile macromolecules that are involved in almost every cellular process and come in many shapes and sizes. However, all proteins are made up of only twenty amino acids. From a chemical point of view, proteins are just heterogeneous polymers composed of these twenty amino acids.
Each of the amino acids has a different chemical structure and
characteristics. By combining the properties of these different amino
acids, proteins can perform a diverse range of functions. Amino acids
have two forms of shorthand: a one-character code and a three-character
code. Table 1 lists the amino acids in alphabetical order along
with the corresponding shorthand and chemical characteristics. For more
information, see the Introduction to Biomolecular Modeling at Tufts.
| Amino Acid | Three-Character Code | One-Character Code | Hydrophobic | Polar | Aromatic | Aliphatic | Size |
|---|---|---|---|---|---|---|---|
| Alanine | Ala | A | y | | | | tiny |
| Arginine | Arg | R | | y+ | | | |
| Asparagine | Asn | N | | y | | | small |
| Aspartic Acid | Asp | D | | y- | | | small |
| Cysteine | Cys | C | y | | | | small |
| Glutamic Acid | Glu | E | | y- | | | |
| Glutamine | Gln | Q | | y | | | |
| Glycine | Gly | G | y | | | | tiny |
| Histidine | His | H | y | y+ | y | | |
| Isoleucine | Ile | I | y | | | y | |
| Leucine | Leu | L | y | | | y | |
| Lysine | Lys | K | y | y+ | | | |
| Methionine | Met | M | y | | | | |
| Phenylalanine | Phe | F | y | | y | | |
| Proline | Pro | P | | | | | small |
| Serine | Ser | S | | y | | | tiny |
| Threonine | Thr | T | y | y | | | small |
| Tryptophan | Trp | W | y | y | y | | |
| Tyrosine | Tyr | Y | y | y | y | | |
| Valine | Val | V | y | | | y | small |
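The two shorthand forms in Table 1 map onto each other one-to-one, so converting between them is a simple lookup. The following is a minimal sketch in Python; the function names `to_one_letter` and `to_three_letter` are illustrative choices, not part of any standard library.

```python
# Three-character to one-character amino acid codes, as listed in Table 1.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Glu": "E", "Gln": "Q", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
# Reverse lookup: one-character code back to three-character code.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues):
    """Convert a list of three-character codes into a one-character string."""
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)


def to_three_letter(sequence, sep="-"):
    """Convert a one-character string into three-character codes joined by sep."""
    return sep.join(ONE_TO_THREE[c] for c in sequence.upper())


print(to_one_letter(["Met", "Ala", "Gly", "Trp"]))  # MAGW
print(to_three_letter("MAGW"))                      # Met-Ala-Gly-Trp
```

An unknown code raises a `KeyError` in this sketch, which is usually preferable to silently skipping a residue.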
One of the big challenges in bioinformatics today is predicting the secondary and tertiary structure of a protein given its primary sequence. Right now it is practically impossible to predict the tertiary structure of a protein given only the primary sequence of amino acids.
Another challenge in bioinformatics is to predict the function of a
protein given the primary sequence. It is possible to do this even
though we cannot predict the tertiary structure of a protein, and it is
that tertiary structure that determines the function. The trick is to
compare the new sequence to other sequences whose function we already
know. It turns out that nature is very conservative: once something
works, nature may tinker with it, but often the basic functionality will
remain the same. An example of this is hemoglobin, the protein that
carries oxygen in blood. Many different forms of this protein exist in
different species; however, they all share a core group of amino
acids that are the same. These homologous regions lead to
subsequences of proteins that have specific functionality and are
conserved in nature; these are called motifs. By finding known motifs
in a novel protein, we can predict the function of the novel
protein even though we don't know its 3-dimensional structure. Other
proteins are not so obviously related at the primary structure (amino
acid) level but are related at the secondary and tertiary structure
levels. For example, myoglobin and hemoglobin have structural
similarities, even though only about 20% of their amino acids are the same.
(thanks to Dr. "hank" for the correction)
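Two ideas from the paragraph above can be sketched in a few lines of Python: scanning a novel sequence for a known motif, and measuring what percentage of residues two sequences share. This is only an illustration under stated assumptions. The helper names `find_motif` and `percent_identity`, the example sequences, and the simplified pattern (N, then any residue except P, then S or T) are all chosen for demonstration, and `percent_identity` assumes the two sequences have already been aligned to equal length.

```python
import re


def find_motif(sequence, pattern):
    """Return (start, matched_text) for every occurrence of a regex motif.

    The pattern is wrapped in a lookahead so that overlapping
    occurrences are reported too. Positions are 0-based.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence)]


def percent_identity(a, b):
    """Percentage of positions holding the same residue in two
    already-aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Made-up sequence, for illustration only.
sequence = "MKNGTAANPSLNVSQ"
print(find_motif(sequence, "N[^P][ST]"))   # [(2, 'NGT'), (11, 'NVS')]
print(percent_identity("MKVLA", "MRVLG"))  # 60.0
```

Real motif databases use richer pattern languages and scoring schemes than a plain regular expression, and real comparisons start from an alignment step, so treat this as a starting point rather than a complete method.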