Microarray FAQ
The following is meant to help people understand what DNA microarrays
are, how they are made, and why they are useful. This is a work in progress; if
you have any corrections, clarifications, or a new question, don't hesitate to
email me at
sugnet@cse.ucsc.edu.
The Questions:
The Answers:
A generic microarray consists of multiple features (spots) of DNA
which will be used to determine the levels of mRNA expression in a
collection of cells. The DNA for each feature is from a gene of
interest and is a probe for the mRNA encoded by that gene. In general
you can think of a microarray as a grid of DNA spots. Each spot has a
unique DNA sequence, different from the DNA sequence of the other
spots around it. Thus each spot will hybridize only to its
complementary DNA strand. In this way each spot is acting as a probe
to determine the levels of a specific mRNA produced by a collection of
cells.
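To make the "each spot only binds its complementary strand" idea concrete, here is a minimal sketch in Python. The gene names and sequences are made up for illustration, and real hybridization tolerates some mismatches, so treat the exact-match test as a simplifying assumption.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(seq):
    """Return the strand that base-pairs with `seq` (both read 5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def is_perfect_match(probe, target):
    """True when `target` is exactly the reverse complement of `probe`."""
    return target.upper() == reverse_complement(probe)

# Hypothetical probes, one per spot (sequences invented for this example).
spots = {"geneA": "ATGGCCTTAG", "geneB": "ATGCGTAACC"}

# A target strand that is complementary to the geneA probe.
target = reverse_complement(spots["geneA"])

# Only the spot whose probe is complementary to the target "lights up".
bound = [name for name, probe in spots.items() if is_perfect_match(probe, target)]
print(bound)
```

In this toy model only the geneA spot captures the target, which is the sense in which each spot acts as a probe for one specific sequence.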
The basic idea of using a piece of DNA as a probe to determine the
presence of the complementary DNA in a solution goes back a long
time. It is the same general technique used in Southern and Northern
blots by molecular biologists every day. The thing that makes
microarrays so exciting now is the number of DNA probes that it is
possible to place on a microarray. Already there are microarrays with
probes for every gene in yeast, and others with over 19,000 human
cDNAs. This allows researchers to observe the response of whole
genomes to various stimuli instead of one gene at a time.
To the left is an image of a portion of a microarray that has
been hybridized and scanned. Each spot gives information as to the
relative abundance of the mRNA which is complementary to the DNA in
that spot. In this manner it is possible to gather mRNA expression
levels for thousands of genes at the same time in parallel.
Currently microarrays come in many different types, but there are only two
real fabrication methods.
- Synthesize DNA probes separately, using PCR for cDNAs or chemical
synthesis for oligonucleotides. Then use a robot to spot these DNA
probes onto your microarrays into very small grids. The substrate for
your microarrays can be glass, plastic, or even nylon membranes. Most
labs I know of are using glass microscope slides. This method is
comparatively cheap and flexible, and Pat Brown's lab posted plans to make the
robot on the web. Some related technologies use an ink-jet like
printer to spray oligonucleotide probes on the microarrays.
- Synthesize DNA oligonucleotides directly on the microarray using
UV masks and photo-activated chemistry. Currently Affymetrix is the only company
doing this. Affymetrix is a dominant player in commercial microarrays
and the chemistry they use is just amazing. The technique used is as
follows: deprotect sites that will have the next base (A, C, T, or G)
bound to them using UV light, then bind the next base to those sites
and repeat with a different base. To direct which sites will be
deprotected Affymetrix uses a photolithographic mask which only lets
the UV light activate certain sites.

Using this technology Affymetrix
is able to build up very large arrays of oligonucleotides in
parallel. However, due to synthesis efficiencies the longest
oligonucleotide probes that Affymetrix makes are 25 nucleotides
long.
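The deprotect-and-couple cycle described above can be sketched as a toy simulation. This is only an illustration under my own assumptions, not Affymetrix's actual process: the base order "ACGT", the function names, and the 95% per-step efficiency are all made up. The second function shows why per-step efficiency limits probe length: the fraction of full-length probes shrinks with every added base.

```python
BASES = "ACGT"  # order in which bases are offered (an assumption)

def synthesize(targets):
    """Grow all probes in parallel, one masked base-addition step at a time.

    Returns the finished probes and the number of mask/coupling steps used.
    """
    grown = [""] * len(targets)
    steps = 0
    while any(len(g) < len(t) for g, t in zip(grown, targets)):
        for base in BASES:
            # The "mask": sites whose next required base is `base`.
            mask = [i for i, (g, t) in enumerate(zip(grown, targets))
                    if len(g) < len(t) and t[len(g)] == base]
            if not mask:
                continue  # no site needs this base right now
            steps += 1
            for i in mask:  # only the deprotected sites couple the base
                grown[i] += base
    return grown, steps

def full_length_fraction(step_efficiency, length):
    """Fraction of probes that reach full length if every coupling step
    succeeds independently with probability `step_efficiency`."""
    return step_efficiency ** length

# Hypothetical 4-mer probes, just to keep the example small.
probes = ["ACGT", "TTGA", "GGCA"]
built, steps = synthesize(probes)
print(built, steps)

# With an assumed 95% step efficiency, compare 10-mers and 25-mers.
print(full_length_fraction(0.95, 10), full_length_fraction(0.95, 25))
```

Note that the number of steps depends on probe length and not on how many probes are on the array, which is what makes the parallel approach attractive.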
The different types of microarrays each have their own peculiarities and
no one that I know of has published any sort of study rigorously comparing the different
technologies. However, there are some inherent strengths and weaknesses to
each technology.
- The Affymetrix chemistry is great, but their technology is very
expensive and fairly inflexible. The basic Fluidics station and
scanner are over $100,000 and then each GeneChip is around $5,000. The
reason that it is inflexible is that if you don't like their arrays
and want to make your own the cost for a new photolithographic mask
can be over a million dollars. That said, the technology is very
robust and Affymetrix's chemistry is reproducible and allows the
detection of SNPs and other small features in the DNA.
One thing to note is that Affymetrix is not currently doing cohybridizations, which makes it very
important and challenging to normalize between the experimental and
control GeneChips. That is to say, Affymetrix does not produce
ratios; each probe produces only an absolute intensity.
- Spotting DNA on glass microscope slides is relatively inexpensive
and very flexible. However, the spotting process itself is inherently
variable. Also, most microarrays produced in this manner use cDNAs as
their probes. Using cDNAs has a couple of technical problems.
- You need a copy of that DNA to start with so you can use PCR to
produce your probes. This is a major point as we know the sequence of whole
genomes but we don't have unique cDNA libraries that span genomes.
- When using a cDNA as a probe you get a very long sequence to bind
to, which makes it impossible to distinguish between genes that are more
than 80% similar, and forget about detecting SNPs.
It is possible to spot oligos on glass slides and save yourself a lot of
PCR and avoid the above limitations.
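For a feel of what "80% similar" means, here is a rough sketch that computes percent identity between two sequences. It assumes the sequences are already aligned and the same length (real comparisons need a proper alignment step), and the sequences themselves are invented.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two pre-aligned,
    equal-length sequences (a simplifying assumption)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)

# Two made-up sequences differing at one position out of ten.
gene_1 = "ATGGCCTTAG"
gene_2 = "ATGGCCTTAC"
print(percent_identity(gene_1, gene_2))
```

Two genes scoring above the 80% mark mentioned above would likely both bind the same long cDNA probe.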
The technique that allows the spotting technology to sidestep the
issue of variability in spotting and other concerns is the use of
cohybridizations. This technique is
covered in greater detail later in this document but the main concept
is to use relative RNA expression levels instead of absolute
expression levels. To accomplish this two separate RNA samples are
used: an "experimental" and a "reference". Each RNA is labeled with
a different fluorescent dye, then the two samples are mixed and
hybridized at the same time to the microarray. When the microarray is
scanned, the number of photons in the experimental dye's spectrum is
compared to the number of photons in the reference dye's
spectrum. Many variations in spot size, probe concentration and other
issues are cancelled out in this manner.
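A small sketch of why the ratio cancels out spot-to-spot variation. The model here is my own simplification: the scanned signal is assumed to be the true expression level multiplied by a spot-specific factor (spot size, probe concentration, and so on) that affects both dyes equally. The numbers are invented.

```python
def scanned_signal(expression, spot_factor):
    """Toy model: signal = true expression level x a spot-specific factor."""
    return expression * spot_factor

def spot_ratio(experimental, reference, spot_factor):
    """Ratio of experimental to reference signal measured on one spot."""
    return (scanned_signal(experimental, spot_factor)
            / scanned_signal(reference, spot_factor))

# Same biology (4x more experimental RNA), printed on three spots of
# different quality. The factors are made up for illustration.
ratios = [spot_ratio(400.0, 100.0, factor) for factor in (0.5, 1.0, 2.0)]
print(ratios)
```

The absolute signals differ fourfold between the worst and best spot, but the ratio comes out the same on all three.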
The basic protocol is as follows:
- Isolate the RNA you are interested in and the RNA from your control. The RNA
can come from any cells. It is important to realize, though, that RNA from tissues
or any heterogeneous cells may lead to results that reflect changes in the composition of
the sample rather than changes due to the experimental hypothesis.
- Label the RNA. Usually this means performing a reverse
transcriptase reaction and incorporating dye that has been linked to a
DNA nucleotide. However, some protocols, e.g. Affymetrix's, call for
an amplification of the RNA and labelling of the RNA itself. For
microarrays on nylon membranes the label is usually radioactive.
- Hybridize the labeled target to the microarray. This consists of
placing a solution containing the labeled target on the microarray and
letting it sit for a period of hours. This allows a given target to
find its probe on the microarray and bind to it. Usually this is
carried out at a specific temperature to minimize non-specific binding of
target to the probes on the microarray.
- Remove the hybridization solution and wash the microarray. The
washing can be done at different salt and detergent concentrations to
minimize non-specific binding. In general, solutions with lower salt
concentrations weaken the DNA base pairing and are referred to as "more
stringent", and vice versa for higher salt concentrations.
- Once the microarray has been washed it is time to scan the
microarray. Scanning is just quantifying how much target bound to
the DNA probe on the microarray. Most microarrays use fluorescent dyes
and are scanned in the following manner:
- A laser is used to
excite the fluorescent dye; the photons coming from the dye are
captured using lenses to focus the light and a photomultiplier tube
(PMT) to quantitate how many photons are being captured.
- The resulting number for that section of the microarray is
translated into one pixel of a 16-bit .tiff file. The more pixels per
centimeter, the better the resolution of the resulting .tiff image.
It is important to note that these .tiff files keep every pixel value
exactly. File formats that discard data, such as .jpeg (lossy compression)
and .gif (limited to 8 bits per pixel), should not be used for
storage of results.
- The resulting image is analyzed by finding the spots and comparing
the differences between chips (if the hybridization contained only one fluor) or
the ratio of the two fluors for cohybridization experiments. How these
differences are normalized, compared, and interpreted is beyond the scope of this
document.
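To see why the 16-bit pixel values from the scanning step should not be squeezed into a lower-precision format, here is a minimal sketch. It only models a reduction from 16 to 8 bits by dropping the low bits, which is my own stand-in and not what any particular file format actually does. The intensities are invented.

```python
MAX_16BIT = 2**16 - 1  # 65535, the brightest value a 16-bit pixel can hold

def to_8bit(value_16bit):
    """Squeeze a 16-bit intensity into 8 bits by dropping the low 8 bits."""
    return value_16bit >> 8

# Two made-up pixel intensities that differ by about 17%.
spot_a, spot_b = 1024, 1200
print(to_8bit(spot_a), to_8bit(spot_b))
```

After the reduction both pixels carry the same value, so a real difference in intensity between them can no longer be recovered.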
For more information on scanning basics check out
Axon's FAQ.
It can be difficult using microarrays to show that there is a relationship between the
absolute intensity of a hybridized probe and the absolute number of target molecules bound.
This is especially true with cDNA probes where it may be difficult to control what
sequences are present in a probe. Issues such as secondary structure, melting temperature, and
even target characteristics make it hard to calibrate an entire array for measuring absolute
molecules bound.
In order to sidestep these issues researchers use cohybridizations to measure
the level of RNA expression
relative to another sample at the same time on the
same microarray. The basic idea is as follows:
- Isolate the RNA from the experimental state and label it with
a fluorescent dye, usually Cy5 or Cy3.
- Isolate RNA from a control state (also called a reference state) and
label it with a different fluorescent dye, usually Cy3 or Cy5, whichever you didn't
use for the experimental RNA.
- Mix the two samples and hybridize them at the same time on the same microarray.
- Wash the microarray to remove non-specific binding and scan the
microarray to quantitate the amount of each fluor, both the control
and the experimental. The key idea now is not to analyze the absolute
intensities, but rather to compare how different the experimental is
from the control. This is usually expressed as a ratio of the two
numbers.
There is currently much debate about how to use and normalize these
ratios, and even exactly which numbers to use for the ratios. The actual
number used to do the ratios comes from the intensities of the pixels that
make up the spot of that DNA probe. Once these pixel intensities are adjusted
for background there are many ways to extract a single number to use in ratios.
Some people use the median pixel value, some use the mean, some researchers throw
out all of the saturated pixels and then take the mean. I highly recommend finding
a good statistics book and familiarizing yourself with basic statistics again to
understand the significance of these numbers and how they affect your results.
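Here is a small sketch of the three summaries mentioned above, applied to one spot. The pixel values, the background level, and the simple "subtract a constant background" step are all assumptions for illustration; real image analysis packages estimate background in their own ways.

```python
from statistics import mean, median

SATURATED = 65535  # largest value a 16-bit pixel can hold

def spot_summaries(pixels, background):
    """Background-correct the pixels of one spot and summarize them
    three different ways."""
    corrected = [max(p - background, 0) for p in pixels]
    unsaturated = [max(p - background, 0) for p in pixels if p < SATURATED]
    return {
        "median": median(corrected),
        "mean": mean(corrected),
        # None when every pixel in the spot is saturated.
        "mean_unsaturated": mean(unsaturated) if unsaturated else None,
    }

# Made-up pixels for one spot; the last one is saturated.
pixels = [1100, 1200, 1300, 1400, 65535]
summary = spot_summaries(pixels, background=100)
print(summary)
```

With a single saturated pixel in the spot, the plain mean ends up more than ten times larger than the median or the mean without saturated pixels, which is why the choice of summary matters.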
Once you've decided how you want to extract a single value from the
pixel intensities you can now measure your ratio and determine if your
genes of interest are being expressed more or less relative to your
control sample. Some researchers take the log of these ratios as they
will then have the nice property of being centered around zero with
positive numbers indicating induction of a gene and negative numbers
indicating repression of a gene, relative to the control sample.
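A quick sketch of the log-ratio idea. I am assuming log base 2 here, which makes a two-fold change equal to one unit; the text above does not specify a base, and the intensities are invented.

```python
import math

def log2_ratio(experimental, reference):
    """Log base 2 of the experimental/reference intensity ratio."""
    if experimental <= 0 or reference <= 0:
        raise ValueError("intensities must be positive to take a log ratio")
    return math.log2(experimental / reference)

# Made-up (experimental, reference) intensities for three genes.
induced = log2_ratio(2000.0, 1000.0)    # twice the control level
unchanged = log2_ratio(1000.0, 1000.0)  # same as the control
repressed = log2_ratio(500.0, 1000.0)   # half the control level
print(induced, unchanged, repressed)
```

The plain ratios 2 and 0.5 are not symmetric around 1, but their logs are symmetric around zero, so induction and repression of the same magnitude are treated evenly.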
Keep in mind that there are still a host of issues when it comes to
normalizing these values and comparing values between different microarrays.
These issues are outside the scope of this document.
In order to understand how hybridizations work check out my
Biology
starter to see how this technique exploits the marvelous
properties of DNA.
Reverse transcription uses an enzyme conveniently named reverse transcriptase to
produce the complementary DNA strand from an RNA strand. To find out more about the
amazing properties of DNA and RNA check out my
Biology starter.
Affymetrix claims that on a routine basis they can detect 1 molecule
in 100,000 and that they can detect two-fold changes in RNA expression
levels. In an optimal hybridization they claim to detect one molecule
in 2,000,000 and to detect 10% changes in RNA expression levels. I haven't
seen any published results on spotted arrays but I'd love to hear about it if
someone else has.
Check out other sources of information or
email me.
The Mguide: Build your own and do it yourself.
The Brown Lab FAQ: Specific technical questions.
Axon's FAQ: Image scanning questions.
Check out the
Microarray Links page for other resources.