Pairs for Prostate data.
The original prostate data set
contains 55 samples: some cancer tissue, some normal tissue,
and some cell lines.
The original data is:
original
(the paper appeared in the August 15, 2001 issue of Cancer Research, 61:5974-5978).
We use only the real tissue samples, so we
compare the 25 primary tumors to the 9 non-neoplastic
tissues (one of the tumors is a duplicate sample, which we
treat as an independent sample).
The data we used is here:
data used here
The original data has 12626 Affymetrix probes. We applied
a variation filter to remove the genes that do not vary enough
across the samples. This filter applies a lower limit
of 20 and an upper limit of 16000, and requires each gene to
vary by more than 300. This leaves 3958 genes
in our pairs data.
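
The variation filter described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: we assume the lower
and upper limits are applied by clipping each expression value into the
range 20 to 16000, and that "vary by" means the difference between a
gene's largest and smallest clipped value. The function name
variation_filter, its parameter names, and the toy probe ids are
hypothetical and do not come from the original analysis.

```python
def variation_filter(rows, floor=20.0, ceiling=16000.0, min_range=300.0):
    """Keep probes whose clipped values span more than min_range.

    rows maps a probe id to its list of expression values across samples.
    Assumption: the limits are applied by clipping, and variation is
    measured as max minus min of the clipped values.
    """
    kept = {}
    for probe, values in rows.items():
        # Clip every value into [floor, ceiling].
        clipped = [min(max(v, floor), ceiling) for v in values]
        # Require the range to be strictly greater than min_range.
        if max(clipped) - min(clipped) > min_range:
            kept[probe] = clipped
    return kept


# Toy data (made up for illustration, not taken from the prostate set).
example = {
    "probe_a": [10.0, 15.0, 250.0],     # clipped range is 230: dropped
    "probe_b": [50.0, 400.0, 20000.0],  # clipped range is 15950: kept
    "probe_c": [100.0, 400.0, 350.0],   # range is exactly 300: dropped
}
filtered = variation_filter(example)
print(sorted(filtered))
```

With the real data, rows would hold the 12626 probes across the 34 tissue
samples. Whether the filter keeps the clipped or the raw values is not
stated in the description, so the sketch simply returns the clipped ones.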
Because there are few samples, many pairs are found: 249665
pairs and 52 single genes.
The links below fetch parts of these results, with about
10000 results per page. Each page is large (over 10 megabytes).
The results are sorted "best" first, where "best" means
the largest margin (separation) between the classes.
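
As a rough illustration of the "largest margin first" ordering, the sketch
below ranks single genes only. It assumes the margin of a single gene is
the gap between the two classes on that gene (the lowest value of the
higher class minus the highest value of the lower class), and that a gene
counts as a result only when this gap is positive. The exact margin
definition used for these results is not given here, and the margin for a
gene pair (a separating line in two dimensions) is not shown. The names
single_gene_margin and rank_by_margin and the toy values are hypothetical.

```python
def single_gene_margin(tumor_values, normal_values):
    """Gap between the two classes on one gene.

    A value of zero or less means the classes overlap on this gene.
    Assumption: the margin is the full gap, not half of it.
    """
    gap_tumor_high = min(tumor_values) - max(normal_values)
    gap_normal_high = min(normal_values) - max(tumor_values)
    return max(gap_tumor_high, gap_normal_high)


def rank_by_margin(genes):
    """Return (gene, margin) for separating genes, largest margin first.

    genes maps a gene name to a (tumor_values, normal_values) tuple.
    """
    scored = [(single_gene_margin(t, n), name) for name, (t, n) in genes.items()]
    return [(name, m) for m, name in sorted(scored, reverse=True) if m > 0]


# Toy data (made up for illustration).
toy = {
    "gene_x": ([900.0, 1000.0], [100.0, 200.0]),  # tumors higher, gap 700
    "gene_y": ([300.0, 350.0], [400.0, 500.0]),   # normals higher, gap 50
    "gene_z": ([300.0, 600.0], [400.0, 500.0]),   # classes overlap
}
ranking = rank_by_margin(toy)
print(ranking)
```

Splitting such a ranked list into pages is then a matter of slicing it in
consecutive chunks of about 10000 entries.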