Pairs for Yeoh data set.
The Yeoh data were measured on Affymetrix U95v2 chips. It is a multiclass cancer data set of pediatric acute lymphoblastic leukemia. This is an excerpt from the authors' supplementary material:
... the data are divided up into the six diagnostic groups (BCR-ABL, E2A-PBX1, Hyperdiploid >50, MLL, T-ALL, TEL-AML1), plus two "other" groups.
This work does not use the two "other" groups; it uses only the six named groups. Each group is compared against the other five taken together, for example "compare BCR-ABL to the rest".
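The one-against-the-rest comparisons described above can be sketched as follows. This is a minimal illustration, not the code used for this work: the function names and the representation of samples as a list of group labels are assumptions made for the example.

```python
# The six named diagnostic groups from the supplementary material.
GROUPS = ["BCR-ABL", "E2A-PBX1", "Hyperdiploid >50", "MLL", "T-ALL", "TEL-AML1"]


def one_vs_rest_labels(sample_groups, target):
    """Return 1 for samples in `target` and 0 for samples in any other group."""
    if target not in GROUPS:
        raise ValueError("unknown group: %r" % (target,))
    return [1 if group == target else 0 for group in sample_groups]


def all_splits(sample_groups):
    """Build one binary labelling per named group (hypothetical helper).

    Samples whose label is not one of the six named groups (for example
    the two "other" groups) are dropped before labelling.
    """
    kept = [group for group in sample_groups if group in GROUPS]
    return {target: one_vs_rest_labels(kept, target) for target in GROUPS}
```

For example, `all_splits(["BCR-ABL", "MLL", "other", "T-ALL"])["BCR-ABL"]` labels the first sample 1 and the two remaining named samples 0, after discarding the "other" sample.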
Original data at:
(If you get the original data, beware that the hyperdip50 data has a blank patient sample within the spreadsheet.)
For this work, the data are used raw, not log transformed. The original data has 12625 genes, which are filtered to remove genes that do not vary enough across the samples. The filter threshold is a variation of at least 10000. This leaves 4169 genes and 248 patient samples. The data for these 4169 genes that we use is here: yeoh 4169 genes.
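A filter of this kind might look like the sketch below. The text does not say how "variation" is computed, so the example assumes it means the range (maximum minus minimum) of a gene's raw values across samples; a variance or other spread measure would give a different gene list. The function name, the dictionary layout, and the toy values are hypothetical.

```python
def filter_genes(expression, threshold=10000):
    """Keep genes whose raw values vary by at least `threshold` across samples.

    `expression` maps a gene identifier to a list of raw (not log
    transformed) values, one per patient sample.
    """
    kept = {}
    for gene, values in expression.items():
        # ASSUMPTION: "variation" is taken to be max - min across samples.
        variation = max(values) - min(values)
        if variation >= threshold:
            kept[gene] = values
    return kept


# Toy data for illustration only; these are not values from the Yeoh set.
toy = {
    "gene_a": [100, 12000, 500],
    "gene_b": [200, 300, 250],
    "gene_c": [0, 10000, 5],
}
filtered = filter_genes(toy)
```

With the toy data, `gene_a` and `gene_c` pass the threshold and `gene_b` is removed.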
Two splittings have single genes: T-ALL (1) and E2A-PBX1 (4). Only the Hyperdiploid >50 splitting yields no pairs.