Princeton University Invention # 09-2491
Cancer signatures are often confounded when looking at tumor morphology,
but may be inferred through genetic aberration patterns. Array-based methods
provide high-throughput data on genetic copy numbers, but determining the
clinically relevant copy number changes and identifying potentially causative
loci remains a challenge. Conventional statistical and machine learning methods
for linking alterations to clinical outcome ignore the sequential structure of
the data. Existing sequence classification methods, conversely, model only
aggregate copy number instability and disregard what happens at individual
genetic loci.
Researchers in the Computer Science Department and the Lewis-Sigler
Institute for Integrative Genomics, Princeton University have developed an
integrated method for jointly classifying tumors, inferring copy numbers, and
identifying clinically relevant positions in recurrent alteration regions from
high-throughput copy number data, such as array CGH.
By capturing sequential as well as local information, this
integrated model, referred to as the
"Heterogeneous Hidden Conditional Random Field," provides
better noise reduction, and achieves more relevant gene retrieval and more
accurate classification than existing methods. This new method notably selects a
small set of candidate genes that can be statistically linked with high
confidence to disease-specific genetic aberration patterns, and provides
unbiased starting points in deciding which genomic regions and which genes to
pursue for further examination.
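To illustrate why sequential information helps with noise reduction, the sketch below infers a copy number state for each probe from noisy log2 ratios using a plain hidden Markov model decoded with the Viterbi algorithm. This is a much simpler stand-in chosen for illustration and is not the Heterogeneous Hidden Conditional Random Field itself: it performs no tumor classification and no gene selection. The state names, state means, noise level, and transition probability are all assumed values, not taken from the invention.

```python
import math

# Hypothetical copy number states and their assumed mean log2 ratios.
STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.5}
SIGMA = 0.3       # assumed measurement noise (standard deviation)
STAY_PROB = 0.9   # assumed probability that adjacent probes share a state


def log_emission(state, value):
    """Gaussian log-likelihood of one probe value, up to a constant."""
    return -((value - MEANS[state]) ** 2) / (2 * SIGMA ** 2)


def log_transition(prev, curr):
    """Log-probability of moving between states at adjacent probes."""
    if prev == curr:
        return math.log(STAY_PROB)
    return math.log((1 - STAY_PROB) / (len(STATES) - 1))


def viterbi_states(log_ratios):
    """Return the most likely state sequence for a list of log2 ratios."""
    if not log_ratios:
        return []
    # Uniform prior over the first state, so only the emission matters.
    score = {s: log_emission(s, log_ratios[0]) for s in STATES}
    backpointers = []
    for value in log_ratios[1:]:
        new_score, pointer = {}, {}
        for curr in STATES:
            best_prev = max(
                STATES, key=lambda p: score[p] + log_transition(p, curr)
            )
            pointer[curr] = best_prev
            new_score[curr] = (
                score[best_prev]
                + log_transition(best_prev, curr)
                + log_emission(curr, value)
            )
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


def threshold_states(log_ratios, cutoff=0.25):
    """Per-probe calls that ignore neighbouring probes, for comparison."""
    return [
        "gain" if v > cutoff else "loss" if v < -cutoff else "neutral"
        for v in log_ratios
    ]
```

Under these assumed parameters, an isolated noisy probe inside a neutral region is called a gain by the per-probe threshold but is smoothed away by the sequential model, while a sustained run of elevated probes is still reported as a gain.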
Experiments on synthetic data and on cancer data show that this method is
superior, in terms of both prediction accuracy and relevant feature discovery,
to existing methods. The utility of the Heterogeneous Hidden Conditional
Random Field has been demonstrated by generating novel biological
hypotheses for breast cancer, bladder cancer, and melanoma
(see the reference below).
The Heterogeneous Hidden Conditional Random Field has the potential to be
used to discover genes involved in cancer, to develop molecular diagnostics,
or to analyze patient data in order to guide molecular-based targeted
treatments and to predict treatment response and prognosis.
Princeton is currently seeking industrial collaborators to further
develop and commercialize this technology. Patent protection is
pending.
Reference:
Barutcuoglu Z., Airoldi E., Dumeaux V., Schapire R., Troyanskaya O.
Aneuploidy Prediction and Tumor Classification with Heterogeneous Hidden
Conditional Random Fields. Bioinformatics, Advance Access published
December 4, 2008.
For more information on Princeton University Invention # 09-2491, please
contact:
Laurie Tzodikov
Office of Technology Licensing and Intellectual Property
Princeton University
4 New South Building
Princeton, NJ 08544-0036
(609) 258-7256
(609) 258-1159 fax
tzodikov@princeton.edu