A TWO-STAGE APPROACH FOR COMBINING GENE EXPRESSION AND MUTATION WITH CLINICAL DATA IMPROVES SURVIVAL PREDICTION IN MYELODYSPLASTIC SYNDROMES AND OVARIAN CANCER

Yan Li
Xinyan Zhang
Tomi Akinyemiju
Akinyemi Ojesina
Jeff Szychowski
Nianjun Liu
Bo Xu
Nengjun Yi

Abstract

Motivation: Many traditional clinical prognostic factors for cancer have been known for years, but they usually predict survival poorly. Genomic information is now more readily available, which offers opportunities to build more accurate prognostic models. The challenge is how to integrate genomic and clinical information to improve survival prediction. The common approach of jointly analyzing all types of covariates in a single model may not improve prediction because of the increased model complexity, and it cannot be easily applied to different datasets.


Results: We proposed a two-stage procedure to better combine different sources of information for survival prediction, and applied it to two cancer datasets: myelodysplastic syndromes (MDS) and ovarian cancer. Our analysis suggests that prediction performance differs substantially across data types, and that combining clinical, gene expression and mutation data with the two-stage procedure improves survival prediction, as measured by a higher concordance index and a lower prediction error.
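
The abstract reports the concordance index as one of its performance measures. As a minimal sketch of what that measure computes, the Python below implements Harrell's concordance index under a common textbook definition; the article may use a different estimator or tie handling. The function name `concordance_index`, the score names, and all data values are hypothetical toy inputs, not results from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs in which the patient
    with the shorter observed survival time has the higher predicted risk.

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- predicted risk, higher meaning worse prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified sketch
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
clinical_only_score = [0.9, 0.2, 0.4, 0.1]  # assumed scores from one model
combined_score = [0.9, 0.6, 0.4, 0.1]       # assumed scores from another model

print(concordance_index(times, events, clinical_only_score))
print(concordance_index(times, events, combined_score))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so a model that ranks patients better yields a higher index on the same data.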


Availability and implementation: The two-stage procedure can be implemented with the BhGLM package, which is freely available at http://www.ssg.uab.edu/bhglm/.

Article Details

How to Cite
Li, Y., Zhang, X., Akinyemiju, T., Ojesina, A., Szychowski, J., Liu, N., Xu, B., & Yi, N. (2016). A TWO-STAGE APPROACH FOR COMBINING GENE EXPRESSION AND MUTATION WITH CLINICAL DATA IMPROVES SURVIVAL PREDICTION IN MYELODYSPLASTIC SYNDROMES AND OVARIAN CANCER. JOURNAL OF BIOINFORMATICS AND GENOMICS, 1(1). https://doi.org/10.18454/jbg.2016.1.1.2
Section
Bioinformatic tools to interrogate and to model Biological phenomena
